EQUES/OpenRS3-GRPO-ja


stardust-eques committed on Apr 4

Commit a1ac604 (verified) · 1 parent: 5f53959

Update README.md

Files changed (1)

README.md +2 -2


@@ -3,7 +3,7 @@ base_model: SakanaAI/TinySwallow-1.5B-Instruct
 datasets:
 - kunishou/OpenMathInstruct-1-1.8m-ja
 library_name: transformers
-model_name: OpenRS-GRPO-ja
+model_name: OpenRS3-GRPO-ja
 tags:
 - generated_from_trainer
 - open-r1
@@ -12,7 +12,7 @@ tags:
 licence: license
 ---
-# Model Card for OpenRS-GRPO-ja
+# Model Card for OpenRS3-GRPO-ja
 This model is a fine-tuned version of [SakanaAI/TinySwallow-1.5B-Instruct](https://huggingface.co/SakanaAI/TinySwallow-1.5B-Instruct) on the [kunishou/OpenMathInstruct-1-1.8m-ja](https://huggingface.co/datasets/kunishou/OpenMathInstruct-1-1.8m-ja/viewer/default/train?row=0&views%5B%5D=train) dataset.
 It has been trained using [TRL](https://github.com/huggingface/trl).