Commit 66a2154 by j05hr3d (verified) · 1 Parent(s): c20c938

End of training

README.md CHANGED
@@ -19,7 +19,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [Qwen/Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.9111
+- Loss: 0.9057
 
 ## Model description
 
@@ -48,12 +48,13 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 4
 - num_epochs: 1
+- mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.0173 | 1.0 | 1 | 0.9111 |
+| 1.0175 | 1.0 | 1 | 0.9057 |
 
 
 ### Framework versions
runs/Oct25_19-23-05_6c6cf53d4df0/events.out.tfevents.1761420433.6c6cf53d4df0.18267.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c0d0a6a942d956331f36522926b9a4c5baf04a35c5123ae8627f605a38d7f81d
+size 354
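
The three added lines in the events file are a Git LFS pointer, not the TensorBoard log itself: the repository stores only the spec version, the SHA-256 object ID, and the byte size of the real file. A minimal sketch of reading such a pointer follows. The helper name `parse_lfs_pointer` is hypothetical, chosen for illustration, and is not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real object, in bytes
    }


# Pointer contents copied from the diff above.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:c0d0a6a942d956331f36522926b9a4c5baf04a35c5123ae8627f605a38d7f81d
size 354
"""

info = parse_lfs_pointer(pointer_text)
print(info["hash_algo"], info["size"])
```

The parsed `size` and `digest` can be checked against a downloaded copy of the events file to confirm that the LFS object was fetched intact.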