secret-model-stage-1-4B-512

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1883
  • Centroid Acc: 0.9811
  • Centroid Macro F1: 0.9805
  • Knn Acc: 1.0
  • Knn Macro F1: 1.0
  • Alignment: 0.4756
  • Uniformity: -2.9679
  • Combined Score: 0.9870
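
The card does not define Alignment and Uniformity. A minimal sketch follows, assuming they use the common contrastive-learning definitions: alignment as the mean squared distance between L2-normalized positive pairs, and uniformity as the log of the mean Gaussian potential over all pairs with temperature t=2. The function names and toy vectors are illustrative and are not taken from this model's evaluation code.

```python
import math
from itertools import combinations


def _normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def _sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def alignment(pairs):
    """Mean squared distance between normalized positive pairs (lower = tighter)."""
    dists = [_sq_dist(_normalize(x), _normalize(y)) for x, y in pairs]
    return sum(dists) / len(dists)


def uniformity(embeddings, t=2.0):
    """Log mean Gaussian potential over distinct pairs (more negative = more spread)."""
    unit = [_normalize(e) for e in embeddings]
    potentials = [math.exp(-t * _sq_dist(u, v)) for u, v in combinations(unit, 2)]
    return math.log(sum(potentials) / len(potentials))


# Toy 2-D embeddings, invented for illustration only.
pairs = [([1.0, 0.0], [1.0, 0.0]), ([0.0, 1.0], [0.0, 2.0])]
points = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

print(alignment(pairs))    # pairs coincide after normalization
print(uniformity(points))  # negative: the four points are spread around the circle
```

Under these assumed definitions, the reported Alignment of 0.4756 and Uniformity of -2.9679 would indicate fairly tight positive pairs and well-spread embeddings, but the card itself does not confirm which formulas were used.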

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 100.0
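
To make the schedule concrete, here is a rough sketch of a cosine learning-rate schedule with linear warmup using the values listed above. It assumes 3200 total optimizer steps (the final step reported in the training results) and that the warmup length is the warmup ratio times the total steps. The helper lr_at is hypothetical and is not the Trainer's own implementation.

```python
import math

BASE_LR = 0.001       # learning_rate from the list above
WARMUP_RATIO = 0.06   # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 3200    # assumption: final step count in the training results

# Assumption: warmup length = ratio * total steps (192 with these values).
WARMUP_STEPS = round(WARMUP_RATIO * TOTAL_STEPS)


def lr_at(step):
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 96, 192, 1696, 3200):
    print(step, f"{lr_at(step):.6f}")
```

Under these assumptions the rate climbs to its peak of 0.001 at step 192, falls to half of that at step 1696, and reaches zero at step 3200.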

Training results

Training Loss  Epoch   Step  Validation Loss  Centroid Acc  Centroid Macro F1  Knn Acc  Knn Macro F1  Alignment  Uniformity  Combined Score
No log         0       0     2.2593           0.6792        0.6917             0.9057   0.9100        0.3530     -0.8695     0.7644
1.1652         3.125   100   0.8648           0.9057        0.9109             0.9057   0.9035        0.6337     -2.5491     0.9084
1.0714         6.25    200   0.8627           0.9057        0.9031             0.9623   0.9612        0.4076     -1.9529     0.9225
0.6401         9.375   300   0.5464           0.9245        0.9266             0.9434   0.9458        0.4441     -2.2872     0.9330
0.2931         12.5    400   0.3187           0.9623        0.9634             0.9434   0.9441        0.4535     -2.5949     0.9570
0.3475         15.625  500   0.2551           0.9811        0.9805             0.9811   0.9805        0.4356     -2.5167     0.9805
0.2601         18.75   600   0.2835           1.0           1.0                1.0      1.0           0.4602     -2.7264     1.0
0.1847         21.875  700   0.2680           0.9811        0.9805             0.9623   0.9626        0.4668     -2.7257     0.9745
0.0417         25.0    800   0.2578           0.9811        0.9805             0.9623   0.9590        0.4776     -2.8021     0.9733
0.0286         28.125  900   0.2974           0.9623        0.9609             0.9811   0.9805        0.5334     -2.9346     0.9674
0.0247         31.25   1000  0.2845           0.9811        0.9805             0.9811   0.9805        0.4813     -2.8991     0.9805
0.0527         34.375  1100  0.2208           0.9811        0.9805             1.0      1.0           0.4827     -2.7601     0.9870
0.0257         37.5    1200  0.2240           0.9434        0.9414             0.9811   0.9805        0.4777     -2.8245     0.9544
0.0135         40.625  1300  0.2672           1.0           1.0                1.0      1.0           0.5095     -2.9311     1.0
0.0158         43.75   1400  0.0937           0.9811        0.9805             1.0      1.0           0.4918     -2.9310     0.9870
0.0237         46.875  1500  0.2464           0.9623        0.9609             1.0      1.0           0.5014     -2.9553     0.9740
0.0044         50.0    1600  0.2580           0.9623        0.9609             0.9811   0.9805        0.5060     -2.9756     0.9674
0.0349         53.125  1700  0.1685           0.9811        0.9805             1.0      1.0           0.4694     -2.9311     0.9870
0.0033         56.25   1800  0.1963           0.9811        0.9805             0.9811   0.9805        0.4798     -2.9585     0.9805
0.0407         59.375  1900  0.1823           0.9811        0.9805             1.0      1.0           0.4769     -2.9500     0.9870
0.0121         62.5    2000  0.1934           0.9811        0.9805             1.0      1.0           0.4678     -2.8953     0.9870
0.0029         65.625  2100  0.1388           0.9811        0.9805             0.9811   0.9805        0.4724     -2.9480     0.9805
0.0025         68.75   2200  0.1920           0.9811        0.9805             0.9811   0.9805        0.4795     -2.9718     0.9805
0.0025         71.875  2300  0.1420           0.9811        0.9805             1.0      1.0           0.4727     -2.9765     0.9870
0.0019         75.0    2400  0.1513           0.9811        0.9805             1.0      1.0           0.4682     -2.9583     0.9870
0.0018         78.125  2500  0.1918           0.9811        0.9805             1.0      1.0           0.4728     -2.9570     0.9870
0.002          81.25   2600  0.1485           0.9811        0.9805             1.0      1.0           0.4649     -2.9534     0.9870
0.016          84.375  2700  0.1734           0.9811        0.9805             1.0      1.0           0.4725     -2.9703     0.9870
0.0017         87.5    2800  0.1781           0.9811        0.9805             1.0      1.0           0.4739     -2.9691     0.9870
0.0019         90.625  2900  0.2054           0.9811        0.9805             1.0      1.0           0.4790     -2.9712     0.9870
0.002          93.75   3000  0.1901           0.9811        0.9805             1.0      1.0           0.4759     -2.9673     0.9870
0.0381         96.875  3100  0.1894           0.9811        0.9805             1.0      1.0           0.4757     -2.9680     0.9870
0.002          100.0   3200  0.1883           0.9811        0.9805             1.0      1.0           0.4756     -2.9679     0.9870
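
The card does not say how Combined Score is computed. The reported values appear consistent with a weighted mean that puts weight 2/3 on Centroid Macro F1 and 1/3 on Knn Macro F1. This is an inference from the numbers in the table, not a documented formula, and the function below is a hypothetical reconstruction that checks the guess against a few rows.

```python
def combined_score(centroid_macro_f1, knn_macro_f1):
    """Hypothetical reconstruction: weighted mean, 2/3 centroid F1 and 1/3 kNN F1."""
    return (2.0 * centroid_macro_f1 + knn_macro_f1) / 3.0


# (Centroid Macro F1, Knn Macro F1, reported Combined Score), copied from the table.
rows = [
    (0.9109, 0.9035, 0.9084),  # step 100
    (0.9266, 0.9458, 0.9330),  # step 300
    (0.9634, 0.9441, 0.9570),  # step 400
    (0.9805, 0.9626, 0.9745),  # step 700
    (0.9805, 1.0, 0.9870),     # step 3200
]

for centroid_f1, knn_f1, reported in rows:
    estimate = combined_score(centroid_f1, knn_f1)
    # The table values are rounded to four decimals, so allow a small tolerance.
    print(f"{estimate:.4f} vs {reported:.4f}", abs(estimate - reported) < 2e-4)
```

If the guess is right, a checkpoint with a perfect Knn Macro F1 but a Centroid Macro F1 of 0.9805 scores 0.9870, which matches the final row.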

Framework versions

  • Transformers 4.56.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.0

Safetensors

  • Model size: 1.31M params
  • Tensor type: F32