secret-model-stage-1-4B-512

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.1883
Centroid Acc: 0.9811
Centroid Macro F1: 0.9805
Knn Acc: 1.0
Knn Macro F1: 1.0
Alignment: 0.4756
Uniformity: -2.9679
Combined Score: 0.9870
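The card does not define Alignment or Uniformity. A minimal sketch follows, assuming they use the common contrastive-learning definitions: mean squared distance between positive pairs, and the log of the mean Gaussian potential over all pairs. The function names, the temperature `t=2.0`, and the toy points are illustrative assumptions, not taken from the training code.

```python
import math
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment(pairs):
    """Mean squared distance between positive pairs (lower means tighter pairs)."""
    return sum(sq_dist(a, b) for a, b in pairs) / len(pairs)


def uniformity(points, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs.

    More negative values mean the points are spread more evenly.
    t=2.0 is an assumed temperature, not a value from this card.
    """
    potentials = [math.exp(-t * sq_dist(a, b)) for a, b in combinations(points, 2)]
    return math.log(sum(potentials) / len(potentials))


# Toy embeddings: four unit vectors spread around the circle (hypothetical data).
points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
positive_pairs = [((1.0, 0.0), (1.0, 0.0)), ((1.0, 0.0), (0.0, 1.0))]

print("alignment:", alignment(positive_pairs))
print("uniformity:", uniformity(points))
```

Under these definitions both metrics are "lower is better", which matches the table's trend of Uniformity becoming more negative as training proceeds.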

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 16
eval_batch_size: 64
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.06
num_epochs: 100.0
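To make the schedule concrete, here is a minimal sketch of the learning rate implied by these settings. It assumes the usual linear-warmup-then-cosine-decay shape, a total of 3200 optimizer steps (the final step in the results table, where 100 steps correspond to 3.125 epochs, i.e. 32 steps per epoch), and warmup steps computed as the ceiling of `warmup_ratio * total_steps`. The function `lr_at` is hypothetical and is not part of any library.

```python
import math

BASE_LR = 0.001        # learning_rate from the card
WARMUP_RATIO = 0.06    # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 3200     # assumption: last step listed in the results table


def lr_at(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given optimizer step (illustrative sketch only)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 96, 192, 1696, 3200):
    print(step, f"{lr_at(step):.6f}")
```

With these assumptions the warmup covers the first 192 steps (6 epochs), after which the rate decays smoothly to zero at step 3200.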

Training results

Training Loss	Epoch	Step	Validation Loss	Centroid Acc	Centroid Macro F1	Knn Acc	Knn Macro F1	Alignment	Uniformity	Combined Score
No log	0	0	2.2593	0.6792	0.6917	0.9057	0.9100	0.3530	-0.8695	0.7644
1.1652	3.125	100	0.8648	0.9057	0.9109	0.9057	0.9035	0.6337	-2.5491	0.9084
1.0714	6.25	200	0.8627	0.9057	0.9031	0.9623	0.9612	0.4076	-1.9529	0.9225
0.6401	9.375	300	0.5464	0.9245	0.9266	0.9434	0.9458	0.4441	-2.2872	0.9330
0.2931	12.5	400	0.3187	0.9623	0.9634	0.9434	0.9441	0.4535	-2.5949	0.9570
0.3475	15.625	500	0.2551	0.9811	0.9805	0.9811	0.9805	0.4356	-2.5167	0.9805
0.2601	18.75	600	0.2835	1.0	1.0	1.0	1.0	0.4602	-2.7264	1.0
0.1847	21.875	700	0.2680	0.9811	0.9805	0.9623	0.9626	0.4668	-2.7257	0.9745
0.0417	25.0	800	0.2578	0.9811	0.9805	0.9623	0.9590	0.4776	-2.8021	0.9733
0.0286	28.125	900	0.2974	0.9623	0.9609	0.9811	0.9805	0.5334	-2.9346	0.9674
0.0247	31.25	1000	0.2845	0.9811	0.9805	0.9811	0.9805	0.4813	-2.8991	0.9805
0.0527	34.375	1100	0.2208	0.9811	0.9805	1.0	1.0	0.4827	-2.7601	0.9870
0.0257	37.5	1200	0.2240	0.9434	0.9414	0.9811	0.9805	0.4777	-2.8245	0.9544
0.0135	40.625	1300	0.2672	1.0	1.0	1.0	1.0	0.5095	-2.9311	1.0
0.0158	43.75	1400	0.0937	0.9811	0.9805	1.0	1.0	0.4918	-2.9310	0.9870
0.0237	46.875	1500	0.2464	0.9623	0.9609	1.0	1.0	0.5014	-2.9553	0.9740
0.0044	50.0	1600	0.2580	0.9623	0.9609	0.9811	0.9805	0.5060	-2.9756	0.9674
0.0349	53.125	1700	0.1685	0.9811	0.9805	1.0	1.0	0.4694	-2.9311	0.9870
0.0033	56.25	1800	0.1963	0.9811	0.9805	0.9811	0.9805	0.4798	-2.9585	0.9805
0.0407	59.375	1900	0.1823	0.9811	0.9805	1.0	1.0	0.4769	-2.9500	0.9870
0.0121	62.5	2000	0.1934	0.9811	0.9805	1.0	1.0	0.4678	-2.8953	0.9870
0.0029	65.625	2100	0.1388	0.9811	0.9805	0.9811	0.9805	0.4724	-2.9480	0.9805
0.0025	68.75	2200	0.1920	0.9811	0.9805	0.9811	0.9805	0.4795	-2.9718	0.9805
0.0025	71.875	2300	0.1420	0.9811	0.9805	1.0	1.0	0.4727	-2.9765	0.9870
0.0019	75.0	2400	0.1513	0.9811	0.9805	1.0	1.0	0.4682	-2.9583	0.9870
0.0018	78.125	2500	0.1918	0.9811	0.9805	1.0	1.0	0.4728	-2.9570	0.9870
0.002	81.25	2600	0.1485	0.9811	0.9805	1.0	1.0	0.4649	-2.9534	0.9870
0.016	84.375	2700	0.1734	0.9811	0.9805	1.0	1.0	0.4725	-2.9703	0.9870
0.0017	87.5	2800	0.1781	0.9811	0.9805	1.0	1.0	0.4739	-2.9691	0.9870
0.0019	90.625	2900	0.2054	0.9811	0.9805	1.0	1.0	0.4790	-2.9712	0.9870
0.002	93.75	3000	0.1901	0.9811	0.9805	1.0	1.0	0.4759	-2.9673	0.9870
0.0381	96.875	3100	0.1894	0.9811	0.9805	1.0	1.0	0.4757	-2.9680	0.9870
0.002	100.0	3200	0.1883	0.9811	0.9805	1.0	1.0	0.4756	-2.9679	0.9870
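The card does not say how Combined Score is computed. The reported values appear numerically consistent with a weighted mean of the two macro F1 columns, (2 × Centroid Macro F1 + Knn Macro F1) / 3. The sketch below checks that guess against a few rows copied from the table; the formula is an inference from the numbers, not a documented definition, and `guessed_combined` is a hypothetical name.

```python
# (centroid_macro_f1, knn_macro_f1, reported_combined_score), copied from the table.
rows = [
    (0.6917, 0.9100, 0.7644),  # step 0
    (0.9109, 0.9035, 0.9084),  # step 100
    (0.9031, 0.9612, 0.9225),  # step 200
    (0.9805, 0.9626, 0.9745),  # step 700
    (0.9805, 1.0, 0.9870),     # step 3200
]


def guessed_combined(centroid_f1, knn_f1):
    # Hypothetical formula inferred from the table values.
    return (2 * centroid_f1 + knn_f1) / 3


# Largest absolute difference between the guess and the reported score.
max_gap = max(abs(guessed_combined(c, k) - r) for c, k, r in rows)
print(f"largest gap to reported value: {max_gap:.5f}")
```

A gap below the four-decimal rounding of the table means the guess cannot be ruled out by these rows, though other formulas could fit as well.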

Framework versions

Transformers 4.56.0
Pytorch 2.8.0+cu128
Datasets 4.0.0
Tokenizers 0.22.0

Safetensors

Model size: 1.31M params
Tensor type: F32