de_wiki_mlm_42

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0027
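
Assuming the loss is mean cross-entropy over the masked tokens (the standard MLM objective), this corresponds to a perplexity of exp(3.0027) ≈ 20.1 on those tokens.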

Model description

More information needed

Intended uses & limitations

More information needed
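
The name de_wiki_mlm_42 suggests a masked language model trained on German Wikipedia. Assuming the checkpoint exposes a standard fill-mask head, usage would look like the sketch below; the repository id is a placeholder:

```python
from transformers import pipeline

# Placeholder repository id; substitute the actual checkpoint location.
fill_mask = pipeline("fill-mask", model="your-username/de_wiki_mlm_42")

# Use the tokenizer's own mask token rather than hard-coding one.
masked = f"Berlin ist die Hauptstadt von {fill_mask.tokenizer.mask_token}."
for prediction in fill_mask(masked):
    print(prediction["token_str"], round(prediction["score"], 3))
```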

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 40000
  • training_steps: 100000
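
Assuming the run used the Hugging Face Trainer (which emits auto-generated cards of exactly this form), these values map onto a TrainingArguments object roughly as follows; output_dir is a placeholder, and the logging and evaluation intervals are inferred from the results table below rather than documented:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="de_wiki_mlm_42",     # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 16 * 2 = 32
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=40_000,
    max_steps=100_000,
    eval_strategy="steps",           # assumed; eval every 2,000 steps per the table
    eval_steps=2_000,
    logging_steps=4_000,             # assumed from the repeated loss values below
)
```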

Training results

| Training Loss | Epoch   | Step   | Validation Loss |
|:-------------:|:-------:|:------:|:---------------:|
| No log        | 1.0796  | 2000   | 8.1248          |
| 8.1604        | 2.1592  | 4000   | 7.4613          |
| 8.1604        | 3.2389  | 6000   | 7.3457          |
| 7.3684        | 4.3185  | 8000   | 7.2823          |
| 7.3684        | 5.3981  | 10000  | 7.1806          |
| 7.2121        | 6.4777  | 12000  | 7.1159          |
| 7.2121        | 7.5574  | 14000  | 7.0322          |
| 7.0695        | 8.6370  | 16000  | 7.0128          |
| 7.0695        | 9.7166  | 18000  | 6.9382          |
| 6.9563        | 10.7962 | 20000  | 6.9027          |
| 6.9563        | 11.8758 | 22000  | 6.8521          |
| 6.8584        | 12.9555 | 24000  | 6.7639          |
| 6.8584        | 14.0351 | 26000  | 6.6782          |
| 6.6954        | 15.1147 | 28000  | 6.5272          |
| 6.6954        | 16.1943 | 30000  | 6.3891          |
| 6.4335        | 17.2740 | 32000  | 6.1050          |
| 6.4335        | 18.3536 | 34000  | 5.6402          |
| 5.7799        | 19.4332 | 36000  | 5.1583          |
| 5.7799        | 20.5128 | 38000  | 4.8938          |
| 5.0133        | 21.5924 | 40000  | 4.6340          |
| 5.0133        | 22.6721 | 42000  | 4.4619          |
| 4.5804        | 23.7517 | 44000  | 4.2638          |
| 4.5804        | 24.8313 | 46000  | 4.1289          |
| 4.2594        | 25.9109 | 48000  | 4.0013          |
| 4.2594        | 26.9906 | 50000  | 3.8840          |
| 4.0135        | 28.0702 | 52000  | 3.8026          |
| 4.0135        | 29.1498 | 54000  | 3.7119          |
| 3.8273        | 30.2294 | 56000  | 3.6407          |
| 3.8273        | 31.3090 | 58000  | 3.5547          |
| 3.6814        | 32.3887 | 60000  | 3.5167          |
| 3.6814        | 33.4683 | 62000  | 3.4455          |
| 3.56          | 34.5479 | 64000  | 3.4070          |
| 3.56          | 35.6275 | 66000  | 3.3657          |
| 3.4651        | 36.7072 | 68000  | 3.3225          |
| 3.4651        | 37.7868 | 70000  | 3.3028          |
| 3.3776        | 38.8664 | 72000  | 3.2467          |
| 3.3776        | 39.9460 | 74000  | 3.2158          |
| 3.3098        | 41.0256 | 76000  | 3.2050          |
| 3.3098        | 42.1053 | 78000  | 3.1554          |
| 3.2499        | 43.1849 | 80000  | 3.1305          |
| 3.2499        | 44.2645 | 82000  | 3.1254          |
| 3.2031        | 45.3441 | 84000  | 3.0903          |
| 3.2031        | 46.4238 | 86000  | 3.0811          |
| 3.1596        | 47.5034 | 88000  | 3.0615          |
| 3.1596        | 48.5830 | 90000  | 3.0489          |
| 3.1274        | 49.6626 | 92000  | 3.0343          |
| 3.1274        | 50.7422 | 94000  | 3.0256          |
| 3.1001        | 51.8219 | 96000  | 3.0236          |
| 3.1001        | 52.9015 | 98000  | 3.0081          |
| 3.0805        | 53.9811 | 100000 | 3.0027          |
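
The "No log" entry and the pairwise-repeated training-loss values suggest that training loss was logged every 4,000 steps while evaluation ran every 2,000 steps, so each logged value appears beside two evaluation rows.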

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1