Bochkov committed
Commit d51f566 · verified · 1 Parent(s): 23c4fd0

Update README.md

Files changed (1): README.md (+10 -8)
README.md CHANGED
@@ -45,14 +45,16 @@ Performance is not comparable to SOTA but shows competitive compositional skills
 
 For direct benchmarking, see also [Bochkov/demo_bvv_unfrozen_ru] — an identical architecture and dataset, but with standard trainable token embeddings.
 Enables seamless fusion/MoE with Bochkov/demo_bvv_zh and Bochkov/demo_bvv_moe (merged model) due to shared embedding space.
-Main evaluation
-MMLU avg: 22.3% ±0.1
-ARC-e: 23.0%
-ARC-c: 24.6%
-CommonsenseQA: 20.1%
-SQUAD: 14.8%
-BLEU [en-ru]: 6.4%
-BLEU [ru-en]: 8.8%
+
+## Key results
+
+- **MMLU avg**: 22.3% ±0.1
+- **ARC-e**: 23.0%
+- **ARC-c**: 24.6%
+- **CommonsenseQA**: 20.1%
+- **SQUAD**: 14.8%
+- **BLEU [en-ru]**: 6.4%
+- **BLEU [ru-en]**: 8.8%
 
 This work demonstrates that transformer blocks, not token embeddings, carry the semantic burden in LLMs — a step toward modular, fusable, multilingual LMs.
 
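The "shared embedding space" claim in the README text above is what makes fusion/MoE merging possible: every model in this family reuses the same frozen, non-trainable token embeddings, so their representations stay aligned across languages. Below is a minimal sketch of how one might verify that property, assuming the repos load via `AutoModelForCausalLM` with `trust_remote_code`; note that `Bochkov/demo_bvv_ru` is a hypothetical id for this model's own repo (only `demo_bvv_unfrozen_ru`, `demo_bvv_zh`, and `demo_bvv_moe` are named above).

```python
# Sketch: checking that two models in this family share a frozen embedding space,
# the property the README credits for enabling fusion/MoE merging.
# ASSUMPTIONS: both repos load via AutoModelForCausalLM with trust_remote_code;
# "Bochkov/demo_bvv_ru" is a hypothetical id standing in for this model's repo.
import torch
from transformers import AutoModelForCausalLM

model_ru = AutoModelForCausalLM.from_pretrained("Bochkov/demo_bvv_ru", trust_remote_code=True)
model_zh = AutoModelForCausalLM.from_pretrained("Bochkov/demo_bvv_zh", trust_remote_code=True)

# If the embedding space is truly shared, the (frozen) input embedding matrices
# should be identical; alignment at this layer is what lets the transformer
# blocks of different models be fused or routed as experts.
same = torch.equal(
    model_ru.get_input_embeddings().weight,
    model_zh.get_input_embeddings().weight,
)
print(f"shared embedding space: {same}")
```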