Improve model card: Add metadata and prominent paper/code links
This PR enhances the model card for `Bochkov/best_bvv_ru` by:
- Adding `pipeline_tag: text-generation` to the YAML metadata, correctly categorizing the model for discoverability on the Hugging Face Hub.
- Adding `library_name: transformers` to the YAML metadata, which enables the "Use in Transformers" widget and improves ecosystem integration (see the usage sketch after this list).
- Adding prominent links to the associated paper, [Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate](https://huggingface.co/papers/2507.07129), and the relevant GitHub repository, [https://github.com/Bochkov/BVV241_Tokenizer_Benchmarking](https://github.com/Bochkov/BVV241_Tokenizer_Benchmarking), at the top of the model card for easy access.
These changes improve the model card's completeness, usability, and discoverability.
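
For context, the practical effect of the added metadata: with `library_name: transformers` and `pipeline_tag: text-generation` set, the Hub can surface a standard `transformers` loading snippet for this model. The sketch below is illustrative only and not part of the diff; it assumes the checkpoint loads with the stock `transformers` auto classes and the high-level `pipeline` API, as the card's existing `model.generate(...)` example suggests, and the prompt is just a placeholder.

```python
# Illustrative sketch (not part of this PR's diff): what the added
# `library_name: transformers` / `pipeline_tag: text-generation` metadata
# implies about how the model is expected to be loaded and used.
# Assumes the checkpoint works with the stock text-generation pipeline.
from transformers import pipeline

generator = pipeline("text-generation", model="Bochkov/best_bvv_ru")

result = generator(
    "Привет, мир!",     # placeholder prompt; the card describes an English-Russian corpus
    max_new_tokens=50,
    do_sample=True,
    top_p=0.95,
)
print(result[0]["generated_text"])
```

The card's own `model.generate(...)` snippet (visible in the second hunk of the diff below) remains the canonical usage example.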
Diff of `README.md`:

````diff
@@ -8,12 +8,17 @@ tags:
 - conceptual-demo
 - MoE-ready
 - transformer
+pipeline_tag: text-generation
+library_name: transformers
 ---
 
 # best_bvv_ru
 
 **Proof-of-concept Transformer LM with frozen, non-semantic token embeddings trained on a small English-Russian corpus.**
 
+📚 Paper: [Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate](https://huggingface.co/papers/2507.07129) ([arXiv](https://arxiv.org/abs/2507.07129))
+💻 Code: [https://github.com/Bochkov/BVV241_Tokenizer_Benchmarking](https://github.com/Bochkov/BVV241_Tokenizer_Benchmarking)
+
 **This model is part of a series of models designed to demonstrate:**
 - The viability of transformer language models where the embedding layer is precomputed from non-semantic (Unicode/visual) features and entirely _frozen_ during training.
 - The possibility of modular/federated model fusion (MoE) by combining models with a shared token embedding matrix, without any additional retraining or alignment.
@@ -98,4 +103,5 @@ outputs = model.generate(
     top_p=0.95,
     do_sample=True
 )
-print(tokenizer.decode(outputs[0]))
+print(tokenizer.decode(outputs[0]))
+```
````