Improve model card: Add metadata and prominent paper/code links
This PR enhances the model card for `Bochkov/best_bvv_ru` by:
- Adding `pipeline_tag: text-generation` to the YAML metadata, correctly categorizing the model for discoverability on the Hugging Face Hub.
- Adding `library_name: transformers` to the YAML metadata, which enables the "Use in Transformers" widget and improves ecosystem integration (see the usage sketch after this list).
- Adding prominent links to the associated paper, [Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate](https://huggingface.co/papers/2507.07129), and the relevant GitHub repository, [https://github.com/Bochkov/BVV241_Tokenizer_Benchmarking](https://github.com/Bochkov/BVV241_Tokenizer_Benchmarking), at the top of the model card for easy access.
These changes improve the model card's completeness, usability, and discoverability.
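
For context, the practical effect of the added metadata: with `library_name: transformers` and `pipeline_tag: text-generation` set, the Hub can surface a standard `transformers` loading snippet for this model. The sketch below is illustrative only and not part of the diff; it assumes the checkpoint loads with the stock `transformers` auto classes and the high-level `pipeline` API, as the card's existing `model.generate(...)` example suggests, and the prompt is just a placeholder.

```python
# Illustrative sketch (not part of this PR's diff): what the added
# `library_name: transformers` / `pipeline_tag: text-generation` metadata
# implies about how the model is expected to be loaded and used.
# Assumes the checkpoint works with the stock text-generation pipeline.
from transformers import pipeline

generator = pipeline("text-generation", model="Bochkov/best_bvv_ru")

result = generator(
    "Привет, мир!",     # placeholder prompt; the card describes an English-Russian corpus
    max_new_tokens=50,
    do_sample=True,
    top_p=0.95,
)
print(result[0]["generated_text"])
```

The card's own `model.generate(...)` snippet (visible in the second hunk of the diff below) remains the canonical usage example.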
Diff of `README.md`:

````diff
@@ -8,12 +8,17 @@ tags:
 - conceptual-demo
 - MoE-ready
 - transformer
+pipeline_tag: text-generation
+library_name: transformers
 ---
 
 # best_bvv_ru
 
 **Proof-of-concept Transformer LM with frozen, non-semantic token embeddings trained on a small English-Russian corpus.**
 
+📚 Paper: [Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate](https://huggingface.co/papers/2507.07129) ([arXiv](https://arxiv.org/abs/2507.07129))
+💻 Code: [https://github.com/Bochkov/BVV241_Tokenizer_Benchmarking](https://github.com/Bochkov/BVV241_Tokenizer_Benchmarking)
+
 **This model is part of a series of models designed to demonstrate:**
 - The viability of transformer language models where the embedding layer is precomputed from non-semantic (Unicode/visual) features and entirely _frozen_ during training.
 - The possibility of modular/federated model fusion (MoE) by combining models with a shared token embedding matrix, without any additional retraining or alignment.
@@ -98,4 +103,5 @@ outputs = model.generate(
     top_p=0.95,
     do_sample=True
 )
-print(tokenizer.decode(outputs[0]))
+print(tokenizer.decode(outputs[0]))
+```
````