i3-architecture Collection
Note: The models are listed in Hugging Face's default order, so the latest model appears at the bottom.
7 items • Updated 3 days ago
Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models
Paper • 2507.17702 • Published Jul 23