ParoQuant Collection Pairwise Rotation Quantization for Efficient Reasoning LLM Inference • 16 items • Updated 5 days ago • 14
Qwen3.5 Collection Qwen3.5 is Qwen's new model family including Qwen3.5 Small: 0.8B, 2B, 4B, 9B and Qwen3.5 Medium: 35B-A3B, 27B, 122B-A10B and 397B-A17B. • 25 items • Updated 7 days ago • 115
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence Paper • 2602.08683 • Published Feb 9 • 52
Unsloth Dynamic 2.0 Quants Collection New 2.0 version of our Dynamic GGUF + Quants. Dynamic 2.0 achieves superior accuracy & SOTA quantization performance. • 80 items • Updated 7 days ago • 474
SpecBundle Collection A collection of production-grade draft models for speculative decoding • 17 items • Updated 5 days ago • 17
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published Nov 9, 2025 • 133
ColModernVBERT Collection Resources for ColModernVBERT – the document retrieval–optimized variant of ModernVBERT • 5 items • Updated Oct 3, 2025 • 7
BERT Hash Nano Models Collection Set of BERT models with a modified embeddings layer • 8 items • Updated Jan 29 • 9
CoDA Collection CoDA is Salesforce AI Research's open, lightweight and diffusion-based language model. • 2 items • Updated Oct 31, 2025 • 5
AcceLLM: Accelerating LLM Inference using Redundancy for Load Balancing and Data Locality Paper • 2411.05555 • Published Nov 8, 2024 • 6
SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights Paper • 2509.22944 • Published Sep 26, 2025 • 81
Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face • Jul 29, 2025 • 218