# Voxtral Mini 4B Realtime - 8-bit MLX
This is an 8-bit quantized MLX version of Voxtral Mini 4B Realtime by Mistral AI, converted using voxmlx. It was created for use with Supervoxtral, enabling blazingly fast realtime transcription on macOS.
## Model Details
- Base model: mistralai/Voxtral-Mini-4B-Realtime-2602
- Quantization: 8-bit (group size 64)
- Framework: MLX
- Parameters: ~4B (3.4B language model + 970M audio encoder)
- License: Apache 2.0
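To illustrate what "8-bit (group size 64)" means, here is a minimal NumPy sketch of grouped affine quantization: each run of 64 weights gets its own scale and offset, and values are stored as unsigned 8-bit integers. This is an illustrative approximation of the general technique, not the exact MLX kernel or the voxmlx conversion code.

```python
import numpy as np

def quantize_8bit(weights, group_size=64):
    """Affine 8-bit quantization with a per-group scale and offset (sketch)."""
    w = weights.reshape(-1, group_size)
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 255.0          # 8 bits -> 256 levels
    scale = np.where(scale == 0, 1.0, scale)  # avoid division by zero in flat groups
    q = np.round((w - w_min) / scale).astype(np.uint8)
    return q, scale, w_min

def dequantize_8bit(q, scale, w_min):
    """Reconstruct approximate float weights from the packed representation."""
    return q.astype(np.float32) * scale + w_min

# Round-trip a small random weight tensor and check the reconstruction error.
rng = np.random.default_rng(0)
w = rng.standard_normal(128).astype(np.float32)
q, s, m = quantize_8bit(w)
w_hat = dequantize_8bit(q, s, m).reshape(-1)
print(float(np.abs(w - w_hat).max()))  # worst-case error is about half a quantization step
```

Smaller group sizes track local weight statistics more closely at the cost of storing more scales and offsets; 64 is a common middle ground for 8-bit MLX conversions.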
## Description
Voxtral Mini is a speech-to-text model that supports 13+ languages with sub-500ms latency. This version has been quantized to 8-bit precision for efficient inference on Apple Silicon using the MLX framework.
## Credits
- Original model by Mistral AI
- MLX conversion tooling by voxmlx