tags:
- audio
- automatic-speech-recognition
- hf-asr-leaderboard
- open4bits
widget:
- example_title: Librispeech sample 1
  src: https://cdn-media.huggingface.co/speech_samples/sample1.flac

base_model:
- openai/whisper-base
---
# Open4bits / Whisper Base FP16

This repository provides the **Whisper Base model converted to FP16 (float16) precision**, published by Open4bits to enable more efficient inference while maintaining transcription quality.

The underlying Whisper model and architecture were developed by **OpenAI**. This repository contains only a precision-converted version of the original model weights.

The model is designed for multilingual speech-to-text tasks and can be used in research, experimentation, and production ASR pipelines.

---
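For the ASR pipelines mentioned above, a minimal transcription sketch follows. The repository id `open4bits/whisper-base-fp16` and the input file `audio.flac` are illustrative placeholders, not confirmed names; only stock Hugging Face `transformers` calls are used.

```python
# Usage sketch: transcribe audio with the FP16 Whisper Base checkpoint.
# MODEL_ID is a hypothetical repo id; replace it with this repository's actual id.
MODEL_ID = "open4bits/whisper-base-fp16"

def build_asr_pipeline(model_id: str = MODEL_ID):
    """Create a transformers ASR pipeline that keeps the weights in FP16."""
    import torch
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        torch_dtype=torch.float16,  # load the stored FP16 weights as-is
        device_map="auto",          # place the model on a GPU when available
    )

if __name__ == "__main__":
    asr = build_asr_pipeline()
    result = asr("audio.flac")  # path to any audio file ffmpeg can decode
    print(result["text"])
```

Passing `torch_dtype=torch.float16` avoids upcasting the checkpoint to FP32 at load time, which is what preserves the memory savings described below.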
## Model Overview

Whisper is a sequence-to-sequence transformer model developed by OpenAI for automatic speech recognition and speech translation.
This release uses the **Base** variant and preserves the original architecture while reducing memory usage through FP16 precision.

---
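For readers curious how a precision-converted checkpoint like this can be produced, here is a sketch using stock Transformers APIs. It is an illustration, not the exact script used for this release, and the output directory name is a placeholder.

```python
# Sketch: casting Whisper Base from FP32 to FP16 and re-saving it.
def convert_to_fp16(src: str = "openai/whisper-base",
                    dst: str = "whisper-base-fp16") -> None:
    """Load the original checkpoint, cast all parameters to float16, save."""
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    model = WhisperForConditionalGeneration.from_pretrained(src)  # FP32 weights
    model = model.half()        # cast every parameter to float16
    model.save_pretrained(dst)  # write the converted weights and config
    # Copy the processor (tokenizer + feature extractor) alongside the weights.
    WhisperProcessor.from_pretrained(src).save_pretrained(dst)

if __name__ == "__main__":
    convert_to_fp16()
```

Because the cast touches only parameter dtypes, the architecture and weight tying noted in the details below are unaffected.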
## Model Details

- **Architecture:** Whisper Base
- **Parameters:** ~74 million
- **Precision:** float16 (FP16)
- **Task:** Automatic Speech Recognition (ASR)
- **Languages:** Multilingual
- **Weight tying:** Preserved
- **Compatibility:** Hugging Face Transformers, PyTorch

This conversion improves inference speed and lowers VRAM requirements compared to FP32 versions, making it suitable for deployment on consumer and server-grade GPUs.

---
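As a rough illustration of the memory claim above (the ~74 million parameter count comes from the details list; the per-parameter byte widths are the standard IEEE-754 sizes):

```python
# Back-of-the-envelope weight-storage estimate for Whisper Base.
N_PARAMS = 74_000_000  # approximate parameter count from the model details

def weights_mb(n_params: int, bytes_per_param: int) -> float:
    """Approximate weight storage in megabytes (1 MB = 1e6 bytes)."""
    return n_params * bytes_per_param / 1e6

fp32_mb = weights_mb(N_PARAMS, 4)  # float32: 4 bytes per parameter
fp16_mb = weights_mb(N_PARAMS, 2)  # float16: 2 bytes per parameter

print(f"FP32 weights: ~{fp32_mb:.0f} MB, FP16 weights: ~{fp16_mb:.0f} MB")
# prints: FP32 weights: ~296 MB, FP16 weights: ~148 MB
```

Halving the bytes per parameter halves the weight footprint; runtime VRAM use is higher once activations and the KV cache are included, but scales the same way.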
## Intended Use

This model is intended for:

- Speech-to-text transcription
- Multilingual ASR applications
- Research and benchmarking
- Efficient inference in low-memory environments

---
## Limitations

- Performance depends on audio quality, language, and accent
- Inherits known limitations of the Whisper Base architecture
- Not fine-tuned for domain-specific or highly noisy audio

---
## License

This model is released under the **Apache License 2.0**.
The original Whisper model and associated intellectual property are owned by OpenAI.

---
## Support

If you find this model useful, please consider supporting the project.
Your support helps us continue releasing and maintaining high-quality open models.
Support us with a ❤️ on the model page.