Smol Urdu TTS V1

An Urdu text-to-speech (TTS) model fine-tuned from SmolLM2-135M-Instruct.

Note: This is an experimental model, meant as a starting point for the Urdu community. Output quality is not yet perfect, and improvements are coming soon.

Audio Samples

Below are sample audio outputs generated by Smol Urdu TTS V1 for different Urdu text inputs.

Training Metrics Snapshot

Figure: Training dynamics illustrating loss convergence, learning rate decay, and stable gradient norms during fine-tuning.

Usage

See the GitHub repository for usage instructions: https://github.com/Proxima-AI-Co/smolurdu

Model Details

Base Model: SmolLM2-135M-Instruct
Parameters: 135M
Language: Urdu
Sample Rate: 24 kHz
Checkpoint: 120,000 steps
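
Because the model produces audio at 24 kHz, any waveform it generates should be saved with that sample rate, or playback will sound too fast or too slow. The following is a minimal sketch using only the Python standard library. It assumes the synthesized audio is available as a mono sequence of float samples in the range [-1.0, 1.0]; that output format is an assumption, not something this card specifies. The helper names `write_wav` and `sine_wave` are hypothetical and are not part of the smolurdu repository.

```python
import math
import struct
import wave

SAMPLE_RATE_HZ = 24_000  # sample rate listed under Model Details


def write_wav(path, samples, sample_rate=SAMPLE_RATE_HZ):
    """Write mono float samples as a 16-bit PCM WAV file.

    Assumes samples lie in [-1.0, 1.0]; values outside are clipped.
    Returns the clip duration in seconds.
    """
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # guard against out-of-range values
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)  # must match the model's 24 kHz
        wav_file.writeframes(bytes(frames))
    return len(samples) / sample_rate


def sine_wave(freq_hz, seconds, sample_rate=SAMPLE_RATE_HZ):
    """Generate a test tone as a stand-in for real model output."""
    n = int(seconds * sample_rate)
    return [0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    # Placeholder audio; replace with the samples produced by the model.
    duration = write_wav("test_tone.wav", sine_wave(440.0, 0.5))
    print(f"Wrote {duration:.2f} s of audio at {SAMPLE_RATE_HZ} Hz")
```

In practice the test tone would be replaced by the decoded model output; the key point is that the frame rate written to the file header matches the 24 kHz rate listed above.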

Citation

@misc{smol-urdu-tts-v1,
  title={Smol Urdu TTS V1: Urdu Text-to-Speech},
  author={Adnan Zaidi and Mahwiz Khalil},
  year={2024},
  howpublished={HuggingFace Model Repository}
}
