mlx-community/CosyVoice2-0.5B-8bit

This model was converted to MLX format from FunAudioLLM/CosyVoice2-0.5B using mlx-audio-plus version 0.1.2.

Usage

pip install -U mlx-audio-plus

Inference Modes

| Mode | Parameters | Description |
|---|---|---|
| Cross-lingual | ref_audio | Zero-shot TTS (default) |
| Zero-shot | ref_audio + ref_text | Better quality with transcription |
| Instruct | ref_audio + instruct_text | Style control (e.g., "speak slowly") |
| Voice Conversion | source_audio + ref_audio | Convert audio to target voice |
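
The mapping from parameters to modes can be sketched as a small lookup. This is an illustration only: `select_mode` is a hypothetical helper, not part of mlx-audio-plus, and the precedence it applies when several optional parameters are given at once is an assumption, since the table does not say how the library resolves that case.

```python
from typing import Optional


def select_mode(
    ref_audio: Optional[str] = None,
    ref_text: Optional[str] = None,
    instruct_text: Optional[str] = None,
    source_audio: Optional[str] = None,
) -> str:
    """Return the mode name implied by the supplied parameters.

    Hypothetical helper that mirrors the table above; it does not call
    the library.
    """
    if ref_audio is None:
        raise ValueError("every mode in the table requires ref_audio")
    # Assumed precedence: voice conversion, then instruct, then zero-shot.
    if source_audio is not None:
        return "voice-conversion"
    if instruct_text is not None:
        return "instruct"
    if ref_text is not None:
        return "zero-shot"
    # Only ref_audio was given, which the table lists as the default mode.
    return "cross-lingual"
```

For example, `select_mode(ref_audio="ref.wav", ref_text="Transcription of ref audio.")` returns `"zero-shot"`.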

Command line

# Cross-lingual (default)
mlx_audio.tts --model mlx-community/CosyVoice2-0.5B-8bit --text "Hello!" --ref_audio ref.wav

# Zero-shot (with transcription)
mlx_audio.tts --model mlx-community/CosyVoice2-0.5B-8bit --text "Hello!" --ref_audio ref.wav --ref_text "Transcription of ref audio."

# Instruct (style control)
mlx_audio.tts --model mlx-community/CosyVoice2-0.5B-8bit --text "Hello!" --ref_audio ref.wav --instruct_text "Speak slowly and calmly"

# Voice Conversion
mlx_audio.tts --model mlx-community/CosyVoice2-0.5B-8bit --source_audio source.wav --ref_audio ref.wav
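
When these commands are launched from a script, it can help to assemble the argument list programmatically. The sketch below uses only the flags shown above; `build_command` is a hypothetical helper, and it only builds and prints the command rather than running it.

```python
import shlex
from typing import List, Optional

MODEL = "mlx-community/CosyVoice2-0.5B-8bit"


def build_command(
    ref_audio: str,
    text: Optional[str] = None,
    ref_text: Optional[str] = None,
    instruct_text: Optional[str] = None,
    source_audio: Optional[str] = None,
) -> List[str]:
    """Build an mlx_audio.tts argument list from the flags shown above."""
    cmd = ["mlx_audio.tts", "--model", MODEL]
    if text is not None:
        cmd += ["--text", text]
    if source_audio is not None:
        cmd += ["--source_audio", source_audio]
    cmd += ["--ref_audio", ref_audio]
    if ref_text is not None:
        cmd += ["--ref_text", ref_text]
    if instruct_text is not None:
        cmd += ["--instruct_text", instruct_text]
    return cmd


# Print a shell-quoted version of the zero-shot command.
print(shlex.join(build_command("ref.wav", text="Hello!",
                               ref_text="Transcription of ref audio.")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where mlx-audio-plus is installed.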

Python

from mlx_audio.tts.generate import generate_audio

generate_audio(
    text="Hello, this is CosyVoice2 on MLX!",
    model="mlx-community/CosyVoice2-0.5B-8bit",
    ref_audio="reference.wav",
    file_prefix="output",
)
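
To synthesize several sentences with the same reference voice, the call above can be wrapped in a loop. This is a minimal sketch: `synthesize_all` and the `output_000` style prefixes are hypothetical, and it passes only the keyword arguments used in the example above. The `generate` parameter exists so the loop can be exercised without the library installed.

```python
from typing import Callable, Iterable, List, Optional

MODEL = "mlx-community/CosyVoice2-0.5B-8bit"


def synthesize_all(
    lines: Iterable[str],
    ref_audio: str,
    generate: Optional[Callable[..., object]] = None,
) -> List[str]:
    """Call generate_audio once per line and return the file prefixes used."""
    if generate is None:
        # Imported lazily so the helper can be defined without MLX present.
        from mlx_audio.tts.generate import generate_audio as generate
    prefixes = []
    for index, line in enumerate(lines):
        prefix = f"output_{index:03d}"  # hypothetical naming scheme
        generate(text=line, model=MODEL, ref_audio=ref_audio, file_prefix=prefix)
        prefixes.append(prefix)
    return prefixes
```

Calling `synthesize_all(["First sentence.", "Second sentence."], "reference.wav")` would issue two `generate_audio` calls, one per sentence.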