My VoxCPM2 TTS Model

This model is based on openbmb/VoxCPM2, a multilingual Text-to-Speech model released under the Apache-2.0 license.

This repository contains my copy and/or fine-tuned version of that model, intended for building a custom TTS service or application.

Features

  • Multilingual text-to-speech
  • Voice design from text description
  • Voice cloning from reference audio
  • 48 kHz audio output
  • Suitable for personal and commercial applications, subject to the Apache-2.0 license and responsible-use restrictions

Usage

from voxcpm import VoxCPM
import soundfile as sf

# Load the weights from the Hugging Face Hub, without the optional denoiser.
model = VoxCPM.from_pretrained("ravuthz/VoxCPM2-khmer", load_denoiser=False)

wav = model.generate(
    # Khmer for: "VoxCPM2 is the current recommended release for realistic multilingual speech synthesis."
    text="VoxCPM2 គឺជាការចេញផ្សាយដែលបានណែនាំបច្ចុប្បន្នសម្រាប់ការសំយោគការនិយាយច្រើនភាសាដែលប្រាកដនិយម។",
    cfg_value=2.0,           # guidance strength
    inference_timesteps=10,  # number of inference steps (more is slower)
)

# Save the waveform at the model's own sample rate.
sf.write("output.wav", wav, model.tts_model.sample_rate)
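
If soundfile is not available in the target environment, the waveform can also be saved with Python's standard library. The following is a minimal sketch, not part of the voxcpm package: the helper name write_wav_pcm16 is hypothetical, and it assumes that model.generate returns a one-dimensional sequence of float samples in the range [-1.0, 1.0].

```python
import array
import sys
import wave


def write_wav_pcm16(path, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file.

    Assumption: `samples` is a 1-D iterable of floats, nominally in [-1.0, 1.0].
    Returns the duration of the written audio in seconds.
    """
    pcm = array.array("h")  # signed 16-bit integers
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so scaling cannot overflow
        pcm.append(int(round(s * 32767)))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    with wave.open(path, "wb") as f:
        f.setnchannels(1)           # mono
        f.setsampwidth(2)           # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())
    return len(pcm) / sample_rate
```

With the variables from the usage example above, this would be called as write_wav_pcm16("output.wav", wav, model.tts_model.sample_rate). Note that 16-bit PCM quantizes the signal, whereas sf.write picks a format based on the array's dtype.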