My VoxCPM2 TTS Model

This model is based on openbmb/VoxCPM2, a multilingual Text-to-Speech model released under the Apache-2.0 license.

This repository contains my copy and/or fine-tuned version of that model, intended for building a custom TTS service or application.

Features

Multilingual text-to-speech
Voice design from text description
Voice cloning from reference audio
48 kHz audio output
Suitable for personal and commercial applications, subject to the Apache-2.0 license and responsible-use restrictions

Usage

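from voxcpm import VoxCPM
import soundfile as sf

# Load this repository's checkpoint; the optional denoiser is not loaded.
model = VoxCPM.from_pretrained("ravuthz/VoxCPM2-khmer", load_denoiser=False)

# Khmer input text. English: "VoxCPM2 is the currently recommended release
# for realistic multilingual speech synthesis."
wav = model.generate(
    text="VoxCPM2 គឺជាការចេញផ្សាយដែលបានណែនាំបច្ចុប្បន្នសម្រាប់ការសំយោគការនិយាយច្រើនភាសាដែលប្រាកដនិយម។",
    cfg_value=2.0,  # guidance strength
    inference_timesteps=10,  # sampling steps; fewer is faster
)

# Save the waveform at the sample rate reported by the model.
sf.write("output.wav", wav, model.tts_model.sample_rate)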
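
For a service that handles more than one request, the call above can be wrapped in a small helper. The following is a minimal sketch, not part of the VoxCPM package: `synthesize_batch` is a hypothetical name, it only uses the `generate` arguments and the `tts_model.sample_rate` attribute shown in the usage snippet, and it assumes each returned waveform is a one-dimensional sequence of samples.

```python
from typing import Any, List, Sequence, Tuple


def synthesize_batch(
    model: Any,
    texts: Sequence[str],
    cfg_value: float = 2.0,
    inference_timesteps: int = 10,
) -> List[Tuple[Any, float]]:
    """Generate one waveform per text and pair it with its duration in seconds.

    Hypothetical helper: `model` is expected to behave like the object
    returned by VoxCPM.from_pretrained in the usage snippet.
    """
    # Same sample-rate lookup as in the usage snippet.
    sample_rate = model.tts_model.sample_rate
    results: List[Tuple[Any, float]] = []
    for text in texts:
        wav = model.generate(
            text=text,
            cfg_value=cfg_value,
            inference_timesteps=inference_timesteps,
        )
        # Assumption: `wav` is 1-D, so its length is the number of samples.
        results.append((wav, len(wav) / sample_rate))
    return results
```

Each returned pair can then be written with `sf.write(path, wav, model.tts_model.sample_rate)` as in the snippet above. The duration is simply the sample count divided by the sample rate.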


Format: Safetensors
Model size: 2B params
Tensor type: BF16
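
As a rough sizing aid, the parameter count and tensor type listed here give a lower bound on the memory needed just to hold the weights. The sketch below assumes "2B params" means exactly 2,000,000,000 parameters (the real count is likely rounded) and ignores activations, caches, and framework overhead.

```python
BYTES_PER_BF16 = 2  # bfloat16 stores each value in 16 bits
PARAM_COUNT = 2_000_000_000  # assumption: "2B params" taken as exactly 2e9

weight_bytes = PARAM_COUNT * BYTES_PER_BF16
weight_gib = weight_bytes / 2**30

print(f"{weight_bytes / 1e9:.1f} GB ({weight_gib:.2f} GiB) of weights")
# prints "4.0 GB (3.73 GiB) of weights"
```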

Model tree for ravuthz/VoxCPM2-khmer

Base model: openbmb/VoxCPM2
Finetuned (11): this model