whisper-medium-it-ggml

whisper.cpp GGML quantizations of LocalAI-io/whisper-medium-it for fast CPU/GPU inference.

Author: Ettore Di Giacinto

Brought to you by the LocalAI team. These models can be used directly with LocalAI and any whisper.cpp-based runtime.

Files

| File | Quantization | Description |
|------|--------------|-------------|
| ggml-model-f16.bin | float16 | Full precision (no quantization); highest quality |
| ggml-model-q8_0.bin | int8 | 8-bit quantization; minimal quality loss |
| ggml-model-q5_0.bin | int5 | 5-bit quantization; good quality/size tradeoff |
| ggml-model-q4_0.bin | int4 | 4-bit quantization; smallest size, fastest |
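As a back-of-envelope check, the approximate size of each file can be estimated from the effective bits per weight of the ggml `q*_0` block formats (a sketch; the parameter count comes from the Training section below, and real files differ slightly because some tensors are kept at higher precision):

```python
# Rough size estimate per quantization level.
# Assumes standard ggml block layouts: blocks of 32 weights plus a
# per-block fp16 scale (and, for q5_0, 4 bytes of high bits).
PARAMS = 769_000_000  # whisper-medium parameter count

BITS_PER_WEIGHT = {
    "f16": 16.0,
    "q8_0": 34 * 8 / 32,  # 32 int8 weights + 2-byte scale -> 8.5 bits/weight
    "q5_0": 22 * 8 / 32,  # 16 bytes of 5-bit packs + 4-byte high bits + scale -> 5.5
    "q4_0": 18 * 8 / 32,  # 16 bytes of 4-bit packs + 2-byte scale -> 4.5
}

for name, bits in BITS_PER_WEIGHT.items():
    size_gib = PARAMS * bits / 8 / 1024**3
    print(f"{name}: ~{size_gib:.2f} GiB")
```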

Training

Fine-tuned from openai/whisper-medium (769M parameters) on the Italian split of Common Voice 25.0.

See LocalAI-io/whisper-medium-it for the full safetensors model and detailed WER results.

Usage

whisper.cpp

```bash
# Download a quant
huggingface-cli download LocalAI-io/whisper-medium-it-ggml ggml-model-q5_0.bin --local-dir .

# Run
./whisper-cli -m ggml-model-q5_0.bin -f audio.wav -l it
```
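If you only have the f16 file, the smaller quants can also be produced locally with whisper.cpp's `quantize` tool (a sketch; the binary's path varies by build, shown here for a CMake build):

```shell
# Quantize the full-precision model down to q5_0
# (quantize ships with whisper.cpp; adjust the path for your build)
./build/bin/quantize ggml-model-f16.bin ggml-model-q5_0.bin q5_0
```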

whisper.cpp Python bindings (pywhispercpp)

```python
from pywhispercpp.model import Model

model = Model("ggml-model-q5_0.bin", language="it")
segments = model.transcribe("audio.wav")
for seg in segments:
    print(seg.text)
```

LocalAI

```yaml
# In your LocalAI model config
name: whisper-medium-it
backend: whisper
parameters:
  model: ggml-model-q5_0.bin
```
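With that config loaded, LocalAI serves the model through its OpenAI-compatible transcription endpoint. A request might look like the following (a sketch; host and port assume LocalAI's defaults, and the model name comes from the config above):

```shell
# Transcribe audio.wav via LocalAI's OpenAI-compatible API
curl http://localhost:8080/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F file="@audio.wav" \
  -F model="whisper-medium-it"
```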
