Whisper Web UI Ultimate Published With So Many Features - 1-Click Install on Windows, RunPod, Massed Compute, Linux - CUDA 13, Torch 2.9.1 with pre-compiled wheels
Whisper-WebUI Premium : https://www.patreon.com/posts/145395299
8 April 2026 - Version 5.0
This is a massive update with so many new features
Please download the latest zip file and make a fresh install > https://www.patreon.com/posts/145395299
1-Click to install on Windows, RunPod, SimplePod, Massed Compute, Linux
New preset save and load system with extremely well-tested best_quality and fast pre-made presets
Presets are loaded automatically when you switch between them, and the last used preset is remembered when you restart the app
Word Timestamps is enabled by default to improve quality, and the regular version is also generated automatically
Download transcription button
Open outputs folder button (all transcriptions automatically saved)
Load video / audio file directly from path (useful for platforms like RunPod where Gradio upload is slow)
The fast preset uses a new, custom in-house batch size 32 feature and is much faster than other existing Whisper apps and repos
Full support for uploading all kinds of video and audio formats, with full preview
Batch folder processing: automatically processes all files in a given folder
Live transcription window that shows the latest transcription while processing
At batch size 1 with best quality: 11x real-time transcription speed (depends on GPU)
At batch size 32 with the fast preset: 15x to 30x real-time transcription speed (depends on GPU)
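To put the speed figures above in perspective, here is a rough sketch of what an "Nx real time" factor means in wall-clock time. The function name is hypothetical and only does the arithmetic; the factors are the ones quoted above, and real speed depends on your GPU.

```python
def estimated_transcription_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Wall-clock seconds needed to transcribe audio at a given real-time factor."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


one_hour = 60 * 60  # one hour of audio, in seconds

# Factors taken from the figures quoted in this post.
for label, factor in [
    ("best quality, batch size 1 (11x)", 11),
    ("fast preset, batch size 32 (15x)", 15),
    ("fast preset, batch size 32 (30x)", 30),
]:
    minutes = estimated_transcription_seconds(one_hour, factor) / 60
    print(f"{label}: about {minutes:.1f} minutes per hour of audio")
```

So at 30x, one hour of audio takes about 2 minutes, and at 11x about 5.5 minutes.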
New feature: Repeat Initial Prompt Every Window
Supports all Whisper models such as Large V1, Large V3, Turbo, Distill Large, Tiny, etc.
Supports the following output formats, and you can check all of them to generate them all at the same time: SRT, WebVTT, TXT, LRC, JSON, TSV
- All outputs will have the same name as your input file name
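For readers who have not worked with subtitle files, here is a minimal sketch of how timed segments map to SRT text and how an output name can mirror the input name. This is not the app's own code: the helper names and the (start, end, text) segment tuples are assumptions made for illustration. SRT writes timestamps as HH:MM:SS,mmm, while WebVTT uses a dot before the milliseconds.

```python
from pathlib import Path


def format_timestamp(seconds: float, decimal_marker: str = ",") -> str:
    """Format seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (WebVTT)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)


# Hypothetical input file; the output keeps the same name with a new extension.
input_file = Path("interview.mp4")
srt_name = input_file.with_suffix(".srt").name

print(srt_name)
print(segments_to_srt([(0.0, 1.5, "Hello there."), (1.5, 3.25, "Welcome back.")]))
```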
Thanks to the subprocess-based working system, you can cancel any processing immediately with zero RAM or VRAM leaks
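The general idea behind subprocess-based cancellation can be sketched as follows. This is an illustration of the technique only, not the app's actual implementation: the worker here is a stand-in that just sleeps, and the function names are hypothetical. When a separate process is stopped, the operating system reclaims everything it held, which is why no memory is left behind.

```python
import subprocess
import sys
import time

# Stand-in for a long transcription job; a real worker would load a model here.
WORKER_CODE = "import time; time.sleep(60)"


def start_worker() -> subprocess.Popen:
    """Launch the job in its own Python process."""
    return subprocess.Popen([sys.executable, "-c", WORKER_CODE])


def cancel_worker(worker: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Stop the worker and return its exit code."""
    worker.terminate()  # SIGTERM on POSIX, TerminateProcess on Windows
    try:
        return worker.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        worker.kill()  # force stop if the worker ignored the first request
        return worker.wait()


worker = start_worker()
time.sleep(0.5)  # pretend the user clicks Cancel shortly after starting
exit_code = cancel_worker(worker)
print("worker stopped, exit code:", exit_code)
```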
Fully supports Windows and Linux (on Linux, use the Massed Compute installer)
Based on a Python 3.11 venv, CUDA 13, and Torch 2.9.1, with pre-compiled libraries such as Flash Attention
If you don't like the output, try enabling or disabling Condition On Previous Text, as it makes a big difference
- The app supports 100 languages and 32 models
Lots of advanced parameters, all set for best quality
Built-in Background Music Remover Filter
Built-in Voice Detection Filter
Fully detailed CMD output to watch the entire progress
Extremely optimized VRAM usage, works on GPUs with as little as 6 GB of VRAM
- Some other utility features like YouTube, recording from a mic, T2T Translation, BGM Separation