Whisper Web UI Ultimate Published With Many New Features - 1-Click Install on Windows, RunPod, Massed Compute, Linux - CUDA 13, Torch 2.9.1 with pre-compiled wheels

#230

by MonsterMMORPG - opened 9 days ago


This is a massive update with many new features
- Please get the latest zip file and do a fresh install > https://www.patreon.com/posts/145395299
- 1-click install on Windows, RunPod, SimplePod, Massed Compute, Linux
New preset save and load system, with thoroughly tested pre-made best_quality and fast presets
- Presets load automatically when you switch between them, and the last used preset is remembered when you restart the app
- Word Timestamps is enabled by default to improve quality, and the regular version is generated automatically as well
Download transcription button
Open outputs folder button (all transcriptions automatically saved)
Load video / audio file directly from path (useful for platforms like RunPod where Gradio upload is slow)
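
To give an idea of how a preset system with a remembered last-used preset can work, here is a minimal sketch. It is not the app's actual code: the folder layout, the `_state.json` file, the function names, and the setting keys are all assumptions made up for illustration.

```python
import json
from pathlib import Path

# Hypothetical layout: one JSON file per preset, plus a small state file
# that records which preset was used last. None of this is the app's real format.
DEFAULT_DIR = Path("presets")
STATE_NAME = "_state.json"


def save_preset(name: str, settings: dict, preset_dir: Path = DEFAULT_DIR) -> Path:
    """Write a preset to disk and mark it as the last used one."""
    preset_dir.mkdir(parents=True, exist_ok=True)
    path = preset_dir / f"{name}.json"
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    (preset_dir / STATE_NAME).write_text(json.dumps({"last_used": name}), encoding="utf-8")
    return path


def load_preset(name: str, preset_dir: Path = DEFAULT_DIR) -> dict:
    """Load a preset by name and remember it for the next start."""
    settings = json.loads((preset_dir / f"{name}.json").read_text(encoding="utf-8"))
    (preset_dir / STATE_NAME).write_text(json.dumps({"last_used": name}), encoding="utf-8")
    return settings


def load_last_used(preset_dir: Path = DEFAULT_DIR, default: str = "best_quality") -> dict:
    """On startup, restore whichever preset was active last time."""
    state_file = preset_dir / STATE_NAME
    if state_file.exists():
        name = json.loads(state_file.read_text(encoding="utf-8"))["last_used"]
    else:
        name = default
    return load_preset(name, preset_dir)
```

With this layout, calling `load_preset("fast")` once is enough for `load_last_used()` to return the fast preset after a restart.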

The fast preset uses a new, custom in-house batch size 32 implementation and is blazing fast compared to other existing Whisper apps and repos
Full support for uploading all kinds of video and audio formats, with full preview
Batch folder processing: all files in a given folder are processed automatically
Live transcription window that shows the latest transcription while processing
At batch size 1 with best quality: 11x real-time transcription speed (depends on GPU)
At batch size 32 with the fast preset: 15x to 30x real-time transcription speed (depends on GPU)
New feature: Repeat Initial Prompt Every Window
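
To put the real-time speed numbers above into perspective, the small sketch below converts a speed factor into wall-clock time. The function name is made up for illustration, and the factors are simply the ones quoted in this post; actual speed depends on your GPU.

```python
def transcription_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Wall-clock seconds needed when transcription runs at `realtime_factor` x real time."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


if __name__ == "__main__":
    one_hour = 60 * 60  # seconds of audio
    cases = [
        ("best quality, batch size 1", 11),
        ("fast preset, batch size 32 (low end)", 15),
        ("fast preset, batch size 32 (high end)", 30),
    ]
    for label, factor in cases:
        minutes = transcription_seconds(one_hour, factor) / 60
        print(f"{label}: about {minutes:.1f} min per hour of audio")
```

So one hour of audio takes roughly 5.5 minutes at 11x, 4 minutes at 15x, and 2 minutes at 30x.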

Supports all Whisper models, such as Large V1, Large V3, Turbo, Distill Large, Tiny, etc.
Supports the following output formats, and you can check all of them to generate them all at the same time: SRT, WebVTT, TXT, LRC, JSON, TSV
- All outputs have the same name as your input file
Processing runs in a subprocess, so you can cancel any job immediately with zero RAM or VRAM leaks
Fully supports Windows and Linux (on Linux, use the Massed Compute installer)
Based on a Python 3.11 venv, CUDA 13, and Torch 2.9.1, with pre-compiled libraries such as Flash Attention
If you don't like the output, try enabling or disabling Condition On Previous Text; it makes a big difference
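
For those curious why running the work in a subprocess makes cancelling clean, here is a minimal sketch of the general technique. It is not the app's actual code: the worker is just a stand-in that sleeps, and the function names are made up for illustration. The idea is that when the child process exits, the operating system reclaims its memory, and GPU memory held by that process is generally released along with it.

```python
import subprocess
import sys

# Stand-in for a long transcription job. In a real app the child process
# would load the model and hold the GPU memory (assumption for illustration).
WORKER_CODE = "import time; time.sleep(600)"


def start_worker() -> subprocess.Popen:
    """Launch the long-running job in its own Python process."""
    return subprocess.Popen([sys.executable, "-c", WORKER_CODE])


def cancel_worker(proc: subprocess.Popen, timeout: float = 5.0) -> int:
    """Stop the worker and return its exit code."""
    proc.terminate()  # stop request (SIGTERM on POSIX, TerminateProcess on Windows)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # force stop if it did not exit in time
        return proc.wait()


if __name__ == "__main__":
    worker = start_worker()
    exit_code = cancel_worker(worker)
    print("worker stopped, exit code:", exit_code)
```

Because the main app never loads the model itself in this design, killing the worker leaves nothing behind to clean up in the parent process.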

Some other utility features, such as YouTube, recording from a mic, T2T Translation, and BGM Separation
