Whisper Web UI Ultimate Published With Many New Features - 1-Click Install for Windows, RunPod, Massed Compute, Linux - CUDA 13, Torch 2.9.1 With Pre-Compiled Wheels

#230
by MonsterMMORPG - opened

Whisper-WebUI Premium : https://www.patreon.com/posts/145395299

8 April 2026 - Version 5.0

  • This is a massive update with many new features

  • New preset save and load system with two thoroughly tested pre-made presets: best_quality and fast

    • Presets load automatically when you switch between them, and the last used preset is remembered when you restart the app

    • Word Timestamps is enabled by default to improve quality, and the regular version of the output is also generated automatically

  • Download transcription button

  • Open outputs folder button (all transcriptions automatically saved)

  • Load a video / audio file directly from a path (useful for platforms like RunPod where Gradio upload is slow)

  • The fast preset uses a new custom batch size 32 feature implemented in house, and it is blazing fast compared to other existing Whisper apps and repos

  • Full support for uploading all kinds of video and audio formats, with full preview

  • Batch folder processing: automatically processes all files in a given folder

  • Live transcription window that shows the latest transcription while processing

  • At batch size 1 with best quality: 11x real-time transcription speed (depends on GPU)

  • At batch size 32 with the fast preset: 15x to 30x real-time transcription speed (depends on GPU)

  • New feature: Repeat Initial Prompt Every Window

  • Supports all Whisper models, such as Large V1, Large V3, Turbo, Distill Large, Tiny, etc.

  • Supports the following output formats, and you can check all of them to generate them all at the same time: SRT, WebVTT, TXT, LRC, JSON, TSV

    • All outputs will have the same name as your input file name
  • Processing runs in a subprocess, so you can cancel any job immediately with no RAM or VRAM leak

  • Fully supports Windows and Linux (use the Massed Compute installer)

  • Based on a Python 3.11 venv, CUDA 13, and Torch 2.9.1, with pre-compiled libraries such as Flash Attention

  • If you don't like the output, try enabling / disabling Condition On Previous Text; it makes a big difference

  • The app supports 100 languages and 32 models
  • Lots of Advanced Parameters, all set for best quality

  • Built-in Background Music Remover Filter

  • Built-in Voice Detection Filter

  • Fully detailed CMD output so you can follow the entire progress

  • Extremely optimized VRAM usage; works on GPUs with as little as 6 GB of VRAM

  • Some other utility features, such as YouTube, recording from a mic, T2T Translation, and BGM Separation
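
For anyone post-processing the outputs: the SRT export and the "same name as your input file" rule from the list above can be illustrated with a small Python sketch. This is not the app's code. `srt_timestamp`, `to_srt`, and `output_path` are hypothetical helper names, and the `(start, end, text)` segment tuples are an assumed shape used only for illustration.

```python
from pathlib import Path


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start, end, text) tuples as SRT cues numbered from 1.

    The tuple shape is an assumption for this sketch, not the app's format.
    """
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)


def output_path(input_file: str, outputs_dir: str, extension: str) -> Path:
    """Keep the input file's base name and swap in the output extension."""
    return Path(outputs_dir) / f"{Path(input_file).stem}.{extension}"
```

With this sketch, an input such as `talk.mp4` would map to `talk.srt`, `talk.json`, and so on inside the chosen outputs folder.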
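
On the cancel-without-leak point in the list above: the usual reason a subprocess design avoids leaks is that the operating system and GPU driver reclaim a process's RAM and VRAM when that process exits. Below is a minimal sketch of the pattern, assuming a stand-in worker that only sleeps. `start_worker` and `cancel` are hypothetical names, and this is not the app's actual implementation.

```python
import subprocess
import sys


def start_worker(seconds: int) -> subprocess.Popen:
    """Launch a stand-in worker as a separate Python process.

    A real worker would load the model and transcribe; this one only sleeps.
    """
    return subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({seconds})"]
    )


def cancel(worker: subprocess.Popen, timeout: float = 5.0) -> int:
    """Stop the worker and return its exit code.

    Asks the process to terminate first, then force-kills it if it has
    not exited within `timeout` seconds.
    """
    worker.terminate()
    try:
        return worker.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        worker.kill()
        return worker.wait()
```

Because the model lives only in the worker process, the UI process never holds the model's memory itself, so cancelling does not require any manual cleanup in the UI.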

Full Page Screenshot

