Bengali Speaker Diarization - Segmentation Model V2

Fine-tuned pyannote/segmentation-3.0 for Bengali speaker diarization (DLSPRINT26 hackathon).

Training Details

  • Base Model: pyannote/segmentation-3.0
  • Best Validation Accuracy: 78.04%
  • Best Epoch: 36/50
  • Training Samples: 510
  • Validation Samples: 90

Improvements in V2

  • Dropout: 30% on output layer
  • Weight Decay: 0.01 (L2 regularization)
  • Gradient Clipping: 1.0
  • Label Smoothing: 0.1
  • Learning Rate: OneCycleLR (max=0.0003, 10% warmup)
  • Batch Size: 64
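
To make the schedule and label smoothing settings above concrete, here is a minimal standard-library sketch of a two-phase cosine one-cycle schedule and of uniform label smoothing. It is an illustration, not the training code used for this model. The values `div_factor=25` and `final_div_factor=1e4` are the PyTorch `OneCycleLR` defaults and are assumed here, since the card does not state them. The step count of 400 is also an assumption: 8 batches per epoch (510 samples at batch size 64, keeping the last partial batch) times 50 epochs, with one scheduler step per batch.

```python
import math


def one_cycle_lr(step, total_steps, max_lr=3e-4, pct_start=0.1,
                 div_factor=25.0, final_div_factor=1e4):
    """Learning rate at a 0-based step of a two-phase cosine one-cycle schedule.

    div_factor and final_div_factor are assumed (PyTorch defaults);
    the model card only gives max_lr and the warmup fraction.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_end = pct_start * total_steps - 1  # step at which max_lr is reached
    last_step = total_steps - 1
    if warmup_end <= 0:
        raise ValueError("pct_start * total_steps must be greater than 1")

    def cosine(start, end, pct):
        # Cosine interpolation from start (pct=0) to end (pct=1)
        return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)

    if step <= warmup_end:
        return cosine(initial_lr, max_lr, step / warmup_end)
    return cosine(max_lr, min_lr, (step - warmup_end) / (last_step - warmup_end))


def smooth_labels(one_hot, eps=0.1):
    """Mix a one-hot target with a uniform distribution over all classes."""
    k = len(one_hot)
    return [(1.0 - eps) * y + eps / k for y in one_hot]


# Assumed step count: ceil(510 / 64) = 8 batches per epoch, 50 epochs
total_steps = math.ceil(510 / 64) * 50
schedule = [one_cycle_lr(s, total_steps) for s in range(total_steps)]
```

Under these assumptions the rate starts at 0.0003 / 25 = 0.000012, reaches the 0.0003 peak at step 39 (the end of the 10% warmup), and then decays along a cosine curve for the remaining steps.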

Augmentation

  • Training: Diverse on-the-fly augmentation (noise, reverb, volume, time masking)
  • Validation: Fixed one-time augmentation for consistent evaluation
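
The difference between the two augmentation modes can be sketched as follows. This is a simplified illustration on plain Python lists, not the pipeline used for this model: the function names, parameter ranges, and `base_seed` are hypothetical, reverb is omitted for brevity, and a per-sample seed is only one way to obtain a fixed validation augmentation (precomputing and storing the augmented clips once would be another).

```python
import random


def augment(samples, rng):
    """Apply a random gain, additive noise, and one time mask to a clip."""
    gain = rng.uniform(0.5, 1.5)              # volume change (assumed range)
    noise_std = rng.uniform(0.0, 0.01)        # noise level (assumed range)
    out = [gain * s + rng.gauss(0.0, noise_std) for s in samples]
    mask_len = rng.randint(0, len(out) // 10)  # mask at most 10% of the clip
    start = rng.randint(0, len(out) - mask_len)
    out[start:start + mask_len] = [0.0] * mask_len  # time masking
    return out


# Training: a shared, unseeded generator, so each call draws new parameters
train_rng = random.Random()


def augment_train(samples):
    return augment(samples, train_rng)


def augment_val(samples, index, base_seed=1234):
    # Validation: seed from the sample index so the same clip always
    # receives the same augmentation, keeping scores comparable across epochs
    return augment(samples, random.Random(base_seed + index))
```

With this setup, calling `augment_val` twice on the same clip and index returns identical output, while `augment_train` generally returns a different variant on every call.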

Usage

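import torch
from pyannote.audio import Model

# Load the base architecture. pyannote/segmentation-3.0 is a gated model,
# so you may need to accept its terms and authenticate with Hugging Face first.
model = Model.from_pretrained("pyannote/segmentation-3.0")

# Download the fine-tuned weights and load them into the base model
weights = torch.hub.load_state_dict_from_url(
    "https://huggingface.co/smam/pyannote-segmentation-bengali-multilingual-v2/resolve/main/pytorch_model.bin",
    map_location="cpu",
)
model.load_state_dict(weights)
model.eval()  # switch to inference mode (disables dropout)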