Bengali Speaker Diarization - Segmentation Model V2
Fine-tuned pyannote/segmentation-3.0 for Bengali speaker diarization (DLSPRINT26 hackathon).
Training Details
- Base Model: pyannote/segmentation-3.0
- Best Validation Accuracy: 78.04%
- Best Epoch: 36/50
- Training Samples: 510
- Validation Samples: 90
Improvements in V2
- Dropout: 30% on output layer
- Weight Decay: 0.01 (L2 regularization)
- Gradient Clipping: 1.0
- Label Smoothing: 0.1
- Learning Rate Schedule: OneCycleLR (max LR = 0.0003, 10% warmup)
- Batch Size: 64
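The learning-rate schedule and label smoothing listed above can be illustrated in plain Python. This is a minimal sketch, not the training code used for this model. It assumes the defaults of PyTorch's OneCycleLR (cosine annealing, div_factor=25, final_div_factor=1e4), 400 total steps (8 batches per epoch from 510 samples at batch size 64, times 50 epochs, assuming the last partial batch is kept), and 7 output classes as in the powerset output of pyannote/segmentation-3.0. All function names are hypothetical.

```python
import math


def cosine_anneal(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)


def one_cycle_lr(step, total_steps, max_lr=3e-4, pct_start=0.1,
                 div_factor=25.0, final_div_factor=1e4):
    """Learning rate at a given step of a one-cycle schedule.

    div_factor and final_div_factor are assumed values (PyTorch defaults);
    the model card only states max LR and the 10% warmup.
    """
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_end = pct_start * total_steps - 1  # last warmup step
    last_step = total_steps - 1
    if step <= warmup_end:
        # Warmup: rise from initial_lr to max_lr
        return cosine_anneal(initial_lr, max_lr, step / warmup_end)
    # Decay: fall from max_lr to min_lr
    pct = (step - warmup_end) / (last_step - warmup_end)
    return cosine_anneal(max_lr, min_lr, pct)


def smooth_targets(true_class, num_classes=7, eps=0.1):
    """Label-smoothed target distribution: (1 - eps) * one_hot + eps / K."""
    return [
        eps / num_classes + ((1.0 - eps) if i == true_class else 0.0)
        for i in range(num_classes)
    ]


# Assumed step count: ceil(510 / 64) batches per epoch over 50 epochs
steps_per_epoch = math.ceil(510 / 64)
total_steps = steps_per_epoch * 50

lr_start = one_cycle_lr(0, total_steps)
lr_peak = one_cycle_lr(39, total_steps)
lr_end = one_cycle_lr(total_steps - 1, total_steps)
targets = smooth_targets(true_class=2)
```

Under these assumptions the rate peaks at step 39, the end of the 10% warmup, and then decays for the remaining 90% of training. With smoothing of 0.1 over 7 classes, the true class receives a target of about 0.914 and every other class about 0.014.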
Augmentation
- Training: Diverse on-the-fly augmentation (noise, reverb, volume, time masking)
- Validation: Fixed one-time augmentation for consistent evaluation
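The difference between the two augmentation modes can be sketched as follows. This is an illustrative example only, not the pipeline used for this model: the function names, parameter ranges, and the per-sample seeding scheme are assumptions, and reverb is omitted for brevity. Training draws new random parameters on every call, while validation derives its random generator from the sample index so each validation clip always receives the same augmentation.

```python
import random


def augment(samples, rng, max_gain_db=6.0, noise_std=0.005, max_mask_frac=0.1):
    """Apply random volume change, additive noise, and time masking.

    samples: list of float audio samples; rng: a random.Random instance.
    All ranges are hypothetical defaults chosen for illustration.
    """
    # Random volume change, expressed in decibels
    gain = 10 ** (rng.uniform(-max_gain_db, max_gain_db) / 20)
    # Scale and add Gaussian noise
    out = [s * gain + rng.gauss(0.0, noise_std) for s in samples]
    # Time masking: zero out one random contiguous span
    mask_len = int(len(out) * rng.uniform(0.0, max_mask_frac))
    if mask_len > 0:
        start = rng.randrange(0, len(out) - mask_len + 1)
        out[start:start + mask_len] = [0.0] * mask_len
    return out


def augment_train(samples, rng):
    """On-the-fly: a shared generator gives a new augmentation on each call."""
    return augment(samples, rng)


def augment_val(samples, index, base_seed=0):
    """Fixed: seeding by sample index reproduces the same augmentation."""
    return augment(samples, random.Random(base_seed + index))


clip = [0.1] * 100
train_rng = random.Random(123)
train_a = augment_train(clip, train_rng)
train_b = augment_train(clip, train_rng)
val_a = augment_val(clip, index=3)
val_b = augment_val(clip, index=3)
```

Keeping the validation augmentation fixed means that changes in validation accuracy between epochs reflect the model rather than a different random draw of noise.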
Usage
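from pyannote.audio import Model
import torch

# Load the base model (architecture and pretrained weights).
# Note: pyannote/segmentation-3.0 is gated on Hugging Face, so an access token may be required.
model = Model.from_pretrained("pyannote/segmentation-3.0")

# Download the fine-tuned weights and load them into the base model
weights = torch.hub.load_state_dict_from_url(
    "https://huggingface.co/smam/pyannote-segmentation-bengali-multilingual-v2/resolve/main/pytorch_model.bin",
    map_location="cpu",
)
model.load_state_dict(weights)

# Switch to inference mode (disables dropout)
model.eval()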