Speaker Segmentation Fine-tuned for Bengali

This model is a fine-tuned version of pyannote/segmentation-3.0 for Bengali speaker diarization.

Model Description

The model was fine-tuned on Bengali conversational data to improve speaker segmentation on Bengali speech.

⚠️ Setup Required

This repository does not yet contain a config.yaml file, and you need one to use this model.

Copy config.yaml from the pyannote/segmentation-3.0 repository and upload it to this repository.
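
The copy can also be scripted with the huggingface_hub client. The following is a minimal sketch rather than the author's procedure: the helper name copy_config is hypothetical, and it assumes you have write access to the target repository, a valid access token, and permission to download from pyannote/segmentation-3.0 (which may require accepting its access conditions first).

```python
from typing import Optional


def copy_config(
    source_repo: str = "pyannote/segmentation-3.0",
    target_repo: str = "lucius-40/speaker-segmentation-fine-tuned-bengali",
    filename: str = "config.yaml",
    token: Optional[str] = None,
) -> str:
    """Download `filename` from `source_repo` and upload it to `target_repo`.

    Hypothetical helper for illustration; returns the local path of the
    downloaded file.
    """
    # Imported lazily so the helper can be defined without the package installed.
    from huggingface_hub import hf_hub_download, upload_file

    # Fetch the base model's config into the local Hugging Face cache.
    local_path = hf_hub_download(repo_id=source_repo, filename=filename, token=token)

    # Push the same file, unchanged, to the root of the fine-tuned repository.
    upload_file(
        path_or_fileobj=local_path,
        path_in_repo=filename,
        repo_id=target_repo,
        token=token,
    )
    return local_path
```

Calling copy_config(token="...") once with your own token should leave a config.yaml at the root of this repository.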

Usage

After adding config.yaml, load the model:

from pyannote.audio import Model

model = Model.from_pretrained("lucius-40/speaker-segmentation-fine-tuned-bengali")

Or use in a diarization pipeline by updating the pipeline's config.yaml:

pipeline:
  params:
    segmentation: lucius-40/speaker-segmentation-fine-tuned-bengali
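
The same override can be applied programmatically. The sketch below assumes the pipeline's config.yaml has already been parsed into a nested dict (for example with a YAML library) and that the segmentation entry sits under pipeline, then params, as in the fragment above. The helper name with_segmentation and the sample input are illustrative assumptions, not part of pyannote.audio.

```python
import copy

REPO_ID = "lucius-40/speaker-segmentation-fine-tuned-bengali"


def with_segmentation(config: dict, repo_id: str = REPO_ID) -> dict:
    """Return a copy of a parsed pipeline config that points at `repo_id`."""
    patched = copy.deepcopy(config)  # leave the caller's dict untouched
    # Create the nested keys if absent, then override only the segmentation entry.
    params = patched.setdefault("pipeline", {}).setdefault("params", {})
    params["segmentation"] = repo_id
    return patched


# Illustrative input: a made-up, minimal stand-in for a parsed config.yaml.
original = {"pipeline": {"params": {"segmentation": "pyannote/segmentation-3.0"}}}
patched = with_segmentation(original)
print(patched["pipeline"]["params"]["segmentation"])
```

Only the segmentation key is changed; any other keys in the parsed config are carried over as they were.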

Training Details

  • Base Model: pyannote/segmentation-3.0
  • Language: Bengali (bn)
  • Training Epochs: 3
  • Batch Size: 2 (effective: 16 with gradient accumulation)
  • Learning Rate: 1e-3
  • Dataset: Custom Bengali speaker diarization dataset
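
The batch size figures above imply 16 / 2 = 8 gradient accumulation steps. That value is inferred from the two listed numbers and is not stated by the author. The sketch below shows the arithmetic and which micro-batches would trigger an optimizer update under a standard accumulate-then-step schedule; both function names are hypothetical.

```python
from typing import List


def accumulation_steps(effective_batch_size: int, micro_batch_size: int) -> int:
    """Number of micro-batches accumulated before each optimizer update."""
    if effective_batch_size % micro_batch_size != 0:
        raise ValueError("effective batch size must be a multiple of the micro-batch size")
    return effective_batch_size // micro_batch_size


def optimizer_step_indices(num_micro_batches: int, steps: int) -> List[int]:
    """1-based indices of the micro-batches after which the optimizer steps."""
    return [i for i in range(1, num_micro_batches + 1) if i % steps == 0]


steps = accumulation_steps(effective_batch_size=16, micro_batch_size=2)
print(steps)                              # 8
print(optimizer_step_indices(24, steps))  # [8, 16, 24]
```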

Files in this Repository

  • pytorch_model.bin: Fine-tuned model weights
  • config.yaml: ⚠️ Not yet included; copy it from pyannote/segmentation-3.0

Citation

@misc{speaker-segmentation-bengali-2026,
  author = {Lucius},
  title = {Speaker Segmentation Fine-tuned for Bengali},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/lucius-40/speaker-segmentation-fine-tuned-bengali}}
}