Pyannote provides pretrained models and pipelines for diarization, voice activity detection, and speaker embeddings. It is widely combined with Whisper to turn raw audio into speaker-labelled transcripts.
Definition
An open-source toolkit for speaker diarization and related audio tasks, and a frequent production choice for determining who spoke when.
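As a rough illustration of how diarization output and a transcript might be combined into a speaker-labelled transcript, the sketch below assigns each transcribed segment to the speaker whose turn overlaps it most. It works on plain tuples rather than on library objects: the names `label_segments`, `overlap`, `Turn`, and `Segment` are hypothetical, and the sample timings and text are made up for the example. In practice the turns would come from a diarization pipeline and the segments from a speech recognizer such as Whisper. Maximum overlap is only one simple alignment strategy, assumed here for clarity.

```python
from typing import List, Tuple

# Hypothetical plain-data shapes, assumed for this sketch:
Turn = Tuple[float, float, str]      # (start_s, end_s, speaker_label)
Segment = Tuple[float, float, str]   # (start_s, end_s, text)


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(
    turns: List[Turn],
    segments: List[Segment],
    unknown: str = "UNKNOWN",
) -> List[Tuple[str, str]]:
    """Attach to each transcript segment the speaker with the largest overlap."""
    labelled = []
    for seg_start, seg_end, text in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn_start, turn_end, speaker in turns:
            shared = overlap(seg_start, seg_end, turn_start, turn_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        # Segments that overlap no turn keep the `unknown` label.
        labelled.append((best_speaker, text.strip()))
    return labelled


# Made-up sample data standing in for diarization and transcription output.
turns = [(0.0, 4.0, "SPEAKER_00"), (4.0, 9.0, "SPEAKER_01")]
segments = [
    (0.5, 3.5, " Hello there."),
    (3.8, 8.0, " Hi, how are you?"),
    (20.0, 21.0, " Bye."),
]

result = label_segments(turns, segments)
for speaker, text in result:
    print(f"{speaker}: {text}")
```

A segment that straddles a speaker change is assigned wholly to the dominant speaker here; finer-grained alignment (for example at word level) would need word timestamps from the recognizer.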
Also known as
pyannote.audio