Pyannote provides pretrained models and pipelines for diarization, voice activity detection, and speaker embeddings. It is widely combined with Whisper to turn raw audio into speaker-labelled transcripts.