Definition
Working out who spoke when in an audio stream — labelling segments 'Speaker 1', 'Speaker 2' — without necessarily knowing their names.
Diarization splits audio by speaker so a transcript reads as a conversation rather than a wall of text. It is essential for meeting notes and interviews, and it is hard when voices overlap or sound alike. pyannote.audio is a commonly used open-source toolkit for this task.
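As a rough illustration of how speaker turns make a transcript read as a conversation, the sketch below assigns each timed word to the speaker turn it overlaps most, then groups consecutive words by speaker. The `Turn` and `Word` types, the function names, and the sample timings are all hypothetical and made up for this example. They are not the API of any particular diarization engine, which would supply the turns from real audio.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """A diarization segment: who spoke, and when (seconds)."""
    start: float
    end: float
    speaker: str


@dataclass
class Word:
    """A transcribed word with its time span (seconds)."""
    start: float
    end: float
    text: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time spans."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_words(words, turns):
    """Pair each word with the speaker whose turn overlaps it most."""
    labelled = []
    for w in words:
        best = max(
            turns,
            key=lambda t: overlap(w.start, w.end, t.start, t.end),
            default=None,
        )
        if best is None or overlap(w.start, w.end, best.start, best.end) == 0:
            labelled.append(("Unknown", w.text))  # no turn covers this word
        else:
            labelled.append((best.speaker, w.text))
    return labelled


def to_conversation(words, turns):
    """Group consecutive same-speaker words into 'Speaker: text' lines."""
    lines = []
    for speaker, text in label_words(words, turns):
        if lines and lines[-1][0] == speaker:
            lines[-1][1].append(text)
        else:
            lines.append((speaker, [text]))
    return [f"{speaker}: {' '.join(parts)}" for speaker, parts in lines]


# Hypothetical sample data: two turns and four timed words.
turns = [Turn(0.0, 2.0, "Speaker 1"), Turn(2.0, 4.0, "Speaker 2")]
words = [
    Word(0.1, 0.5, "Hello"),
    Word(0.6, 1.0, "there"),
    Word(2.1, 2.5, "Hi"),
    Word(2.6, 3.0, "back"),
]

for line in to_conversation(words, turns):
    print(line)
# Speaker 1: Hello there
# Speaker 2: Hi back
```

Picking the turn with the largest overlap is only one simple policy. It does not model overlapping speech, where a word may belong to two speakers at once, which is one reason that case is hard.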
Also known as
diarization, speaker labelling