Manually segmenting and labeling audio data is time-consuming. For speaker diarization, depending on the required level of precision, annotating a recording may take more than 10 times its duration.
`pyannote.audio` comes with a bunch of 💥 Prodigy recipes designed to speed things up a bit.
| Recipe | Usage |
|---|---|
| 🦻 `pyannote.audio` | Annotate with a pretrained pipeline in the loop |
| 🧐 `pyannote.review` | Merge multiple annotations |
| 🤲 `pyannote.diff` | Show differences between two annotations |
| 🗄 `pyannote.database` | Dump annotations as pyannote.database protocols |
```
prodigy pyannote.audio dataset /path/to/audio/directory pyannote/speaker-segmentation
```
The `pyannote.audio` recipe streams `.wav` files in chunks and applies a pretrained pipeline to each chunk. You can then adjust the regions manually if needed.
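Under the hood, this amounts to applying a pretrained pipeline to each chunk. Here is a minimal sketch (not the recipe's actual code) of what that looks like with pyannote.audio, reusing the `pyannote/speaker-segmentation` pipeline from the example above and a placeholder `chunk.wav` file:

```python
# A minimal sketch: load a pretrained pipeline and apply it to one audio
# chunk, then list the resulting regions.
from pyannote.audio import Pipeline

# pipeline name taken from the example above; "chunk.wav" is a placeholder path
pipeline = Pipeline.from_pretrained("pyannote/speaker-segmentation")
annotation = pipeline("chunk.wav")

# each region comes back as a (segment, track, label) triple
for segment, track, label in annotation.itertracks(yield_label=True):
    print(f"{segment.start:.1f}s - {segment.end:.1f}s: {label}")
```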
More options
```
prodigy pyannote.audio [options] dataset source pipeline

  dataset          Prodigy dataset to save annotations to.
  source           Path to directory containing audio files to annotate.
  pipeline         Name of pretrained pipeline on huggingface.co
                   (e.g. pyannote/speaker-segmentation) or path to local YAML file.

  -chunk DURATION  Split audio files into shorter chunks of that many seconds.
                   Defaults to 10s.
  -precision STEP  Temporal precision of keyboard controls, in milliseconds.
                   Defaults to 200ms.
  -beep            Produce a beep when the player reaches the end of a region.
```
```
prodigy pyannote.review dataset /path/to/audio/directory input1.rttm,input2.rttm
```
The `pyannote.review` recipe takes as many annotation files (in the RTTM file format) as you want and lets you compare them and choose which ones are best, within the same stream as the `pyannote.audio` recipe.
Click on a segment of an annotation file to add it to the output annotation, or on "Input X" to add all of its segments at once.
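If you want to inspect the input annotations programmatically before (or after) reviewing them, RTTM files can be loaded with pyannote.database. A minimal sketch, reusing the `input1.rttm` and `input2.rttm` files from the example above:

```python
# A minimal sketch: load the RTTM inputs and check which files they cover.
from pyannote.database.util import load_rttm

# load_rttm returns a {uri: Annotation} dictionary, one entry per audio file
input1 = load_rttm("input1.rttm")
input2 = load_rttm("input2.rttm")

common_uris = set(input1) & set(input2)
for uri in sorted(common_uris):
    print(uri, "speakers:", input1[uri].labels(), "vs", input2[uri].labels())
```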
More options
```
prodigy pyannote.review [options] dataset source annotations

  dataset          Prodigy dataset to save annotations to.
  source           Path to directory containing audio files whose annotation is to be checked.
  annotations      Comma-separated paths to annotation files.

  -chunk DURATION  Split audio files into shorter chunks of that many seconds.
                   Defaults to 30s.
  -diarization     Make an optimal one-to-one mapping between the first annotation and the others.
  -precision STEP  Temporal precision of keyboard controls, in milliseconds.
                   Defaults to 200ms.
  -beep            Produce a beep when the player reaches the end of a region.
```
```
prodigy pyannote.diff dataset /path/to/audio/directory /path/to/reference.rttm /path/to/hypothesis.rttm
```
The `pyannote.diff` recipe takes one reference file and one hypothesis file (both in the RTTM file format) and focuses on the regions with the most errors among missed detections, false alarms, and speaker confusions.
You can filter on one or more error types and on their minimum duration with the corresponding options.
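These are the same error components computed by pyannote.metrics. If you want to get the numbers outside of Prodigy, here is a minimal sketch (assuming `reference.rttm` and `hypothesis.rttm` cover the same files):

```python
# A minimal sketch: break down the diarization error into missed detection,
# false alarm and speaker confusion with pyannote.metrics.
from pyannote.database.util import load_rttm
from pyannote.metrics.diarization import DiarizationErrorRate

reference = load_rttm("reference.rttm")
hypothesis = load_rttm("hypothesis.rttm")

metric = DiarizationErrorRate()
for uri, ref in reference.items():
    hyp = hypothesis[uri]
    # detailed=True returns the individual error components (in seconds)
    components = metric(ref, hyp, detailed=True)
    print(uri,
          "missed detection:", components["missed detection"],
          "false alarm:", components["false alarm"],
          "confusion:", components["confusion"])
```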
More options
```
prodigy pyannote.diff [options] dataset source reference hypothesis

  dataset                 Prodigy dataset to save annotations to.
  source                  Path to directory containing audio files whose annotation is to be checked.
  reference               Path to reference file.
  hypothesis              Path to hypothesis file.

  -chunk DURATION         Split audio files into shorter chunks of that many seconds.
                          Defaults to 30s.
  -min-duration DURATION  Minimum duration of errors, in milliseconds.
                          Defaults to 200ms.
  -diarization            Make an optimal one-to-one mapping between reference and hypothesis.
  -false-alarm            Display false alarm errors.
  -speaker-confusion      Display confusion errors.
  -missed-detection       Display missed detection errors.
```
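For the record, the one-to-one mapping used by `-diarization` matches hypothesis speakers to reference speakers so that label names do not have to agree between the two files. A minimal sketch of the same idea with pyannote.metrics (not necessarily the recipe's exact implementation):

```python
# A minimal sketch: compute an optimal one-to-one speaker mapping between
# a reference and a hypothesis annotation before comparing them.
from pyannote.database.util import load_rttm
from pyannote.metrics.diarization import DiarizationErrorRate

reference = load_rttm("reference.rttm")
hypothesis = load_rttm("hypothesis.rttm")

metric = DiarizationErrorRate()
for uri, ref in reference.items():
    hyp = hypothesis[uri]
    # one-to-one assignment of hypothesis labels to reference labels,
    # maximizing the total overlap between matched speakers
    mapping = metric.optimal_mapping(ref, hyp)
    print(uri, mapping)
```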
The 🗄 `pyannote.database` recipe is still work in progress.
Though `pyannote.audio` recipes are built on top of the Prodigy audio interface, they provide a bunch of handy additional keyboard shortcuts.
| Shortcut | Description |
|---|---|
| `left` / `right` (+ `w`) | Shift player cursor (speed up) |
| `up` / `down` | Switch active region |
| `shift + left` / `shift + right` | Shift active region start time |
| `ctrl + left` / `ctrl + right` | Shift active region end time |
| `shift + up` | Create a new region |
| `shift + down` / `backspace` | Remove active region |
| `spacebar` | Play/pause player |
| `escape` | Ignore this sample |
| `enter` | Validate annotation |
RTTM files contain one line per speech turn, using the following convention:
```
SPEAKER {uri} 1 {start_time} {duration} <NA> <NA> {speaker_id} <NA> <NA>
```

- `uri`: file identifier (as given by pyannote.database protocols)
- `start_time`: speech turn start time, in seconds
- `duration`: speech turn duration, in seconds
- `confidence`: confidence score (can be anything, not used for now)
- `gender`: speaker gender (can be anything, not used for now)
- `speaker_id`: speaker identifier
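Because RTTM is a plain whitespace-separated text format, the fields above are easy to extract in Python. A minimal sketch (`parse_rttm_line` is a hypothetical helper, not part of pyannote):

```python
# A minimal sketch: parse the fields described above from one RTTM line.
def parse_rttm_line(line: str) -> dict:
    # SPEAKER {uri} 1 {start_time} {duration} <NA> <NA> {speaker_id} <NA> <NA>
    fields = line.strip().split()
    return {
        "uri": fields[1],
        "start_time": float(fields[3]),
        "duration": float(fields[4]),
        "speaker_id": fields[7],
    }

# example line following the convention above (arbitrary values)
print(parse_rttm_line("SPEAKER filename 1 3.168 0.800 <NA> <NA> spk01 <NA> <NA>"))
```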