Running pretrained models on batches of data #663

jonrein · 2021-04-19T19:39:30Z

The tutorials give some really nice examples of how to apply a pretrained model to a single test file, but it's not clear to me how to apply one to a large set of my own data (parallelized in batches to speed up processing). One of the tutorials (https://github.com/pyannote/pyannote-audio/tree/master/tutorials/pretrained/pipeline) mentions defining one's own protocol and points to the data prep tutorial (https://github.com/pyannote/pyannote-audio/tree/master/tutorials/data_preparation), but that tutorial seems to be mostly about training. What is required to create a protocol for unannotated test data?

Thanks!

hadware · 2021-04-19T21:41:01Z

You should define your own pyannote database protocol (as you'll see in the README, there are a couple of ways to do this) and leave the train and dev sets empty, which is fine for pyannote. If I'm not mistaken, pyannote doesn't care if audio files in the test set have no annotations (although you obviously won't be able to score the model's performance).
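To make the "test set only" idea concrete, here is a minimal sketch that writes the two files such a protocol would need: a list of file URIs and a `database.yml` that declares only a `test` subset. It uses the standard library only. The helper name `write_test_only_protocol`, the names `MyDatabase` and `MyProtocol`, and the `.wav` extension are all made up for illustration, and the YAML layout is my reading of the pyannote.database README, so please check it against the version you have installed.

```python
from pathlib import Path


def write_test_only_protocol(audio_dir, out_dir,
                             database="MyDatabase", protocol="MyProtocol"):
    """Write test.lst and database.yml for a protocol with only a test subset.

    Hypothetical helper: names and YAML layout are assumptions, not part of
    pyannote itself.
    """
    audio_dir = Path(audio_dir).resolve()
    out_dir = Path(out_dir).resolve()
    out_dir.mkdir(parents=True, exist_ok=True)

    # One URI per line: the audio file name without its extension.
    uris = sorted(p.stem for p in audio_dir.glob("*.wav"))
    lst_path = out_dir / "test.lst"
    lst_path.write_text("".join(f"{uri}\n" for uri in uris))

    # No train or development entries, and no annotation files for test.
    config = (
        "Databases:\n"
        f"  {database}: {audio_dir.as_posix()}/{{uri}}.wav\n"
        "\n"
        "Protocols:\n"
        f"  {database}:\n"
        "    SpeakerDiarization:\n"
        f"      {protocol}:\n"
        "        test:\n"
        f"          uri: {lst_path.as_posix()}\n"
    )
    yml_path = out_dir / "database.yml"
    yml_path.write_text(config)
    return uris, yml_path
```

With those files in place, the usual route (as far as I understand it) is to point the `PYANNOTE_DATABASE_CONFIG` environment variable at the generated `database.yml`, load the protocol with `pyannote.database.get_protocol("MyDatabase.SpeakerDiarization.MyProtocol", preprocessors={"audio": FileFinder()})`, and loop over `protocol.test()` to apply the pretrained pipeline to each file.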
