This repository contains code and a sample of the data used in the following paper:
Oudyk, K., Lostanlen, V., Salamon, J., Farnsworth, A., and Bello, J. (2019). Matching human vocal imitations to birdsong: An exploratory analysis. In Proc. 2nd Intl. Workshop on Vocal Interactivity in-and-between Humans, Animals and Robots (VIHAR), London, England.
We explore computational strategies for matching human vocal imitations of birdsong to actual birdsong recordings. We recorded human vocal imitations of birdsong and subsequently analysed these data using three categories of audio features for matching imitations to original birdsong: spectral, temporal, and spectrotemporal. These exploratory analyses suggest that spectral features can help distinguish imitation strategies (e.g. whistling vs. singing) but are insufficient for distinguishing species. Similarly, whereas temporal features are correlated between human imitations and natural birdsong, they are also insufficient. Spectrotemporal features showed the greatest promise, in particular when used to extract a representation of the pitch contour of birdsong and human imitations. This finding suggests a link between the task of matching human imitations to birdsong and retrieval tasks in the music domain such as query-by-humming and cover song retrieval; we borrow from such existing methodologies to outline directions for future research.
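To make the pitch-contour idea concrete, here is a minimal sketch of how two contours could be compared in the spirit of query-by-humming. It is an illustration under stated assumptions, not the method used in the paper or the code in this repository: it assumes pitch contours have already been extracted as arrays of fundamental frequency in Hz (voiced frames only), it uses plain dynamic time warping as one common choice of alignment, and the function names `normalize_contour`, `dtw_distance`, and `contour_distance` are hypothetical.

```python
import numpy as np


def normalize_contour(f0_hz):
    """Convert a pitch contour in Hz to cents relative to its own median.

    Subtracting the median removes the overall register, so an imitation
    sung several octaves below the bird can still be compared by shape.
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    cents = 1200.0 * np.log2(f0_hz)
    return cents - np.median(cents)


def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D sequences.

    The accumulated cost is divided by len(a) + len(b) so that contours
    of different durations yield comparable values.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m] / (n + m)


def contour_distance(imitation_hz, birdsong_hz):
    """Register-invariant, tempo-tolerant distance between two contours."""
    return dtw_distance(normalize_contour(imitation_hz), normalize_contour(birdsong_hz))


# Synthetic contours (assumed data, for illustration only).
t = np.linspace(0.0, 1.0, 50)
birdsong = 4000.0 * 2.0 ** (0.5 * t)        # rising sweep starting at 4 kHz
imitation = 1000.0 * 2.0 ** (0.5 * t[::2])  # same shape, two octaves lower, fewer frames
unrelated = 1000.0 * 2.0 ** (-0.5 * t)      # falling sweep

d_match = contour_distance(imitation, birdsong)
d_mismatch = contour_distance(unrelated, birdsong)
print(d_match < d_mismatch)
```

In this toy setup the rising imitation is closer to the rising birdsong than the falling sweep is, even though the imitation is lower in pitch and has fewer frames. Real recordings would additionally need pitch tracking and voicing detection, which this sketch leaves out.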
@INPROCEEDINGS{oudyk2019matching,
author = {Kendra Oudyk and Vincent Lostanlen and Justin Salamon and Andrew Farnsworth and Juan Bello},
title = {Matching human vocal imitations to birdsong: An exploratory analysis},
booktitle = {Proc. 2nd Intl. Workshop on Vocal Interactivity in-and-between Humans, Animals and Robots (VIHAR), London, England},
year = {2019},
}
Please help us improve BirdVox-imitation by sending your feedback to:
[email protected] and [email protected]
In case of a problem, please include as many details as possible.
We thank all the participants who anonymously volunteered to make these imitations. We also thank all contributors to the Xeno-Canto community, and in particular the authors of the recordings that are featured in BirdVox-imitation in the form of short excerpts.
This project was supported by the Leon Levy Foundation, the National Science Foundation’s Big Data grant 1633206, and a travel grant from the University of Jyväskylä (KO).