Audio Bigrams as a Unifying Model of Pitch-based Song Description
Publication date
2015
Document Type
Contribution to conference
Abstract
In this paper, we provide a novel perspective on a family of music description algorithms that perform what could be referred to as 'soft' audio fingerprinting. These algorithms convert fragments of musical audio to one or more fixed-size vectors that can be used in distance computation and indexing, not just for traditional audio fingerprinting applications, but also for retrieval of cover songs from a large collection, and corpus-level description of music. We begin with a high-level overview of the algorithms. Next, we identify and formalize an underlying paradigm that allows us to see them as variations of the same model. Finally, we present pytch, a Python implementation of the model that accommodates several of the reviewed algorithms and allows for a variety of applications. The implementation is available online and open to extensions and contributions.
Keywords
Audio fingerprinting, Cover detection, Convolutional neural networks
Citation
Van Balen, J, Wiering, F & Veltkamp, R 2015, 'Audio Bigrams as a Unifying Model of Pitch-based Song Description', Paper presented at 11th International Symposium on Computer Music Multidisciplinary Research (CMMR), Plymouth, United Kingdom, 16/06/15 - 19/06/15.