Comparing Audio Boundary Annotation of Vocal Polyphony: Experts, Non-experts, and Algorithms

Publication date

2024-07-06

Authors

Visscher, Mirjam (ORCID 0000-0003-2152-0278)
Wiering, F. (ORCID 0000-0002-2984-8932; ISNI 0000000053360131)

Document Type

Contribution to conference
Open Access

License

CC BY-NC-ND

Abstract

It is a challenging computational problem to perform segmentation on vocal polyphony from the Renaissance and early Baroque. In this genre, boundaries between segments are often hidden by overlapping voices. To test algorithms for segmentation, we need boundary annotations by humans as a ground truth, but experts in this field are rare and short on time. Our study aims to evaluate the effectiveness of segmentation algorithms on vocal polyphony using both expert and non-expert annotations. For this, we collect boundary annotations by human experts and non-experts on polyphony. Then, we compare the annotations by the two groups to see whether we can use segmentations by non-experts instead of experts. Finally, we use the expert annotations to evaluate different segmentation algorithms from the MSAF library by Nieto and Bello. The results show that the performance of non-experts comes quite close to that of experts, whereas the tested algorithms are not yet able to perform the task at a similar level. We conclude that non-expert annotations are adequate to act as ground truth for evaluating boundary detectors on vocal polyphony and we present next steps to create a larger dataset for such evaluations.

Keywords

Renaissance music, boundaries, music perception, cadences, sound and music computing

Citation

Visscher, M & Wiering, F 2024, 'Comparing Audio Boundary Annotation of Vocal Polyphony: Experts, Non-experts, and Algorithms', Paper presented at Sound and Music Computing Conference, Porto, Portugal, 4/07/24 - 6/07/24, pp. 1-8. https://doi.org/10.5281/zenodo.14337733