Emotion based segmentation of musical audio

Publication date

2015

Authors

Aljanaki, A. (ISNI 0000000419508357)
Wiering, Frans (ORCID 0000-0002-2984-8932; ISNI 0000000053360131)
Veltkamp, Remco (ISNI 0000000109665680)

Document Type

Part of book
Open Access

License

CC BY

Abstract

The dominant approach to musical emotion variation detection tracks emotion continuously over time, usually at a time resolution of one second. In this paper we discuss the problems associated with this approach and propose moving to coarser time resolutions when tracking emotion over time. We argue that, from the listener’s point of view, it is more natural to regard emotional variation in music as a progression of emotionally stable segments. To enable such tracking, music must be segmented at its emotional boundaries. To address this problem, we conduct a formal evaluation of different segmentation methods applied to the task of emotional boundary detection. We collect emotional boundary annotations from three annotators for 52 musical pieces from the RWC music collection that already have structural annotations from the SALAMI dataset. We investigate how well structural segmentation explains emotional segmentation and find a large overlap, though about a quarter of emotional boundaries do not coincide with structural ones. We also study inter-annotator agreement on emotional segmentation. Lastly, we evaluate different unsupervised segmentation methods on emotional boundary detection and find that, in terms of F-measure, the Structural Features method performs best.
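
As an illustration of the evaluation measure named in the abstract, the following is a minimal sketch of a boundary-detection F-measure: an estimated boundary counts as a hit when it falls within a tolerance window of a reference boundary that has not already been matched. The function name `boundary_f_measure`, the greedy matching strategy, and the 3-second default tolerance are assumptions made for this example; the abstract does not state which tolerance or matching procedure the authors used.

```python
from typing import Sequence, Tuple


def boundary_f_measure(
    reference: Sequence[float],
    estimated: Sequence[float],
    tolerance: float = 3.0,  # assumed window in seconds; not given in the abstract
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) for two lists of boundary times."""
    ref = sorted(reference)
    est = sorted(estimated)
    hits = 0
    i = j = 0
    # Greedy one-to-one matching: walk both sorted lists in parallel and
    # pair each reference boundary with at most one estimated boundary.
    while i < len(ref) and j < len(est):
        if abs(ref[i] - est[j]) <= tolerance:
            hits += 1
            i += 1
            j += 1
        elif est[j] < ref[i]:
            j += 1  # estimate is too early to match this or any later reference
        else:
            i += 1  # reference is too early to match this or any later estimate
    precision = hits / len(est) if est else 0.0
    recall = hits / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy boundary times in seconds (hypothetical, not taken from the paper's data).
annotated = [10.0, 20.0, 30.0, 40.0]
detected = [10.5, 19.0, 33.5, 50.0]
precision, recall, f_measure = boundary_f_measure(annotated, detected)
```

With these toy values, two of the four detected boundaries fall within the window of an annotated boundary, so precision, recall, and F-measure all come out at 0.5. Widening or narrowing `tolerance` changes which boundaries count as hits, which is why the window should always be reported alongside the score.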

Citation

Aljanaki, A., Wiering, F. & Veltkamp, R. C. 2015, Emotion based segmentation of musical audio. In Proceedings of the 16th Conference of the International Society for Music Information Retrieval (ISMIR 2015), pp. 770-776. https://doi.org/10.5281/zenodo.1418201