Listening to Oral History: Emotion Annotation and Recognition in the ACT UP Oral History Project

Publication date

2025-10-26

Authors

Pessanha, Francisca (ORCID 0000-0002-3711-7814; ISNI 0000000524640122)
Padovani, Ian
van Klaveren, Justus
Kaya, Heysem (ORCID 0000-0001-7947-5508; ISNI 000000049289651X)
Akdag, Almila (ORCID 0000-0002-7204-5633; ISNI 0000000050543653)
Masthoff, Judith (ISNI 000000012419854X)

Document Type

Part of book
Open Access

License

CC BY-NC

Abstract

Oral History Archives (OHA) have increasingly benefited from computational methods to process large interview collections, leveraging automatic speech recognition and natural language processing techniques to transcribe and navigate spoken content. However, oral historians argue that transcripts alone lack the emotional depth and contextual nuance conveyed through audio. In this work, we investigate the distinction between "what is said" and "how it is said" in the ACT UP Oral History Project through an emotion annotation study. We propose a modality (dis)agreement-based method to pre-select emotionally rich samples by comparing textual and audio-based emotion predictions, and we provide an open-source emotion annotation tool. Additionally, we extend Krippendorff's alpha by incorporating VAD-space distance to more accurately illustrate annotator reliability. We then use these annotations to train unimodal and multimodal emotion recognition models. Our findings highlight the inherent ambiguity of emotion annotation, especially in paralinguistics, underscoring how speech both enriches and complicates the emotional interpretation of content. Furthermore, annotator confidence ratings significantly correlate with agreement levels, offering a proxy for ambiguity. Finally, when comparing the performance of 4-class classifiers and regression models for emotion recognition, we find that paralinguistic features are more informative than linguistic ones for the task at hand, and that regression models better capture the emotional ambiguity of the ACT UP Oral History Project interviews. We achieve a macro F1-score of 0.66 on the test set using emotion-specific regression models, in line with the state of the art in Oral History.

Keywords

Oral History Archives, Cultural Heritage, Emotion Annotation, Emotion Recognition

Citation

Pessanha, F, Padovani, I, van Klaveren, J, Kaya, H, Akdag, A & Masthoff, J 2025, Listening to Oral History: Emotion Annotation and Recognition in the ACT UP Oral History Project. in SUMAC '25: Proceedings of the 7th International Workshop on analySis, Understanding and proMotion of heritAge Contents. Association for Computing Machinery, pp. 41-50. https://doi.org/10.1145/3746273.3760204