Listening to Oral History: Emotion Annotation and Recognition in the ACT UP Oral History Project

Pessanha, Francisca; Padovani, Ian; van Klaveren, Justus; Kaya, Heysem; Akdag, Almila; Masthoff, Judith

doi:https://doi.org/10.1145/3746273.3760204

Files

3746273.3760204.pdf (3.45 MB)

Publication date

2025-10-26

Authors

Pessanha, Francisca

Padovani, Ian

van Klaveren, Justus

Kaya, Heysem

Akdag, Almila

Masthoff, Judith

DOI

https://doi.org/10.1145/3746273.3760204

Document Type

Part of book

Collections

Utrecht University Repository

License

CC BY-NC

Abstract

Oral History Archives (OHA) have increasingly benefited from computational methods to process large interview collections, leveraging automatic speech recognition and natural language processing techniques to transcribe and navigate spoken content. However, oral historians argue that transcripts alone lack the emotional depth and contextual nuance conveyed through audio. In this work, we investigate the distinction between "what is said" and "how it is said" in the ACT UP Oral History Project through an emotion annotation study. We propose a modality (dis)agreement-based method to pre-select emotionally rich samples by comparing textual and audio-based emotion predictions, and we provide an open-source emotion annotation tool. Additionally, we extend Krippendorff's alpha by incorporating VAD-space distance to more accurately illustrate annotator reliability. We then use these annotations to train unimodal and multimodal emotion recognition models. Our findings highlight the inherent ambiguity of emotion annotation, especially in paralinguistics, underscoring how speech both enriches and complicates the emotional interpretation of content. Furthermore, annotator confidence ratings significantly correlate with agreement levels, offering a proxy for ambiguity. Finally, when comparing the performance of 4-class classifiers and regression models for emotion recognition, we find that paralinguistic features are more informative than linguistic ones for the task at hand, and that regression models better capture the emotional ambiguity of the ACT UP Oral History Project interviews. We achieved a macro F1-score of 0.66 on the test set using emotion-specific regression models, in line with the state-of-the-art in Oral History.
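
The abstract mentions extending Krippendorff's alpha with a VAD-space (valence, arousal, dominance) distance but does not spell out the formulation. As a rough illustration only, the sketch below computes the general form of Krippendorff's alpha, 1 - D_o / D_e, with a pluggable squared-distance function, and plugs in squared Euclidean distance between VAD coordinates. The label set, the VAD coordinates, and the function names `vad_sq_distance` and `krippendorff_alpha` are assumptions made for this example and are not taken from the paper.

```python
from itertools import permutations
import math

# Hypothetical label set with illustrative VAD coordinates in [-1, 1].
# These numbers are placeholders for the example, not values from the paper.
VAD = {
    "happy": (0.8, 0.5, 0.4),
    "angry": (-0.5, 0.7, 0.3),
    "sad": (-0.6, -0.4, -0.3),
    "neutral": (0.0, 0.0, 0.0),
}


def vad_sq_distance(a, b):
    """Squared Euclidean distance between two labels in the assumed VAD space."""
    return sum((x - y) ** 2 for x, y in zip(VAD[a], VAD[b]))


def nominal_sq_distance(a, b):
    """Standard nominal metric: 0 if the labels match, 1 otherwise."""
    return 0.0 if a == b else 1.0


def krippendorff_alpha(units, sq_distance):
    """Krippendorff's alpha for a list of units, each a list of labels.

    Units with fewer than two labels cannot be paired and are skipped.
    Returns NaN when the expected disagreement is zero (alpha undefined).
    """
    pairable = [u for u in units if len(u) >= 2]
    values = [v for u in pairable for v in u]
    n = len(values)
    if n < 2:
        raise ValueError("need at least one unit with two or more labels")

    # Observed disagreement: ordered pairs within each unit,
    # weighted by 1 / (m_u - 1), averaged over all n pairable values.
    observed = sum(
        sum(sq_distance(a, b) for a, b in permutations(u, 2)) / (len(u) - 1)
        for u in pairable
    ) / n

    # Expected disagreement: ordered pairs across all pooled values.
    expected = sum(
        sq_distance(a, b) for a, b in permutations(values, 2)
    ) / (n * (n - 1))

    if expected == 0:
        return math.nan
    return 1.0 - observed / expected


if __name__ == "__main__":
    # Three annotators label four segments (toy data).
    ratings = [
        ["happy", "happy", "neutral"],
        ["angry", "angry", "sad"],
        ["sad", "sad", "sad"],
        ["neutral", "happy", "neutral"],
    ]
    print("nominal alpha:", krippendorff_alpha(ratings, nominal_sq_distance))
    print("VAD alpha:    ", krippendorff_alpha(ratings, vad_sq_distance))
```

With a distance-aware metric, a disagreement between labels that lie close together in VAD space is penalised less than one between distant labels, whereas the nominal metric treats every mismatch equally. Whether the paper uses Euclidean distance, squared or not, is not stated in the abstract.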

Keywords

Oral History Archives, Cultural Heritage, Emotion Annotation, Emotion Recognition

Citation

Pessanha, F, Padovani, I, van Klaveren, J, Kaya, H, Akdag, A & Masthoff, J 2025, Listening to Oral History: Emotion Annotation and Recognition in the ACT UP Oral History Project. in SUMAC '25: Proceedings of the 7th International Workshop on analySis, Understanding and proMotion of heritAge Contents. Association for Computing Machinery, pp. 41-50. https://doi.org/10.1145/3746273.3760204

URI

https://dspace.library.uu.nl/handle/1874/483177
