Multi-modal Arousal and Valence Estimation under Noisy Conditions

Publication date

2024-06-16

Authors

Dresvyanskiy, Denis
Markitantov, Maxim
Yu, Jiawei
Kaya, Heysem (ORCID 0000-0001-7947-5508, ISNI 000000049289651X)
Karpov, Alexey

Document Type

Part of book
Open Access

License

CC BY

Abstract

Automatic emotion recognition has gained significant attention over the past two decades due to the central role that emotions play in human communication. While multi-modal systems achieve high performance on laboratory-controlled data, their validity on data collected outside the laboratory, so-called 'in-the-wild' data, remains a challenge. This work investigates audio-visual deep learning approaches for in-the-wild emotion recognition, with a particular focus on the effectiveness of architectures based on fine-tuned Convolutional Neural Networks (CNN) for the video modality and the Public Dimensional Emotion Model (PDEM) for the audio modality. We explore and compare various temporal modeling techniques (e.g., transformer architectures) and fusion strategies, leveraging the embeddings from the developed modality-specific Deep Neural Networks (DNN), which are trained in multiple stages. The results are reported on the AffWild2 dataset following the Affective Behavior Analysis in-the-Wild 2024 (ABAW'24) challenge protocol. Our investigation highlights the complexities of robust multi-modal emotion recognition in an unconstrained environment and provides insights into the usage of various deep learning architectures for tackling this challenging task.
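
The abstract does not name the evaluation metric. As an assumption, the sketch below uses the concordance correlation coefficient (CCC), averaged over valence and arousal, which is the measure commonly used for the ABAW valence-arousal estimation task. The function names `ccc` and `va_score` are illustrative and do not come from the paper.

```python
def ccc(y_true, y_pred):
    """Concordance correlation coefficient between two equal-length sequences.

    Assumed metric: the abstract only says results follow the ABAW'24
    protocol; CCC is not named there.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Population (biased) variances and covariance.
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n
    denom = var_t + var_p + (mean_t - mean_p) ** 2
    if denom == 0.0:
        # Both sequences are constant and identical; CCC is undefined.
        raise ZeroDivisionError("CCC is undefined for identical constant inputs")
    return 2.0 * cov / denom


def va_score(valence_true, valence_pred, arousal_true, arousal_pred):
    """Mean of the valence CCC and the arousal CCC (hypothetical helper)."""
    return 0.5 * (ccc(valence_true, valence_pred) + ccc(arousal_true, arousal_pred))
```

Unlike the Pearson correlation, CCC also penalizes a constant offset or a scale mismatch between predictions and labels, so a model whose outputs track the annotations but are shifted away from them scores below 1.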

Keywords

multimodal emotion recognition, Affective Computing

Citation

Dresvyanskiy, D, Markitantov, M, Yu, J, Kaya, H & Karpov, A 2024, Multi-modal Arousal and Valence Estimation under Noisy Conditions. in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. IEEE, pp. 4772-4783. <https://openaccess.thecvf.com/content/CVPR2024W/ABAW/html/Dresvyanskiy_Multi-modal_Arousal_and_Valence_Estimation_under_Noisy_Conditions_CVPRW_2024_paper.html>