Multi-modal Arousal and Valence Estimation under Noisy Conditions

Dresvyanskiy, Denis; Markitantov, Maxim; Yu, Jiawei; Kaya, Heysem; Karpov, Alexey

Files

Dresvyanskiy_Multi-modal_Arousal_and_Valence_Estimation... (1.32 MB)

Publication date

2024-06-16

Authors

Dresvyanskiy, Denis

Markitantov, Maxim

Yu, Jiawei

Kaya, Heysem

Karpov, Alexey

Document Type

Part of book

Collections

Utrecht University Repository

License

cc_by

Abstract

Automatic emotion recognition has gained significant attention over the past two decades due to the central role that emotions play in human communication. While multi-modal systems achieve high performance on laboratory-controlled data, their validity on non-laboratory-controlled, so-called 'in-the-wild', data remains a challenge. This work investigates audio-visual deep learning approaches for emotion recognition in-the-wild, with a particular focus on the effectiveness of architectures based on fine-tuned Convolutional Neural Networks (CNN) and the Public Dimensional Emotion Model (PDEM) for the video and audio modalities, respectively. We explore and compare various temporal modeling techniques (e.g., transformer architectures) and fusion strategies, leveraging the embeddings from the developed modality-specific Deep Neural Networks (DNN), which are trained in multiple stages. The results are reported on the AffWild2 dataset following the Affective Behavior Analysis in-the-Wild 2024 (ABAW'24) challenge protocol. Our investigation highlights the complexities of robust multi-modal emotion recognition in an unconstrained environment, providing insights into the usage of various deep learning architectures for tackling this challenging task.

Keywords

multimodal emotion recognition, Affective Computing

Citation

Dresvyanskiy, D, Markitantov, M, Yu, J, Kaya, H & Karpov, A 2024, Multi-modal Arousal and Valence Estimation under Noisy Conditions. in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. IEEE, pp. 4772-4783. <https://openaccess.thecvf.com/content/CVPR2024W/ABAW/html/Dresvyanskiy_Multi-modal_Arousal_and_Valence_Estimation_under_Noisy_Conditions_CVPRW_2024_paper.html>

URI

https://dspace.library.uu.nl/handle/1874/482121
