Fully-attentive and interpretable: vision and video vision transformers for pain detection

Fiorentini, Giacomo; Önal Ertuğrul, Itir; Salah, Albert

Files

VTTA_25_fully_attentive_and_interpretable.pdf (1.7 MB)

Publication date

2022-12

Authors

Fiorentini, Giacomo

Önal Ertuğrul, Itir

Salah, Albert Ali

Document Type

Contribution to conference

Collections

Utrecht University Repository

License

CC BY

Abstract

Pain is a serious and costly issue globally, but to be treated, it must first be detected. Vision transformers are a top-performing architecture in computer vision, with little research on their use for pain detection. In this paper, we propose the first fully-attentive automated pain detection pipeline that achieves state-of-the-art performance on binary pain detection from facial expressions. The model is trained on the UNBC-McMaster dataset, after faces are 3D-registered and rotated to the canonical frontal view. In our experiments we identify important areas of the hyperparameter space and their interaction with vision and video vision transformers, obtaining 3 noteworthy models. We analyse the attention maps of one of our models, finding reasonable interpretations for its predictions. We also evaluate Mixup, an augmentation technique, and Sharpness-Aware Minimization, an optimizer, with no success. Our presented models, ViT-1 (F1 score 0.55 ± 0.15), ViViT-1 (F1 score 0.55 ± 0.13), and ViViT-2 (F1 score 0.49 ± 0.04), all outperform earlier works, showing the potential of vision transformers for pain detection.

Citation

Fiorentini, G, Önal Ertuğrul, I & Salah, A 2022, 'Fully-attentive and interpretable: vision and video vision transformers for pain detection', Paper presented at NeurIPS 2022, 28/11/22 - 9/12/22. <https://sites.google.com/view/vtta-neurips2022/accepted-papers>

URI

http://hdl.handle.net/1874/426457
