Gradations of Error Severity in Automatic Image Descriptions

van Miltenburg, Emiel; Lu, Wei-Ting; Krahmer, Emiel; Gatt, Albert; Chen, Guanyi; Li, Lin; van Deemter, Kees

Files

2020.inlg_1.45.pdf (6.36 MB)

Publication date

2020-12-01

Authors

van Miltenburg, Emiel

Lu, Wei-Ting

Krahmer, Emiel

Gatt, Albert

Chen, Guanyi

Li, Lin

van Deemter, Kees

Editors

Davis, Brian

Graham, Yvette

Kelleher, John

Sripada, Yaji

Document Type

Part of book

Collections

Utrecht University Repository

License

Taverne

Abstract

Earlier research has shown that evaluation metrics based on textual similarity (e.g., BLEU, CIDEr, Meteor) do not correlate well with human evaluation scores for automatically generated text. We carried out an experiment with Chinese speakers, where we systematically manipulated image descriptions to contain different kinds of errors. Because our manipulated descriptions form minimal pairs with the reference descriptions, we are able to assess the impact of different kinds of errors on the perceived quality of the descriptions. Our results show that different kinds of errors elicit significantly different evaluation scores, even though all erroneous descriptions differ in only one character from the reference descriptions. Evaluation metrics based solely on textual similarity are unable to capture these differences, which (at least partially) explains their poor correlation with human judgments. Our work provides the foundations for future work, where we aim to understand why different errors are seen as more or less severe.
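The point about minimal pairs can be illustrated with a small sketch. The example below is not the authors' method or data: the sentences are invented for illustration, the names `char_edit_distance` and `char_similarity` are hypothetical, and a character-level edit distance stands in for the similarity metrics named in the abstract (it is not BLEU, CIDEr, or Meteor). It only shows that a score computed purely from surface overlap assigns the same value to two different one-character errors.

```python
def char_edit_distance(a: str, b: str) -> int:
    """Levenshtein distance over characters, using row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - char_edit_distance(a, b) / longest


# Invented sentences (assumption, not taken from the study).
reference = "一个男人在骑马"  # "a man is riding a horse"
variants = {
    "person": "一个女人在骑马",  # "man" replaced by "woman"
    "object": "一个男人在骑牛",  # "horse" replaced by "cow"
}

# Both variants differ from the reference in exactly one character,
# so a purely surface-based score cannot tell them apart.
scores = {kind: char_similarity(reference, text) for kind, text in variants.items()}
for kind, score in scores.items():
    print(kind, round(score, 3))
```

Whether readers judge one of these two errors to be worse than the other is exactly the kind of difference a score like this cannot register.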

Keywords

Taverne

Citation

van Miltenburg, E, Lu, W-T, Krahmer, E, Gatt, A, Chen, G, Li, L & van Deemter, K 2020, Gradations of Error Severity in Automatic Image Descriptions. in B Davis, Y Graham, J Kelleher & Y Sripada (eds), Proceedings of the 13th International Conference on Natural Language Generation. Association for Computational Linguistics, Dublin, Ireland, pp. 398-411. <https://www.aclweb.org/anthology/2020.inlg-1.45>

URI

http://hdl.handle.net/1874/414718
