The Effectiveness of LLMs as Annotators: A Comparative Overview and Empirical Analysis of Direct Representation

Pavlovic, Maja; Poesio, Massimo

Files

2024.nlperspectives-1.11.pdf (698.48 KB)

Publication date

2024

Authors

Pavlovic, Maja

Poesio, Massimo

Editors

Abercrombie, Gavin

Basile, Valerio

Bernardi, Davide

Dudy, Shiran

Frenda, Simona

Havens, Lucy

Tonelli, Sara

Document Type

Part of book

Collections

Utrecht University Repository

License

CC BY-NC

Abstract

Large Language Models (LLMs) have emerged as powerful support tools across various natural language tasks and a range of application domains. Recent studies focus on exploring their capabilities for data annotation. This paper provides a comparative overview of twelve studies investigating the potential of LLMs in labelling data. While the models demonstrate promising cost and time-saving benefits, there exist considerable limitations, such as representativeness, bias, sensitivity to prompt variations and English language preference. Leveraging insights from these studies, our empirical analysis further examines the alignment between human and GPT-generated opinion distributions across four subjective datasets. In contrast to the studies examining representation, our methodology directly obtains the opinion distribution from GPT. Our analysis thereby supports the minority of studies that are considering diverse perspectives when evaluating data annotation tasks and highlights the need for further research in this direction.

Keywords

annotation/labelling, large language model (LLM), representation, Language and Linguistics, Education, Library and Information Sciences, Linguistics and Language

Citation

Pavlovic, M & Poesio, M 2024, The Effectiveness of LLMs as Annotators: A Comparative Overview and Empirical Analysis of Direct Representation. in G Abercrombie, V Basile, D Bernardi, S Dudy, S Frenda, L Havens & S Tonelli (eds), 3rd Workshop on Perspectivist Approaches to NLP, NLPerspectives 2024 at LREC-COLING 2024 - Workshop Proceedings. European Language Resources Association (ELRA), pp. 100-110, 3rd Workshop on Perspectivist Approaches to NLP, NLPerspectives 2024, Torino, Italy, 21/05/24. <https://aclanthology.org/2024.nlperspectives-1.11>

URI

https://dspace.library.uu.nl/handle/1874/482106
