The CLIN33 Shared Task on the Detection of Text Generated by Large Language Models

Fivez, Pieter; Daelemans, Walter; Van de Cruys, Tim; Kashnitsky, Yury; Chamezopoulos, Savvas; Mohammadi, Hadi; Giachanou, Anastasia; Bagheri, Ayoub; Poelman, Wessel; Vladika, Juraj; Ploeger, Esther; Bjerva, Johannes; Matthes, Florian; van Halteren, Hans

Files

s12974-025-03593-2.pdf (3.18 MB)

Publication date

2024-03-21

Authors

Fivez, Pieter

Daelemans, Walter

Van de Cruys, Tim

Kashnitsky, Yury

Chamezopoulos, Savvas

Mohammadi, Hadi

Giachanou, Anastasia

Bagheri, Ayoub

Poelman, Wessel

Vladika, Juraj

Document Type

Conference article (contribution to journal)

Collections

UMC Repository

License

CC BY-NC-ND

Abstract

The Shared Task for CLIN33 focuses on a relatively novel yet societally relevant task: the detection of text generated by Large Language Models (LLMs). We frame this detection task as a binary classification problem (LLM-generated or not), using test data from up to 6 different domains and text genres for both Dutch and English. Part of this test data was held out entirely from the contestants, including a “mystery genre” which belonged to an unknown domain (later revealed to be columns). Four teams submitted 11 runs with substantially different models and features. This paper gives an overview of our task setup and contains the evaluation and detailed descriptions of the participating systems. Notably, the winning systems include both deep learning models and more traditional machine learning models that leverage task-specific feature engineering.

Keywords

Taverne, Language and Linguistics, Linguistics and Language, Computer Science Applications, Logic

Citation

Fivez, P, Daelemans, W, Van de Cruys, T, Kashnitsky, Y, Chamezopoulos, S, Mohammadi, H, Giachanou, A, Bagheri, A, Poelman, W, Vladika, J, Ploeger, E, Bjerva, J, Matthes, F & van Halteren, H 2024, 'The CLIN33 Shared Task on the Detection of Text Generated by Large Language Models', Computational Linguistics in the Netherlands Journal, vol. 13, pp. 233-259. <https://clinjournal.org/clinj/article/view/182>

URI

https://dspace.library.uu.nl/handle/1874/468974
