A Comparative Study of Large and Small Language Models for Domain Model Extraction

Chou, Cheng Yi; Aydemir, Fatma Başak; Dalpiaz, Fabiano

doi:https://doi.org/10.1007/978-3-032-21423-2_23

Files

Embargo until 2026-09-25

978-3-032-21423-2_23.pdf (410.17 KB)

Publication date

2026-03

Authors

Chou, Cheng Yi

Aydemir, Fatma Başak

Dalpiaz, Fabiano

Editors

Guizzardi, R.

Araújo, J.

DOI

https://doi.org/10.1007/978-3-032-21423-2_23

Document Type

Part of book

Collections

Utrecht University Repository

License

taverne

Abstract

[Context and Motivation] Large language models can derive conceptual models from textual requirements, offering an off-the-shelf alternative to traditional rule-based and machine-learning-based methods. [Question/Problem] Comparative evidence on the validity and completeness of different large and smaller language models for the domain model derivation task remains limited. [Principal ideas/Results] We compare GPT-o1, Llama3-8B, and Qwen-14B with the rule-based Visual Narrator using nine datasets containing user stories and corresponding domain models. Each language model was prompted with structured templates and evaluated on class and association extraction through precision, recall, and F-scores. GPT-o1 outperformed the smaller language models and matched or exceeded Visual Narrator in most tasks. Small language models produced competitive but less consistent results, revealing efficiency–accuracy trade-offs. [Contribution] We provide a systematic comparison of large language models, small language models, and rule-based modeling approaches and offer an updated evaluation framework to guide future research on the balance between scale, performance, and interpretability of the automated techniques for domain model extraction.
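
The abstract reports precision, recall, and F-scores for class and association extraction. As a rough illustration of how such scores can be computed for one dataset, the sketch below compares a set of extracted class names against a reference set. It is not the authors' evaluation code: exact matching after lowercasing and whitespace normalization, the function names `normalize` and `precision_recall_f`, and the example class names are all assumptions made here for illustration.

```python
from typing import Iterable, Tuple


def normalize(name: str) -> str:
    """Lowercase a name and collapse whitespace (an assumed matching rule)."""
    return " ".join(name.lower().split())


def precision_recall_f(
    predicted: Iterable[str], gold: Iterable[str], beta: float = 1.0
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-beta) for extracted vs. reference elements."""
    pred = {normalize(p) for p in predicted}
    ref = {normalize(g) for g in gold}
    true_positives = len(pred & ref)

    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0

    b2 = beta * beta
    f_score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_score


if __name__ == "__main__":
    # Hypothetical class names, not taken from the paper's datasets.
    extracted = ["Visitor", "Ticket", "Event", "Payment"]
    reference = ["visitor", "ticket", "event", "organizer", "venue"]
    p, r, f1 = precision_recall_f(extracted, reference)
    print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Associations could be scored with the same function by encoding each one as a string such as "visitor-buys-ticket", although a real evaluation may need looser matching (for example, ignoring direction or relationship labels).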

Keywords

large language models, Visual Narrator, domain modeling, user stories, small language models, Taverne

Citation

Chou, C Y, Aydemir, F B & Dalpiaz, F 2026, A Comparative Study of Large and Small Language Models for Domain Model Extraction. in R Guizzardi & J Araújo (eds), International Working Conference on Requirements Engineering: Foundation for Software Quality. Springer, pp. 336-351. https://doi.org/10.1007/978-3-032-21423-2_23

URI

https://dspace.library.uu.nl/handle/1874/483893