Leveraging machines to derive domain models from user stories

Bragilovski, Maxim; van Can, Ashley T.; Dalpiaz, Fabiano; Sturm, Arnon

doi:https://doi.org/10.1007/s00766-025-00442-9

Files

s00766-025-00442-9.pdf (2.76 MB)

Publication date

2025

Authors

Bragilovski, Maxim

van Can, Ashley T.

Dalpiaz, Fabiano

Sturm, Arnon

DOI

https://doi.org/10.1007/s00766-025-00442-9

Document Type

Article

Collections

Utrecht University Repository

License

CC BY

Abstract

Domain models play a crucial role in software development, as they provide a means for communication among stakeholders, for eliciting requirements, and for representing the information structure behind a database schema or for model-driven development. However, creating such models is a tedious activity and automated support may assist in obtaining an initial domain model that can later be enriched by human analysts. In this paper, we compare the effectiveness of various approaches for deriving domain models from a given set of user stories. We contrast human derivation (of both experts and novices) with machine derivation; for the latter, we compare (i) the Visual Narrator: an existing rule-based NLP approach; (ii) a machine learning classifier that we feature-engineered; and (iii) a generative AI approach that we constructed via prompt engineering with multiple configurations. Based on a benchmark dataset comprising nine collections of user stories and their corresponding domain models, the evaluation shows that while no approach matches human performance, large language models (LLMs) are not statistically outperformed by human experts in deriving classes. Additionally, a tuned version of the machine learning approach achieves results close to human performance in deriving associations. To better understand the results, we qualitatively analyze them and identify differences in the types of false positives as well as other factors that affect performance.

Keywords

Domain models, Large language models, Machine learning, Model derivation, Requirements engineering, User stories, Software, Information Systems

Citation

Bragilovski, M, van Can, A T, Dalpiaz, F & Sturm, A 2025, 'Leveraging machines to derive domain models from user stories', Requirements Engineering, vol. 30, no. 2, pp. 241–262. https://doi.org/10.1007/s00766-025-00442-9

URI

https://dspace.library.uu.nl/handle/1874/476635
