One Size Does Not Fit All: On the Role of Batch Size in Classifying Requirements with LLMs

Van Can, Ashley T.; Aydemir, Fatma Başak; Dalpiaz, Fabiano

doi:https://doi.org/10.1109/REW66121.2025.00009

Publication date

2025-10-13

Authors

van Can, Ashley T.

Aydemir, Fatma Başak

Dalpiaz, Fabiano

DOI

https://doi.org/10.1109/REW66121.2025.00009

Document Type

Part of book

Collections

Utrecht University Repository

License

taverne

Abstract

Automated requirements classification is a widely explored research topic in requirements engineering. In particular, the distinction between functional and non-functional requirements has received considerable attention. Recently, Large Language Models (LLMs) have demonstrated potential in automating requirements classification tasks. Although existing research emphasizes effective prompting strategies, it offers little evaluation of how many requirements should be processed as a batch within a single prompt to optimize classifier performance. Batch size matters when computational resources are constrained, because minimizing the number of LLM calls becomes essential. Moreover, batching requirements may provide the model with additional contextual information, potentially improving classification performance. This study therefore investigates the impact of batch size on classification performance. We assess how three locally deployable models, Llama3-8B, Gemma3-12B, and DeepSeek-Distill-Qwen 14B, perform in classifying requirements according to their functional and quality aspects. Our findings show that the optimal batch size depends on both the dataset and the model. Defaulting to a batch size of one, as is common in classification tasks, does not always yield optimal results. These findings highlight the importance of selecting a suitable batch size before performing classification tasks.
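
To make the notion of batch size concrete, the following is a minimal Python sketch of how a list of requirements might be grouped into prompts, and how the number of LLM calls shrinks as the batch size grows. It is an illustration only, not the authors' implementation: the helper names `make_batches` and `build_prompt`, the prompt wording, and the sample requirements are all assumptions introduced here, and no model is actually called.

```python
from math import ceil
from typing import Iterator, List


def make_batches(requirements: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size requirements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(requirements), batch_size):
        yield requirements[start:start + batch_size]


def build_prompt(batch: List[str]) -> str:
    """Format one batch as a single numbered prompt (hypothetical wording)."""
    numbered = "\n".join(f"{i}. {req}" for i, req in enumerate(batch, start=1))
    return (
        "For each requirement below, state whether it has a functional aspect "
        "and whether it has a quality aspect.\n" + numbered
    )


# Made-up sample requirements, used only to exercise the helpers.
requirements = [
    "The system shall export reports as PDF.",
    "The page shall load within two seconds.",
    "Users shall be able to reset their password.",
    "All stored data shall be encrypted.",
    "The app shall support 500 concurrent users.",
]

for batch_size in (1, 2, 5):
    prompts = [build_prompt(b) for b in make_batches(requirements, batch_size)]
    # One prompt corresponds to one LLM call: ceil(n / batch_size) in total.
    assert len(prompts) == ceil(len(requirements) / batch_size)
    print(f"batch_size={batch_size}: {len(prompts)} LLM call(s)")
```

With a batch size of one, every requirement is classified in isolation. Larger batches reduce the number of calls and place neighbouring requirements in the same prompt, which is the additional context the abstract refers to.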

Keywords

large language models, quality requirements, requirements classification, requirements engineering, Taverne, Artificial Intelligence, Software, Safety, Risk, Reliability and Quality, Modelling and Simulation

Citation

Van Can, A T, Aydemir, F B & Dalpiaz, F 2025, One Size Does Not Fit All: On the Role of Batch Size in Classifying Requirements with LLMs. in Proceedings - 2025 IEEE 33rd International Requirements Engineering Conference Workshops, REW 2025. IEEE, pp. 30-39, 33rd IEEE International Requirements Engineering Conference Workshops, REW 2025, Valencia, Spain, 1/09/25. https://doi.org/10.1109/REW66121.2025.00009

URI

https://dspace.library.uu.nl/handle/1874/483219
