Benchmark rating procedure, best of both worlds? Comparing procedures to rate text quality in a reliable and valid manner.

Bouwer, Renske; Koster, Monica; van den Bergh, Huub

doi:https://doi.org/10.1080/0969594X.2023.2241656

Publication date

2023-08-11

Authors

Bouwer, Renske

Koster, M.P.

van den Bergh, Huub

DOI

https://doi.org/10.1080/0969594X.2023.2241656

Document Type

Article

Collections

Utrecht University Repository

License

CC BY-NC-ND

Abstract

Assessing students’ writing performance is essential to adequately monitor and promote individual writing development, but it is also a challenge. The present research investigates a benchmark rating procedure for assessing texts written by upper-elementary students. In two studies we examined whether a benchmark rating procedure (1) leads to reliable and generalisable scores that converge with holistic and analytic ratings, and (2) can be used for rating texts varying in topic and genre. The results provide evidence that benchmark ratings are a valid indicator of text quality, as they converge with holistic and analytic scores. They are also associated with less rater variance and less task-specific variance, leading to reliable and generalisable ratings. Moreover, a benchmark scale can be used for rating different tasks with the same reliability, at least when texts are written in the same genre. Taken together, a benchmark rating procedure ensures meaningful and useful information on students’ writing.

Keywords

Writing assessment, benchmark rating procedure, generalisability, reliability, validity, Education

Citation

Bouwer, R, Koster, M & van den Bergh, H 2023, 'Benchmark rating procedure, best of both worlds? Comparing procedures to rate text quality in a reliable and valid manner.', Assessment in Education: Principles, Policy and Practice, vol. 30, no. 3-4, pp. 302-319. https://doi.org/10.1080/0969594X.2023.2241656

URI

https://dspace.library.uu.nl/handle/1874/434254
