Checkbox grading of large-scale mathematics exams with multiple assessors: Field study on assessors’ inter-rater reliability, time investment and usage experience
Publication date
2025-06
Document Type
Article
License
CC BY
Abstract
Assessing exams with multiple assessors poses challenges for inter-rater reliability and feedback. This paper presents ‘checkbox grading,’ a digital method in which exam designers predefine checkboxes, each carrying feedback and an associated partial grade. Assessors then tick the checkboxes relevant to a student solution. Dependencies between checkboxes ensure consistency among assessors in following the grading scheme. Moreover, the approach supports ‘blind grading’ by hiding the grades associated with the checkboxes, thus focusing assessors on the criteria rather than the scores. The approach was studied during a large-scale mathematics state exam. Results show that assessors perceived checkbox grading as very useful. However, compared to traditional grading, where assessors follow a correction scheme and communicate the resulting grade, checkbox grading took more time, while both approaches were equally reliable. Blind grading improved inter-rater reliability for some tasks. Overall, checkbox grading might lead to a smoother process where feedback, not solely grades, is communicated to students.
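To make the mechanism in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It models checkboxes that carry feedback and a partial grade, a simple "requires" dependency between checkboxes, and a blind view that hides the grades from assessors. The class name `Checkbox`, the helper functions, the example scheme, and the "requires" form of dependency are all assumptions made for illustration; the record does not specify how dependencies are defined in the actual tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkbox:
    """One predefined grading item (hypothetical structure)."""
    key: str
    feedback: str
    points: float
    requires: tuple[str, ...] = ()  # assumed dependency: these must be ticked too


# Hypothetical grading scheme for a single task.
SCHEME = {
    "setup": Checkbox("setup", "Equation set up correctly.", 1.0),
    "solve": Checkbox("solve", "Equation solved correctly.", 2.0, requires=("setup",)),
    "units": Checkbox("units", "Units missing in the final answer.", -0.5, requires=("solve",)),
}


def validate(scheme, ticked):
    """Return a list of dependency violations for the ticked checkboxes."""
    problems = []
    for key, box in scheme.items():
        if key not in ticked:
            continue
        for needed in box.requires:
            if needed not in ticked:
                problems.append(f"{key} requires {needed}")
    return problems


def grade(scheme, ticked):
    """Sum the partial grades of the ticked checkboxes, rejecting inconsistent input."""
    problems = validate(scheme, ticked)
    if problems:
        raise ValueError("; ".join(problems))
    return sum(box.points for key, box in scheme.items() if key in ticked)


def feedback(scheme, ticked):
    """Collect the feedback lines a student would receive."""
    return [box.feedback for key, box in scheme.items() if key in ticked]


def assessor_view(scheme, blind=True):
    """What the assessor sees; in blind mode the partial grades are hidden."""
    if blind:
        return [(box.key, box.feedback) for box in scheme.values()]
    return [(box.key, box.feedback, box.points) for box in scheme.values()]
```

In this sketch, an assessor who ticks "solve" without "setup" is stopped by the dependency check, which is one way the consistency described above could be enforced. The grade is derived from the ticked criteria rather than entered directly, so the assessor can work from the blind view alone.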
Keywords
Assessment, Computer-assisted assessment, Feedback, Inter-rater reliability, State examinations, Education
Citation
Moons, F, Vandervieren, E & Colpaert, J 2025, 'Checkbox grading of large-scale mathematics exams with multiple assessors: Field study on assessors’ inter-rater reliability, time investment and usage experience', Studies in Educational Evaluation, vol. 85, 101443. https://doi.org/10.1016/j.stueduc.2024.101443