A case-based-reasoning analysis of the COMPAS dataset
Publication date
2024
Editors
Savelka, J.
Harasta, J.
Novotna, T.
Misek, J.
Document Type
Part of book
License
CC BY-NC
Abstract
In this paper we build on a formal model of reasoning with dimensions to analyze data from the COMPAS program, a widely used and studied tool for predicting recidivism. We extend the underlying theory of the model by introducing a notion of consistency and apply it to assess whether COMPAS follows this principle in its risk assessments and supervision level recommendations. Our analysis yields three key findings. First, the program’s risk score assignments appear highly inconsistent, but we argue this is due to important input features missing from the dataset. Second, the program’s recommended supervision levels do exhibit a high degree of consistency. Third, we uncover errors in the dataset related to the conversion of raw scores to decile scores. These findings cast doubt on previous studies conducted on the COMPAS dataset and demonstrate the need for evaluation studies like ours.
Citation
van Woerkom, W, Grossi, D, Prakken, H & Verheij, B 2024, A case-based-reasoning analysis of the COMPAS dataset. in J Savelka, J Harasta, T Novotna & J Misek (eds), Legal Knowledge and Information Systems: The 37th Annual Conference. Frontiers in Artificial Intelligence and Applications, vol. 395, IOS Press, Amsterdam, Washington DC, pp. 180-190. https://doi.org/10.3233/FAIA241244