Making sense of fossils and artefacts: a review of best practices for the design of a successful workflow for machine learning-assisted citizen science projects

Publication date

2025-02-13

Authors

Eijkelboom, Isaak
Schulp, Anne (ORCID 0000-0001-9389-1540; ISNI 0000000112948139)
Amkreutz, Luc
Verheul, Dylan
Verschoof-van der Vaart, Wouter
Van der Vaart-Verschoof, Sasja
Hogeweg, Laurens
Brunink, Django
Mol, Dick
Peeters, Hans
Wesselingh, F

Document Type

Article
Open Access

Abstract

Historically, the extensive involvement of citizen scientists in palaeontology and archaeology has resulted in many discoveries and insights. More recently, machine learning has emerged as a broadly applicable tool for analysing large datasets of fossils and artefacts. In the digital age, citizen science (CS) and machine learning (ML) prove to be mutually beneficial, and a combined CS-ML approach is increasingly successful in areas such as biodiversity research. Ever-dropping computational costs and the smartphone revolution have put ML tools in the hands of citizen scientists, with the potential to generate high-quality data, create new insights from large datasets and elevate public engagement. However, without an integrated approach, new CS-ML projects may not realise their full scientific and public engagement potential. Furthermore, object-based data gathering of fossils and artefacts comes with different requirements for successful CS-ML approaches than observation-based data gathering in biodiversity monitoring. In this review we investigate best practices and common pitfalls in this new interdisciplinary field in order to formulate a workflow to guide future palaeontological and archaeological projects. Our CS-ML workflow is subdivided into four project phases: (I) preparation, (II) execution, (III) implementation and (IV) reiteration. To reach the objectives and manage the challenges for different subject domains (CS tasks, ML development, research, stakeholder engagement and app/infrastructure development), tasks are formulated and allocated to different roles in the project. We also provide an outline for an integrated online CS platform which will help reach a project’s full scientific and public engagement potential.
Finally, to illustrate the implementation of our CS-ML approach in practice and showcase differences from more commonly available biodiversity CS-ML approaches, we discuss the LegaSea project, in which fossils and artefacts from sand nourishments in the western Netherlands are studied.

Keywords

AI, Archaeology, Citizen science, Palaeontology, Project design, General Neuroscience, General Biochemistry, Genetics and Molecular Biology, General Agricultural and Biological Sciences, SDG 16 - Peace, Justice and Strong Institutions

Citation

Eijkelboom, I, Schulp, A S, Amkreutz, L, Verheul, D, Verschoof-van der Vaart, W, Van der Vaart-Verschoof, S, Hogeweg, L, Brunink, D, Mol, D, Peeters, H & Wesselingh, F 2025, 'Making sense of fossils and artefacts: a review of best practices for the design of a successful workflow for machine learning-assisted citizen science projects', PeerJ, vol. 13, no. 2, e18927. https://doi.org/10.7717/peerj.18927