Evaluation of classification models for retrieving experimental sections from full-text publications

Lefebvre, Armel; Berendsen, Jorrit; Spruit, Marco

Files

fc69ac9f7fde176881a35430d04834554ac5.pdf (114.64 KB)

Publication date

2019

Authors

Lefebvre, A.

Berendsen, Jorrit

Spruit, M.R.

Document Type

Report

Collections

Utrecht University Repository

Abstract

In recent years, reporting scientific experiments has become a challenge for scientists working in data-intensive research fields. One difficulty is accurately reporting experimental work that relies on computational activities. In this report, we conduct an exploratory computational experiment: we evaluate the performance of a set of classification models that extract experimental paragraphs from full-text scientific publications in an unsupervised fashion. The results show that the best-performing classification model (Multinomial Naive Bayes), trained on 30 publications in the proteomics domain, achieves a recall of 87.12% and an accuracy of 80.63%. Successful unsupervised extraction of experimental paragraphs from reports can considerably reduce the noise present in full-text publications. This approach could help to automatically generate domain-specific vocabulary describing experimental designs and experimental processes. As such, this work contributes to the identification of NLP techniques that automate the extraction of domain-specific paragraphs related to experimental work.

Citation

Lefebvre, A, Berendsen, J & Spruit, M 2019, Evaluation of classification models for retrieving experimental sections from full-text publications. Technical Report Series, no. UU-CS-2019-002, UU BETA ICS Departement Informatica, Utrecht.

URI

https://dspace.library.uu.nl/handle/1874/390074
