On the advantage of using dedicated data mining techniques to predict colorectal cancer
Publication date
2015
Editors
Holmes, John H.
Bellazzi, Riccardo
Sacchi, Lucia
Peek, Niels
Document Type
Part of book
License
taverne
Abstract
Electronic Medical Records (EMRs) provide a wealth of data that can be used to generate predictive models for diseases. A number of studies have used EMRs to generate such models for specific diseases, but most of them rely on techniques traditionally used in the medical domain, such as logistic regression. This paper studies the benefit of using advanced data mining techniques to predict Colorectal Cancer (CRC). CRC is the second most common cancer in the EU and is known to be a disease with highly nonspecific predictors, which makes it difficult to generate good predictive models. In addition, EMR data poses challenges of its own, including its sparsity, differences in how physicians code the data, its temporal nature, and class imbalance. Results show that state-of-the-art data mining techniques, including temporal data mining, are able to generate better predictive models than those currently available in the literature.
Keywords
Colorectal cancer, Data mining, Machine learning, Taverne, General Computer Science, Theoretical Computer Science
Citation
Kop, R, Hoogendoorn, M, Moons, L M G, Numans, M E & ten Teije, A 2015, On the advantage of using dedicated data mining techniques to predict colorectal cancer. in J H Holmes, R Bellazzi, L Sacchi & N Peek (eds), Artificial Intelligence in Medicine: 15th Conference on Artificial Intelligence in Medicine, AIME 2015, Pavia, Italy, June 17-20, 2015. Proceedings. Lecture Notes in Computer Science. Lecture Notes in Artificial Intelligence, vol. 9105, Springer-Verlag, pp. 133-142, 15th Conference on Artificial Intelligence in Medicine, AIME 2015, Pavia, Italy, 17/06/15. https://doi.org/10.1007/978-3-319-19551-3_16