A Bayesian Prevalence-Incidence Mixture Model for Screening Outcomes With Misclassification

Publication date

2026-04

Authors

Klausch, Thomas
Lissenberg-Witte, Birgit I (ORCID: 0000-0001-9448-1826)
Coupé, Veerle M H

Document Type

Article

License

CC BY

Abstract

Screening and surveillance programs for cancer, such as colorectal cancer (CRC), often yield electronic health records (EHR) of screening time, test results, and covariates. We consider EHR from CRC surveillance of individuals who have a high cancer risk due to their family history. These individuals, therefore, receive regular colonoscopies with the goal of finding and removing adenomas, precursor lesions to CRC. Our objective is to estimate time to adenoma incidence and explore associations with covariates. However, in doing so, several challenges of the CRC surveillance EHR have to be addressed. Importantly, the adenoma events are interval-censored, meaning the exact event times are unknown and only fall within intervals defined by colonoscopy visits. Furthermore, colonoscopies can miss adenomas due to human or technical error, leading to misclassification of individuals with adenomas as adenoma-free. Finally, the EHR data include individuals with adenomas at baseline, termed prevalent cases. This prevalence status may be unobserved if the baseline colonoscopy is missing or fails to detect existing adenomas. To address these challenges in the CRC EHR, and screening data in general, we develop a new prevalence-incidence mixture model (PIM) with a Bayesian estimation back-end through data augmentation and regularization priors. We show how to fit the model, estimate cumulative incidence functions, and evaluate model fit using information criteria as well as a non-parametric estimator. In extensive simulations, we show good performance of the model when informative priors on the test sensitivity are provided, which is usually possible. An implementation in the R package BayesPIM is provided.
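The three data challenges named in the abstract (interval censoring, imperfect test sensitivity, and latent prevalence at baseline) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the BayesPIM implementation: the function names, the exponential onset distribution, the fixed visit schedule, and all parameter values are hypothetical choices made only for illustration.

```python
import math
import random


def simulate_screening(n=2000, prevalence=0.2, sensitivity=0.8,
                       rate=0.1, visit_gap=3.0, max_visits=5, seed=1):
    """Simulate screening records (all settings are illustrative assumptions).

    Each record holds the latent truth ("prevalent", "onset") and the
    observed interval ("left", "right"): the last negative visit time and
    the first positive visit time (math.inf if never detected).
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        # Prevalent cases already carry the lesion at baseline (time 0);
        # all others receive an exponential onset time (an assumption).
        prevalent = rng.random() < prevalence
        onset = 0.0 if prevalent else rng.expovariate(rate)

        left, right = None, math.inf
        for k in range(max_visits):
            t = k * visit_gap
            # A lesion that is present is found only with probability
            # `sensitivity`; otherwise the visit is recorded as negative.
            if onset <= t and rng.random() < sensitivity:
                right = t
                break
            left = t
        records.append({"prevalent": prevalent, "onset": onset,
                        "left": left, "right": right})
    return records


def count_missed(records):
    """Count records whose true onset precedes a recorded negative visit."""
    return sum(1 for r in records
               if r["left"] is not None and r["onset"] <= r["left"])


if __name__ == "__main__":
    imperfect = simulate_screening(sensitivity=0.8)
    perfect = simulate_screening(sensitivity=1.0)
    print("misclassified with sensitivity 0.8:", count_missed(imperfect))
    print("misclassified with sensitivity 1.0:", count_missed(perfect))
```

With perfect sensitivity, every true onset lies strictly after the last negative visit, so the observed interval brackets the event. With imperfect sensitivity, some onsets fall before the recorded left endpoint, and some prevalent cases appear adenoma-free at baseline, which is the misclassification the abstract describes.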

Keywords

Adenoma/epidemiology, Bayes Theorem, Colonoscopy/statistics & numerical data, Colorectal Neoplasms/epidemiology, Computer Simulation, Early Detection of Cancer/statistics & numerical data, Electronic Health Records, Female, Humans, Incidence, Male, Mass Screening/statistics & numerical data, Models, Statistical, Prevalence, Journal Article

Citation

Klausch, T, Lissenberg-Witte, B I & Coupé, V M H 2026, 'A Bayesian Prevalence-Incidence Mixture Model for Screening Outcomes With Misclassification', Statistics in Medicine, vol. 45, no. 8-9, e70433. https://doi.org/10.1002/sim.70433