Dealing with missing data using the Heckman selection model: methods primer for epidemiologists
Publication date
2023-02-01
Document Type
Article
License
taverne
Abstract
Missing data are a common problem in epidemiologic studies and are often addressed by omitting incomplete records or adopting multiple imputation. Although these methods can produce unbiased estimates of study associations, their validity becomes problematic when data are missing not at random (MNAR) and the missing data mechanism is nonignorable. This situation typically arises when the presence of missing values depends on characteristics of the measurement or recording process, which is common in surveys and databases with electronic healthcare records. In this article, we discuss the relevance and implementation of Heckman selection models to impute variables that are missing not at random.
Keywords
Heckman selection model, exclusion restriction variables, selection bias, missing data, causal inference, real world data, Taverne, Epidemiology
Citation
Muñoz, J, Hufstedler, H, Gustafson, P, Bärnighausen, T, De Jong, V M T & Debray, T P A 2023, 'Dealing with missing data using the Heckman selection model: methods primer for epidemiologists', International Journal of Epidemiology, vol. 52, no. 1, pp. 5-13. https://doi.org/10.1093/ije/dyac237