Recursive partitioning for missing data imputation in the presence of interaction effects
Publication date
2014-01-01
Document Type
Article
Abstract
Standard implementations of multiple imputation do not automatically incorporate nonlinear relations such as interaction effects, which leads to biased parameter estimates when interactions are present in a dataset. To provide an imputation method that preserves interactions in the data automatically, the use of recursive partitioning as an imputation method is examined. Three recursive partitioning techniques are implemented in the multiple imputation by chained equations (MICE) framework. Simulated data are used to investigate whether recursive partitioning creates appropriate variability between imputations and yields unbiased parameter estimates with appropriate confidence intervals. It is concluded that, when interaction effects are present in a dataset, substantial gains are possible by using recursive partitioning for imputation compared to standard applications. In addition, it is shown that the potential of recursive partitioning imputation approaches depends on the relevance of a possible interaction effect, the correlation structure of the data, and the type of possible interaction effect present in the data.
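To make the idea of tree-based imputation concrete, the following is a minimal sketch of one conditional imputation step, not the authors' implementation. It assumes a single incomplete numeric variable, fully observed predictors, and at least one observed target value. It fits a small regression tree to the observed cases, finds the leaf each incomplete case falls into, and draws the imputed value at random from the observed values (donors) in that leaf. The names `fit_tree`, `find_donors`, and `impute_cart`, the `min_leaf` setting, and the toy data are all illustrative assumptions. In a chained-equations run, a step like this would be cycled over every incomplete variable and repeated to produce multiple imputed datasets.

```python
import random

def _sse(values):
    """Sum of squared deviations from the mean (node impurity)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def fit_tree(rows, targets, min_leaf=5):
    """Grow a regression tree; each leaf keeps its observed targets as donors."""
    best = None  # (impurity after split, feature index, threshold)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            impurity = _sse(left) + _sse(right)
            if best is None or impurity < best[0]:
                best = (impurity, j, t)
    if best is None:  # no admissible split: this node becomes a leaf
        return {"donors": list(targets)}
    _, j, t = best
    left_idx = [i for i, r in enumerate(rows) if r[j] <= t]
    right_idx = [i for i, r in enumerate(rows) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": fit_tree([rows[i] for i in left_idx],
                         [targets[i] for i in left_idx], min_leaf),
        "right": fit_tree([rows[i] for i in right_idx],
                          [targets[i] for i in right_idx], min_leaf),
    }

def find_donors(tree, row):
    """Walk down the tree and return the donor values of the matching leaf."""
    while "donors" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["donors"]

def impute_cart(rows, targets, rng, min_leaf=5):
    """Replace each None in targets by a random donor from the case's leaf."""
    observed = [i for i, y in enumerate(targets) if y is not None]
    tree = fit_tree([rows[i] for i in observed],
                    [targets[i] for i in observed], min_leaf)
    return [y if y is not None else rng.choice(find_donors(tree, r))
            for r, y in zip(rows, targets)]

# Hypothetical toy data: y depends only on the product x1 * x2 (pure interaction).
rows = [(i % 2, (i // 2) % 2) for i in range(200)]
truth = [10 * x1 * x2 for x1, x2 in rows]
targets = [None if i % 5 == 0 else y for i, y in enumerate(truth)]  # 20% missing
completed = impute_cart(rows, targets, random.Random(0))
```

On this noise-free toy data the successive splits on `x1` and `x2` yield leaves that coincide with the four predictor combinations, so the donor draws reproduce the product term without the interaction being specified in advance. With noisy data, the random draw from the leaf is what would introduce variability between imputations.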
Keywords
CART, Classification and regression trees, Interaction problem, MICE, Nonlinear relations, Random forests, Computational Mathematics, Computational Theory and Mathematics, Statistics and Probability, Applied Mathematics
Citation
Doove, L L, Van Buuren, S & Dusseldorp, E 2014, 'Recursive partitioning for missing data imputation in the presence of interaction effects', Computational Statistics and Data Analysis, vol. 72, pp. 92-104. https://doi.org/10.1016/j.csda.2013.10.025