Statistical inference based on randomly generated auxiliary variables
Publication date
2018-01
Document Type
Article
License
taverne
Abstract
In most real life studies, auxiliary variables are available and are employed to explain and understand missing data patterns and to evaluate and control causal relationships with variables of interest. Usually their availability is assumed to be a fact, even if the variables are measured without the objectives of the study in mind. As a result, inference with missing data and causal inference require some assumptions that cannot easily be validated or checked. In this paper, a framework is constructed in which auxiliary variables are treated as a selection, possibly random, from the universe of variables on a population. This framework provides conditions to make statistical inference beyond the traces of bias or effects found by the auxiliary variables themselves. The utility of the framework is demonstrated for the analysis and reduction of non‐response in surveys. However, the framework may be more generally used to understand the strength of association between variables. Important roles are played by the diversity and diffusion of the population of interest, features that are defined in the paper and the estimation of which is discussed.
Keywords
Causal inference, Independent variable, Missing data, Non-response, Surveys, Taverne
Citation
Schouten, J G 2018, 'Statistical inference based on randomly generated auxiliary variables', Journal of the Royal Statistical Society. Series B: Statistical Methodology, vol. 80, no. 1, pp. 33-56. https://doi.org/10.1111/rssb.12242