Date: 2022-11-14

Yajuan Si

Research Associate Professor, Institute for Social Research

University of Michigan

Description: Multilevel regression and poststratification (MRP) has become a popular approach for adjusting for selection bias in subgroup estimation, with widespread applications from the social sciences to public health. We examine the finite-population inferential validity of MRP in connection with poststratification and model specification. The success of MRP depends heavily on the availability of auxiliary information strongly related to the outcome. To improve the fit of the outcome model, we recommend modeling the inclusion mechanisms conditional on auxiliary variables and adding flexible functions of the estimated inclusion probabilities as predictors in the mean structure. We present a framework for statistical data integration and robust inference with probability and nonprobability surveys, providing solutions to various challenges in practical applications. Our simulation studies indicate the statistical validity of MRP, with a tradeoff between bias and variance; the improvement over alternative methods is mainly in subgroup estimates with small sample sizes. Our development is motivated by the Adolescent Brain Cognitive Development (ABCD) Study, which has collected children's information across 21 U.S. geographic locations for national representation but, as a nonprobability sample, is subject to selection bias. We apply the methods for population inference to evaluate the cognition measure of diverse groups of children in the ABCD Study and demonstrate that the use of auxiliary variables affects the inferential findings.
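
For readers new to MRP, the poststratification step described above can be illustrated with a minimal Python sketch. This is not the speaker's implementation: the cell counts, sample sizes, and cell means are made-up numbers, and the `partially_pool` helper with its `prior_strength` parameter is a hypothetical, crude stand-in for the shrinkage that a fitted multilevel model would provide.

```python
import numpy as np

# Hypothetical poststratification cells (for example, age group by sex).
# All numbers below are invented for illustration only.
pop_counts = np.array([4000, 3000, 2000, 1000])   # N_j: population size per cell
sample_sizes = np.array([10, 40, 100, 50])        # n_j: sample size per cell
cell_means = np.array([0.30, 0.50, 0.60, 0.80])   # observed outcome mean per cell


def poststratify(cell_estimates, pop_counts):
    """Weight cell-level estimates by known population cell sizes."""
    return float(np.sum(pop_counts * cell_estimates) / np.sum(pop_counts))


def partially_pool(cell_means, sample_sizes, prior_strength=20.0):
    """Shrink each cell mean toward the sample grand mean.

    Assumption: this simple precision-style weight n_j / (n_j + prior_strength)
    only mimics multilevel shrinkage; small cells are pulled in more strongly.
    """
    grand_mean = np.sum(sample_sizes * cell_means) / np.sum(sample_sizes)
    weight = sample_sizes / (sample_sizes + prior_strength)
    return weight * cell_means + (1.0 - weight) * grand_mean


# Unadjusted sample mean: over-represents the cells that were sampled heavily.
raw_sample_mean = float(np.sum(sample_sizes * cell_means) / np.sum(sample_sizes))

# Poststratified estimates, using raw and partially pooled cell means.
ps_estimate = poststratify(cell_means, pop_counts)
pooled_means = partially_pool(cell_means, sample_sizes)
mrp_like_estimate = poststratify(pooled_means, pop_counts)

print(raw_sample_mean, ps_estimate, mrp_like_estimate)
```

In this toy setup the sample over-represents cells with high outcome values, so reweighting by the population counts moves the estimate away from the raw sample mean. Pooling trades some bias for lower variance in the sparsely sampled cells, which is the tradeoff the abstract refers to.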

Bio: Yajuan Si is a Research Associate Professor in the Institute for Social Research at the University of Michigan. Dr. Si’s research lies in cutting-edge methodology development in the areas of Bayesian statistics, linking design- and model-based approaches for survey inference, missing data analysis, confidentiality protection involving the creation and analysis of synthetic datasets, and causal inference with observational data.