MITACS Seminars on Statistical Methods for Complex Surveys

Parametric fractional imputation for missing data analysis

Jaekwang Kim, Iowa State University

Abstract: Parameter estimation with missing data is a frequently encountered problem in statistics with wide applications. Under a parametric model for the missing data, the EM algorithm is a popular tool for finding the maximum likelihood estimates (MLE) of the model parameters. Imputation, when carefully done, can facilitate parameter estimation by simply applying the complete-sample estimators to the imputed dataset. The basic idea is to generate the imputed values from the conditional distribution of the missing data given the observed data. Rubin's multiple imputation is a Bayesian approach to generating the imputed values from this conditional distribution. In this talk, parametric fractional imputation is proposed as a parametric approach to generating imputed values. Using fractional weights, the E-step of the EM algorithm can be approximated by the weighted mean of the imputed-data likelihood, where the fractional weights are computed from the current values of the parameter estimates. Some computational advantages over existing methods can be achieved by using the idea of importance sampling in the Monte Carlo approximation of the conditional expectation. The resulting estimator of the specified parameters can be identical to the MLE under missing data if the fractional weights are adjusted using a calibration step. The proposed imputation method provides very efficient estimates for the parameters specified in the model and, at the same time, provides reasonable estimates for parameters that were not considered in the imputation model, such as domain means. Thus, the proposed imputation method is a very useful tool for general-purpose data analysis. Variance estimation is also covered, and results from a limited simulation study are presented.
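
The fractional-weight E-step described in the abstract can be illustrated on a toy problem. The following is a minimal Python sketch, assuming a univariate normal model with some responses missing completely at random; it is not the speaker's implementation. The function names `normal_pdf` and `pfi_normal`, the choice of proposal distribution (the normal density at the complete-case estimates), and the defaults `m`, `n_iter`, and `seed` are all illustrative assumptions. Imputed values are drawn once from the proposal, and only the fractional weights are updated at each iteration, in the spirit of importance sampling. The calibration step and variance estimation mentioned in the abstract are not shown.

```python
import math
import random


def normal_pdf(y, mu, sigma2):
    """Density of N(mu, sigma2) evaluated at y."""
    return math.exp(-(y - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def pfi_normal(y, m=200, n_iter=50, seed=0):
    """Toy fractional-imputation EM for a normal mean and variance.

    y is a list of floats with None marking a missing value.
    Returns (mu, sigma2, weights); weights holds the fractional weights
    from the final E-step, one list of length m per missing unit.
    """
    rng = random.Random(seed)
    obs = [v for v in y if v is not None]
    n, n_mis = len(y), len(y) - len(obs)

    # Initial estimates from the complete cases; these define the proposal h.
    mu0 = sum(obs) / len(obs)
    s0 = sum((v - mu0) ** 2 for v in obs) / len(obs)

    # Draw m imputed values per missing unit ONCE from the proposal
    # and store the proposal density at each draw.
    imputed = [[rng.gauss(mu0, math.sqrt(s0)) for _ in range(m)] for _ in range(n_mis)]
    h = [[normal_pdf(v, mu0, s0) for v in row] for row in imputed]

    mu, s2 = mu0, s0
    weights = []
    for _ in range(n_iter):
        # E-step: fractional weights proportional to f(y*; current theta) / h(y*),
        # normalized to sum to one within each missing unit.
        weights = []
        for row, h_row in zip(imputed, h):
            raw = [normal_pdf(v, mu, s2) / d for v, d in zip(row, h_row)]
            total = sum(raw)
            weights.append([r / total for r in raw])

        # M-step: complete-sample estimators applied to the weighted imputed data.
        sum_y = sum(obs)
        for w_row, row in zip(weights, imputed):
            sum_y += sum(w * v for w, v in zip(w_row, row))
        mu_new = sum_y / n

        sum_sq = sum((v - mu_new) ** 2 for v in obs)
        for w_row, row in zip(weights, imputed):
            sum_sq += sum(w * (v - mu_new) ** 2 for w, v in zip(w_row, row))
        mu, s2 = mu_new, sum_sq / n

    return mu, s2, weights


if __name__ == "__main__":
    gen = random.Random(1)
    data = [gen.gauss(2.0, 1.0) for _ in range(100)]
    data = [None if i % 3 == 0 else v for i, v in enumerate(data)]
    mu_hat, s2_hat, _ = pfi_normal(data)
    print(mu_hat, s2_hat)
```

In this toy setting the imputed values carry no extra information, so the estimates should stay close to the complete-case mean and variance, up to Monte Carlo error. The sketch is meant only to show the mechanics of the weight update and the weighted M-step.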