Combining multiple imputation and a hierarchical Bayesian model in analysis of complex survey data

a_netti · November 30, 2021, 2:19pm

Hi everyone!

I’m working on a secondary analysis of a nationally representative population sample, where the sampling design includes strata (geographical areas of the country) and clusters (population centers within the strata). Because the missing data are differential (the attributes of interest to us also affect the probability of participation and loss to follow-up), and because informative administrative data are available to use as auxiliary variables, our team has decided to use multiple imputation to account for the missing data.

I’d like to tap into the collective wisdom gathered in this forum with two questions about how to appropriately combine the methods mentioned in the topic title:

1. Using a hierarchical Bayesian model for the inferences would be most appealing, but I’m uncertain whether simply including the strata and clusters as levels of the hierarchical model is as appropriate as using Thomas Lumley’s survey package in R is for frequentist analyses. There also seems to be a lack of literature regarding Bayesian modeling of complex survey data.
2. Our team’s statistician feels that combining MI with a Bayesian substantive analysis isn’t necessarily appropriate (of course, a fully Bayesian model would be more elegant, but flexibly using auxiliary variables seems more doable in MI). However, I understand that the MI+Bayes approach is mentioned in Andrew Gelman’s BDA3 book as a valid option, and it’s also made quite straightforward by Paul Bürkner’s brms R package, which accepts output from Stef van Buuren’s mice R package out of the box. Combining multiple imputations via Bayesian methods also seems to avoid the problems that arise when using Rubin’s rules with statistics that aren’t normally distributed. Would combining MI with Bayesian inference create a Frankenstein’s monster of incompatible methodologies, or would it be appropriate?
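
To make the pooling idea in the second question concrete, here is a minimal Python sketch of what I understand the Bayesian combination to be: fit the same model to each imputed dataset, then stack the posterior draws into one sample and summarize that. Everything in it is an assumption for illustration: the draws are synthetic (a skewed, odds-ratio-like quantity), the function names `pool_posterior_draws` and `rubin_pool` are my own, and no real model is being fit. Rubin's rules are included only for comparison.

```python
import numpy as np


def pool_posterior_draws(draws_by_imputation):
    """Stack posterior draws from M per-imputation fits into one sample.

    Using the same number of draws from every fit gives each imputed
    dataset equal weight in the pooled posterior.
    """
    lengths = {len(d) for d in draws_by_imputation}
    if len(lengths) != 1:
        raise ValueError("use the same number of draws per imputed dataset")
    return np.concatenate([np.asarray(d, dtype=float) for d in draws_by_imputation])


def rubin_pool(estimates, variances):
    """Rubin's rules: pooled point estimate and total variance."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    q_bar = estimates.mean()            # pooled point estimate
    within = variances.mean()           # average within-imputation variance
    between = estimates.var(ddof=1)     # between-imputation variance
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Synthetic stand-in for M = 5 fits: log-normal draws centred on slightly
# different values, mimicking between-imputation variability (hypothetical).
rng = np.random.default_rng(0)
centers = [0.8, 1.0, 1.3, 0.9, 1.1]
draws = [np.exp(rng.normal(np.log(c), 0.3, size=2000)) for c in centers]

# Bayesian-style pooling: quantiles of the stacked draws, with no
# normality assumption about the pooled quantity.
pooled = pool_posterior_draws(draws)
interval = np.quantile(pooled, [0.025, 0.5, 0.975])

# Rubin's rules on per-imputation means and variances, for comparison.
q_bar, total_var = rubin_pool([d.mean() for d in draws],
                              [d.var(ddof=1) for d in draws])

print("pooled 2.5%, 50%, 97.5% quantiles:", interval)
print("Rubin's rules estimate and SE:", q_bar, np.sqrt(total_var))
```

With equal numbers of draws per imputation, the mean of the stacked sample matches the Rubin's rules point estimate; the difference lies in the interval, which is read off the stacked draws rather than built from a normal approximation.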

Thank you!

Andreas