# Internal model validation (Bootstrapping) and Multiple Imputation

What is the recommended way to combine bootstrapping (i.e. to compute optimism in the R-squared value) with multiple imputation?

For example, is it acceptable to:
(1) Perform multiple imputation to generate x imputed datasets (e.g. x = 10)
(2) Fit a pre-specified regression model separately to each of the 10 imputed datasets and then run a separate bootstrap for each dataset
(3) Report the average and range of optimism values (e.g. for R-squared) from the 10 bootstraps.

Any guidance would be greatly appreciated!


You may have come across this (free online) book by Stef van Buuren: https://stefvanbuuren.name/fimd/. The book discusses combining imputation and bootstrapping, and it contains lots of good references.

Another paper I have in my reference manager but haven't read yet: https://onlinelibrary.wiley.com/doi/abs/10.1002/sim.7654.

From what I recall, the imputation procedure should be nested within the bootstrap procedure to account for uncertainty properly. Something like the following (arbitrary numbers):

1. Draw 500 bootstrap samples from the original dataset
2. Within each bootstrap sample, multiply impute 10 datasets and get a pooled estimate for the quantity of interest. (In all, you'd generate 5,000 imputed datasets.)
3. Use the distribution of 500 pooled estimates for inference.
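
The three steps above can be sketched in a few lines. This is a minimal, self-contained Python illustration rather than an R workflow, and several things in it are assumptions made only for the example: the quantity of interest is simply a mean, the imputation is a toy hot-deck draw from the observed values (a stand-in for a proper imputation model), the data are simulated, and all function names are hypothetical.

```python
import random
import statistics


def impute_once(sample, rng):
    """Toy stochastic imputation (assumption): replace each missing value
    (None) with a random draw from the observed values of the same sample."""
    observed = [v for v in sample if v is not None]
    return [v if v is not None else rng.choice(observed) for v in sample]


def pooled_estimate(sample, n_imputations, rng):
    """Impute the sample n_imputations times and average the estimates,
    i.e. the point-estimate part of Rubin's rules."""
    estimates = [statistics.fmean(impute_once(sample, rng))
                 for _ in range(n_imputations)]
    return statistics.fmean(estimates)


def bootstrap_then_impute(data, n_boot=500, n_imputations=10, seed=1):
    """Return one pooled estimate per bootstrap sample."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(n_boot):
        # Step 1: resample the incomplete data with replacement.
        boot = rng.choices(data, k=len(data))
        if all(v is None for v in boot):
            continue  # no observed values to impute from; skip this draw
        # Step 2: multiply impute within the bootstrap sample and pool.
        pooled.append(pooled_estimate(boot, n_imputations, rng))
    return pooled


def percentile_interval(values, level=0.95):
    """Step 3: simple percentile interval from the pooled estimates."""
    ordered = sorted(values)
    lo_idx = int((1 - level) / 2 * (len(ordered) - 1))
    hi_idx = int((1 + level) / 2 * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]


# Simulated data (assumption): 40 values, roughly 20% set to missing.
data_rng = random.Random(0)
data = [None if data_rng.random() < 0.2 else data_rng.gauss(10, 2)
        for _ in range(40)]

pooled = bootstrap_then_impute(data)   # 500 x 10 = 5,000 imputed datasets
print(len(pooled), percentile_interval(pooled))
```

For optimism correction, the quantity pooled in step 2 would be a performance index such as R-squared rather than a mean, and the toy imputation would be replaced by a real imputation model.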

Jonathan Bartlett and Rachael Hughes published a paper last year. Their simulations showed that "Imputation followed by bootstrapping generally does not result in valid variance estimates under uncongeniality or misspecification, whereas certain bootstrap followed by imputation methods do".
See https://journals.sagepub.com/doi/full/10.1177/0962280220932189


Hi,
Can somebody help me with how to interpret the internal validation results in rms?
For example,

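`validate(cox.tox.3, method = "boot", B = 500)`

| index | index.orig | training | test | optimism | index.corrected | n |
|-------|-----------:|---------:|-----:|---------:|----------------:|----:|
| Dxy   | 0.3221 | 0.3378 | 0.3151 | 0.0227 | 0.2994 | 500 |
| R2    | 0.2821 | 0.3020 | 0.2601 | 0.0419 | 0.2402 | 500 |
| Slope | 1.0000 | 1.0000 | 0.8747 | 0.1253 | 0.8747 | 500 |
| D     | 0.0370 | 0.0403 | 0.0336 | 0.0067 | 0.0303 | 500 |
| U     | -0.0006 | -0.0006 | 0.0016 | -0.0022 | 0.0016 | 500 |
| Q     | 0.0376 | 0.0409 | 0.0320 | 0.0089 | 0.0286 | 500 |
| g     | 0.8315 | 0.8728 | 0.7634 | 0.1094 | 0.7221 | 500 |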
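
As a starting point for reading this output, the columns are related by simple arithmetic: `optimism` is `training` minus `test`, and `index.corrected` is `index.orig` minus `optimism`. My understanding is that `training` is the average index when each bootstrap model is evaluated on its own bootstrap sample, `test` is the average when those models are evaluated on the original data, and `n` is the number of bootstrap repetitions. The short Python check below only reproduces that arithmetic from the printed values; the function and variable names are hypothetical.

```python
# (index.orig, training, test) copied from the validate() output above.
rows = {
    "Dxy":   (0.3221, 0.3378, 0.3151),
    "R2":    (0.2821, 0.3020, 0.2601),
    "Slope": (1.0000, 1.0000, 0.8747),
    "D":     (0.0370, 0.0403, 0.0336),
    "U":     (-0.0006, -0.0006, 0.0016),
    "Q":     (0.0376, 0.0409, 0.0320),
    "g":     (0.8315, 0.8728, 0.7634),
}


def optimism_and_corrected(index_orig, training, test):
    """Optimism is the average drop from training to test performance;
    the corrected index subtracts that drop from the apparent value."""
    optimism = training - test
    return optimism, index_orig - optimism


results = {name: optimism_and_corrected(*vals) for name, vals in rows.items()}
for name, (optimism, corrected) in results.items():
    print(f"{name:5s} optimism={optimism:.4f} corrected={corrected:.4f}")
```

The recomputed values agree with the printed columns, apart from the last digit for Q, which is presumably rounding in the displayed figures. Read this way, the apparent R2 of 0.2821 shrinks to about 0.24 after correction, and a corrected slope of 0.8747 suggests some overfitting (a slope of 1 would indicate none).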