Confidence intervals for bootstrap-validated bias-corrected performance estimates

Hello,

Using the rms package to build and internally validate a model (by bootstrapping), is there a way to get confidence intervals around the bias-corrected performance estimates?

Thanks!

Hello Pavel, I had the same question quite a while back.

I've edited an earlier version of this answer to use dplyr, which makes it easier to follow.

The code requires capturing the printed output of each bootstrap repetition from rms's validate function (hence pr=TRUE):

library(rms)
library(dplyr)
library(tidyr)

v <- validate(f, B = 300)                              # point estimates, from the rms package
out <- capture.output(validate(f, B = 300, pr = TRUE)) # printed output of each repetition

validate_CI <- data.frame(output = trimws(out)) %>%
  separate(output, into = c("metric", "train", "test"),
           sep = "[[:blank:]]+", fill = "right", extra = "drop") %>%
  mutate(test = as.numeric(test)) %>%   # non-numeric rows become NA
  filter(!is.na(test)) %>%
  group_by(metric) %>%
  summarise(mn  = mean(test, na.rm = TRUE),
            lci = quantile(test, 0.025, na.rm = TRUE),
            uci = quantile(test, 0.975, na.rm = TRUE))
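
validate_CI then holds, for each printed metric, the mean and the 2.5th/97.5th percentiles of the "test" values across the bootstrap repetitions.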


Thank you so much for your help with this.

I think this only gives a 95% CI for the “training” data.

Looking through the captured output, I noticed only training and test performance characteristics, but what I really need in order to get Frank’s intended optimism-corrected measures is the calculation training - optimism = optimism-corrected.

The overfitting-corrected index is in the index.corrected vector in the result from validate. I don’t follow John’s code because I expected to see, say, 500 repetitions forming an outer bootstrap loop to compute nonparametric percentile confidence intervals. I would rather use the outer bootstrap to get standard errors and assume normality (rightly or wrongly), which reduces the requirement to about 100 outer resamples (validate does the inner resamples).
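
For example (assuming f is an rms fit created with x=TRUE, y=TRUE, which validate requires):

v <- validate(f, B = 300)
v[, 'index.corrected']   # overfitting-corrected estimate for each index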

Interestingly, the output captured with capture.output doesn’t seem to include the optimism or index.corrected vectors.

All of this can be done easily with basic R. My old-fashioned R style makes it hard for me to read tidyverse code.
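
A minimal base-R sketch of that outer loop, under the assumptions above (the data frame name d is a placeholder for the modeling data, and update() refits the original model call on each resample; this is one way to set it up, not a definitive implementation):

library(rms)

B.outer <- 100   # outer resamples: enough for SEs under a normality assumption
B.inner <- 300   # inner resamples handled by validate()

est <- validate(f, B = B.inner)[, 'index.corrected']   # point estimates

boots <- replicate(B.outer, {
  i  <- sample(nrow(d), replace = TRUE)   # outer bootstrap sample
  fi <- update(f, data = d[i, ])          # refit the same model on the resample
  validate(fi, B = B.inner)[, 'index.corrected']
})

se <- apply(boots, 1, sd)         # standard error across outer resamples
cbind(estimate = est,
      lower = est - 1.96 * se,    # normal-approximation 95% CI
      upper = est + 1.96 * se)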