About the data analysis category

f2harrell · July 13, 2018, 1:09pm

Data analysis in the wide sense, including exploratory and formal analysis, description, inference, prediction, unsupervised learning, statistical validation, analysis strategies

deb_agbe · January 21, 2019, 1:10pm

Dear All,

I am not sure I am posting in the right place.

I am trying to predict resistance to treatment with a Lasso-Cox model trained on 1435 patients followed for 10 years, with 181 events and 33 variables (both continuous and categorical, for a total of approximately 50 covariates in the model once the dummy variables are counted).

I have a question about bootstrap optimism-correction for calibration with 1000 bootstrap samples for the developed Lasso-Cox model.

My Lasso-Cox model has an apparent C-index of 0.66 and an apparent calibration slope of 1.73.

The optimism-corrected estimates are:

C-index_corr = 0.66 - 0.15 = 0.51

Calibration slope_corr = 1.73 - 1.77 = -0.04
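
For readers less familiar with the procedure, here is a minimal sketch of how such optimism-corrected estimates are typically obtained. It assumes the usual optimism bootstrap (refit on each bootstrap sample, then subtract the average drop in performance when that refit is evaluated on the original data); the post does not say which variant was used. To stay self-contained it uses a stand-in model, ordinary least squares scored with R^2, not a Lasso-Cox model. The names `optimism_corrected`, `fit_ols` and `r_squared` are hypothetical helpers written for this illustration.

```python
import numpy as np

def optimism_corrected(X, y, fit, metric, n_boot=200, seed=0):
    """Return (apparent, mean optimism, apparent - mean optimism)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    # Apparent performance: fit and evaluate on the same (original) data.
    apparent = metric(fit(X, y), X, y)
    optimism = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        model_b = fit(X[idx], y[idx])      # repeat the whole fitting procedure
        # Optimism = performance on the bootstrap sample
        #            minus performance of the same fit on the original data.
        optimism[b] = metric(model_b, X[idx], y[idx]) - metric(model_b, X, y)
    mean_optimism = optimism.mean()
    return apparent, mean_optimism, apparent - mean_optimism

# Stand-in model (assumption: OLS replaces Lasso-Cox for illustration only).
def fit_ols(X, y):
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

# Stand-in metric (assumption: R^2 replaces the C-index / calibration slope).
def r_squared(coef, X, y):
    A = np.column_stack([np.ones(len(y)), X])
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# Simulated data: 60 rows, 10 predictors, only the first one is informative.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(60, 10))
y = X[:, 0] + data_rng.normal(size=60)

apparent, optimism, corrected = optimism_corrected(X, y, fit_ols, r_squared)

# The figures quoted in the post follow the same final subtraction.
c_index_corr = 0.66 - 0.15
slope_corr = 1.73 - 1.77
```

With a penalized model, `fit` would need to repeat every data-driven step inside each bootstrap sample, including the choice of the penalty parameter, so that the optimism estimate reflects the whole modelling procedure.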

I understand that the estimate of the calibration slope has such large optimism because Lasso-Cox is a biased estimator. How would you interpret this result? Is it sensible to assess calibration in this case? Would you recalibrate the model? Is optimism correction valid for machine learning methods?

Thanks,
Deborah Agbedjro