Optimism-correction for Lasso-Cox

deb_agbe · January 30, 2019, 11:48am

Dear All,

I am trying to predict resistance to treatment with a Lasso-Cox model trained on 1435 patients followed for 10 years, with 181 events and 33 variables (both continuous and categorical, for a total of approximately 50 covariates in the model if we count the dummies).

I have a question about bootstrap optimism-correction for calibration with 1000 bootstrap samples for the developed Lasso-Cox model.

My Lasso Cox has an apparent C-index of 0.66 and an apparent calibration slope of 1.73.
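For readers unfamiliar with the C-index quoted above, here is a minimal sketch of Harrell's concordance index for right-censored data. The function name `harrell_c` and the toy numbers are made up for illustration, and pairs with tied times are simply skipped, which a production implementation may handle differently.

```python
def harrell_c(times, events, risk):
    """Concordance index for right-censored data.

    A pair (i, j) is usable when subject i had an event and a shorter
    follow-up time than subject j. Pairs with tied times are skipped
    (a simplifying assumption of this sketch).
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier time must be an observed event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0   # higher predicted risk failed first
                elif risk[i] == risk[j]:
                    concordant += 0.5   # tied predictions count as half
    return concordant / usable


# Hypothetical toy data: follow-up times, event indicators, predicted risks.
times = [5, 8, 3, 10, 6]
events = [1, 0, 1, 0, 1]
risk = [0.9, 0.3, 0.8, 0.1, 0.4]

c_index = harrell_c(times, events, risk)
print(round(c_index, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so an apparent 0.66 indicates modest discrimination.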

The optimism-corrected estimates are:

C-index_corr = 0.66 - 0.15 = 0.51

Calibration slope_corr = 1.73 - 1.77 = -0.04
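The subtraction above follows the usual optimism-bootstrap recipe: refit the model in each bootstrap sample, take the difference between its performance on that sample and on the original data, average those differences, and subtract the average from the apparent performance. The sketch below illustrates only the mechanics. It assumes a toy one-predictor least-squares model and R² as the metric as stand-ins for the Lasso-Cox model and the C-index or calibration slope; `optimism_corrected`, `fit_line` and `r_squared` are hypothetical names.

```python
import random
from statistics import mean


def optimism_corrected(data, fit, metric, n_boot=200, seed=0):
    """Return (apparent, optimism, corrected) for a fit/metric pair."""
    rng = random.Random(seed)
    n = len(data)
    apparent = metric(fit(data), data)
    gaps = []
    for _ in range(n_boot):
        boot = [data[rng.randrange(n)] for _ in range(n)]
        model = fit(boot)  # redo every data-driven step (tuning, selection) here
        gaps.append(metric(model, boot) - metric(model, data))
    optimism = mean(gaps)
    return apparent, optimism, apparent - optimism


# Toy stand-ins (assumptions, not the poster's model).
def fit_line(rows):
    mx = mean(x for x, _ in rows)
    my = mean(y for _, y in rows)
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    slope = sum((x - mx) * (y - my) for x, y in rows) / sxx
    return my - slope * mx, slope


def r_squared(model, rows):
    a, b = model
    my = mean(y for _, y in rows)
    sse = sum((y - a - b * x) ** 2 for x, y in rows)
    sst = sum((y - my) ** 2 for _, y in rows)
    return 1 - sse / sst


rng = random.Random(1)
toy = [(x, 0.5 * x + rng.gauss(0, 1)) for x in (rng.gauss(0, 1) for _ in range(60))]
apparent, optimism, corrected = optimism_corrected(toy, fit_line, r_squared)
print(apparent, optimism, corrected)
```

For a penalized model, `fit` would need to include the selection of the penalty parameter, otherwise the optimism is likely to be misestimated.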

I understand that the estimate of the calibration slope has such a large optimism because Lasso-Cox is a biased estimator. So, how would you interpret this result? Is it sensible to assess calibration in this case? Would you re-calibrate the model?

Furthermore, a larger optimism-correction was observed in a random survival forest for both performance measures using the same data. This led me to this additional question: is optimism-correction valid for machine learning methods?

Could it be that I am fitting machine learning methods to a sample that is small relative to the number of features, which produces unstable models (see, for example, van der Ploeg, T., Austin, P.C. & Steyerberg, E.W. BMC Med Res Methodol (2014) 14: 137)?

Thanks,
Deborah Agbedjro

f2harrell · January 30, 2019, 2:26pm

Deborah,

Lasso and elastic net are penalized maximum likelihood methods and regression models, not machine learning. More about that picky point here. This relates directly to the relevance of the van der Ploeg et al. paper here. That paper deals with unpenalized regression and non-regression machine learning algorithms.

A slightly extreme but correct position is that penalized regression is used so that you don’t have to worry about overfitting. If you penalize correctly, the model is underfitted by the amount you expect it to be overfitted. So the apparent calibration slope that is > 1 is not of much interest. The cross-validated slope (say 100 repeats of 10-fold cross-validation with the penalty parameter recomputed and coefficients estimated 1000 times) is what is important, and you expect that to come out to nearly 1.0. That being said, I have less experience with extreme cases where the predictive information in the data may be close to zero. But before going further I recommend checking that the bootstrap worked in your case, by running the above mentioned cross-validation procedure. See also this.
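The cross-validation procedure described above can be sketched as follows. This is an illustration of the loop structure only. It assumes a one-predictor ridge-type linear model as a stand-in for the penalized Cox model, a small hypothetical penalty grid, and 20 repeats instead of 100 to keep it fast. All function names are made up. The point is that the penalty is re-selected inside every training fold, so 100 repeats of 10 folds would give 1000 complete model fits.

```python
import random
from statistics import mean


def kfold(n, k, rng):
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def ridge_fit(rows, lam):
    mx = mean(x for x, _ in rows)
    my = mean(y for _, y in rows)
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    b = sxy / (sxx + lam)  # penalty shrinks the slope toward zero
    return my - b * mx, b


def sse(model, rows):
    a, b = model
    return sum((y - a - b * x) ** 2 for x, y in rows)


def tune_and_fit(rows, grid, rng, k=5):
    """Pick the penalty by inner CV on the training rows only, then refit."""
    folds = kfold(len(rows), k, rng)

    def cv_error(lam):
        total = 0.0
        for fold in folds:
            held = set(fold)
            train = [r for i, r in enumerate(rows) if i not in held]
            total += sse(ridge_fit(train, lam), [rows[i] for i in fold])
        return total

    return ridge_fit(rows, min(grid, key=cv_error))


def slope(pred, obs):
    mp, mo = mean(pred), mean(obs)
    sxy = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    return sxy / sum((p - mp) ** 2 for p in pred)


def repeated_cv_slope(rows, grid, repeats=100, k=10, seed=0):
    rng = random.Random(seed)
    slopes, n_fits = [], 0
    for _ in range(repeats):
        pred = [0.0] * len(rows)
        for fold in kfold(len(rows), k, rng):
            held = set(fold)
            train = [r for i, r in enumerate(rows) if i not in held]
            a, b = tune_and_fit(train, grid, rng)  # tuning repeated per fit
            n_fits += 1
            for i in fold:
                pred[i] = a + b * rows[i][0]  # out-of-fold prediction
        slopes.append(slope(pred, [y for _, y in rows]))
    return mean(slopes), n_fits


rng = random.Random(2)
toy = [(x, 2.0 * x + rng.gauss(0, 1)) for x in (rng.gauss(0, 1) for _ in range(60))]
cv_slope, n_fits = repeated_cv_slope(toy, grid=[0.0, 1.0, 10.0], repeats=20)
print(cv_slope, n_fits)
```

Here the calibration slope is computed per repeat from the pooled out-of-fold predictions and then averaged over repeats, which is one of several reasonable ways to summarize it.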

In parallel with all this I recommend testing whether there is a predictive signal present by computing the first 5 principal components of the predictors, putting them in an unpenalized model, and getting a chunk test with 5 d.f.
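One way to write the 5 d.f. chunk test, assuming a likelihood-ratio version (a Wald version is also possible) and writing $\ell$ for the Cox log partial likelihood of the model containing only the five principal components:

```latex
\[
\mathrm{LR} \;=\; 2\left[\,\ell(\hat\beta_1,\dots,\hat\beta_5) - \ell(0,\dots,0)\,\right],
\qquad
\mathrm{LR} \;\overset{H_0}{\approx}\; \chi^2_{5}
\]
```

A small statistic relative to the $\chi^2_5$ reference distribution would suggest little or no predictive signal in the leading components.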

Another issue is that I expect lasso to be unreliable in terms of selecting the “right” predictors, and a more fruitful approach will be data reduction followed by either penalized or unpenalized Cox regression (quadratic penalty/ridge regression). This will be more stable and interpretable and will operate within the confines of your data richness. With 181 events you might reduce the problem using data reduction (unsupervised learning) down to 10 dimensions/summary scores.

deb_agbe · February 5, 2019, 4:27pm

Thank you very much Frank,

Your reply is very helpful.

I have also done optimism correction through repeated cross-validation, as you suggested; however, the correction was much smaller. In the extreme case where you correct through both bootstrap and repeated cross-validation and obtain two corrected estimates for the C-index of 0.65 and 0.75 respectively, which estimate would you trust?
What do you think about nested cross-validation as an internal validation method? Would this method suit penalized regression and random forests best?

Furthermore, I realized that when the sample size is smaller, optimism-correction through bootstrap estimates a smaller optimism than repeated cross-validation, as I would expect. Would it be correct to prefer using bootstrap optimism correction when your sample size is small and to say that estimates of optimism should be equivalent if the sample size is large?

Finally, do you have any reference for the method you proposed which combines principal components analysis and ridge regression?

Thank you very much for your patience and advice!
Kind regards,
Deborah

f2harrell · February 5, 2019, 5:23pm

Hi Deborah - I have a double teaching load this month and unfortunately don’t have time to delve into those excellent questions. I would just add briefly that I’d trust 100 repeats of 10-fold CV the most; make sure that all supervised learning steps are repeated for each of the 1000 model fits. But the bootstrap is better for exposing the arbitrariness of any feature selection that is also being used.

deb_agbe · February 25, 2019, 2:19am

Dear Frank,

Thanks for finding the time to reply to my post despite your double teaching load.

When you can, I would like to know your opinion about the following:

What do you think about nested cross-validation as an internal validation method? Would this method suit penalized regression and random forests better than repeated cross validation optimism correction? I suppose they give similar results, but please correct me if I’m wrong.
How would you interpret a negative optimism-corrected calibration slope? For example, for a logistic lasso model, the corrected AUC is 0.68 and the corrected calibration slope is -1.79. Would you correct the lasso coefficients of the selected variables? Is it right to say that before re-calibration the model returns inverse predictions?
Can the apparent AUC change slightly after recalibration? Looking at the maths, it should not be the case, but with my model this is happening: it changes from 0.79 to 0.78. Also, does it make sense to report the AUC confidence intervals when the model is regularised regression?
Finally, do you have any reference for the method you proposed which combines principal components analysis and ridge regression?
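On the maths behind the AUC question: for a fixed set of predictions the AUC depends only on their ranks, so a linear recalibration a + b·lp with b > 0 leaves it unchanged, while a genuinely negative b would reverse the ranking and turn the AUC into 1 - AUC. A minimal sketch with made-up numbers (the `auc` helper and the toy linear predictors are illustrative assumptions):

```python
def auc(scores, labels):
    """Probability that a random event outranks a random non-event (ties = 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical linear predictors and binary outcomes.
lp = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5, -0.7, 0.6]
y = [0, 0, 1, 0, 1, 1, 0, 0]

base = auc(lp, y)
recalibrated = auc([0.2 + 0.5 * v for v in lp], y)   # positive slope
flipped = auc([0.2 - 1.79 * v for v in lp], y)       # negative slope

print(base, recalibrated, flipped)
```

Under this view, an AUC that moves after recalibration suggests that something other than a slope-and-intercept adjustment changed, for example ties introduced by rounding or refitted coefficients.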

I thank you very much for your valuable support and for your time.

Kind regards,
Deborah

f2harrell · February 25, 2019, 3:35am

I’m not sure what you mean by “repeated cross validation optimism correction”. I’m only aware of one way of doing C-V. Concerning the negative slope, which validation method yielded that? I wouldn’t worry about 0.79 to 0.78.

A reason for doing penalized regression is so that you don’t have to validate. But this depends on a wise choice of the penalty parameter(s). How did you choose the lasso penalty?

Sorry I don’t have time for more.

deb_agbe · February 26, 2019, 8:25pm

Dear Frank,

Thanks for your reply.

By repeated-cross-validation optimism correction I mean, for example, 100 repeats of 10-fold CV in order to estimate the optimism. Nested cross-validation, by contrast, does not estimate the optimism but only the generalization error, by averaging the test errors from the outer loop (the inner loop is used for tuning and model selection; see M. Stone. Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 36(2):111–147, 1974).

Concerning the negative slope, I obtained it by estimating the optimism through 100-time repeated 10-fold cross validation. Would you interpret it as inverse predictions? Would you recalibrate the model?

If I have understood correctly, you would not correct the calibration slope for optimism for penalised regression. Is that correct? The penalty parameter for Lasso was chosen with the 1SE rule in order to reduce the number of noise variables selected. However, a negative corrected calibration slope also came out from the model returned by the best lambda.
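For reference, the 1SE rule mentioned above is usually stated as: take the largest penalty whose mean cross-validated error is within one standard error of the minimum. A minimal sketch, where the function name and the penalty path, errors and standard errors are all made-up illustrative values:

```python
def lambda_min_and_1se(lambdas, cv_mean, cv_se):
    """Return (best lambda, 1SE lambda) from a cross-validated penalty path."""
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    # Largest penalty (most shrinkage) still within one SE of the minimum error.
    lam_1se = max(l for l, m in zip(lambdas, cv_mean) if m <= threshold)
    return lambdas[i_min], lam_1se


# Hypothetical cross-validation results.
lambdas = [0.001, 0.01, 0.05, 0.1, 0.5]
cv_mean = [1.30, 1.22, 1.20, 1.24, 1.45]
cv_se = [0.05, 0.05, 0.05, 0.05, 0.06]

lam_best, lam_1se = lambda_min_and_1se(lambdas, cv_mean, cv_se)
print(lam_best, lam_1se)
```

Because the 1SE penalty is never smaller than the best one, it shrinks coefficients more, which is consistent with an apparent calibration slope above 1.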

Thank you for your help

Kind regards,
Deborah

f2harrell · February 27, 2019, 12:27pm

I don’t have experience with much of that. On whether to check calibration of lasso, you shouldn’t have to if shrinkage was well selected, but what “well” means is too fuzzy right now.