# Optimism-correction for Lasso-Cox

Dear All,

I am trying to predict resistance to treatment with a Lasso-Cox model trained on 1435 patients followed for 10 years, with 181 events and 33 variables (both continuous and categorical, for a total of approximately 50 covariates in the model if we count the dummies).

I have a question about bootstrap optimism-correction for calibration with 1000 bootstrap samples for the developed Lasso-Cox model.

My Lasso Cox has an apparent C-index of 0.66 and an apparent calibration slope of 1.73.

The optimism-corrected estimates are:

C-index_corr = 0.66 - 0.15 = 0.51

Calibration slope_corr = 1.73 - 1.77 = -0.04
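The two subtractions above follow the usual bootstrap optimism recipe: refit the model on each bootstrap sample, score it on that sample and on the original data, average the differences, and subtract that average from the apparent value. Below is a minimal sketch of that recipe using only the Python standard library. It is not the Lasso-Cox code used for the numbers in this post: the toy least-squares model, the R-squared metric, and the names `fit_line`, `r_squared`, `bootstrap_optimism` and `optimism_corrected` are all assumptions made for illustration.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Toy stand-in for the model: least-squares intercept and slope."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def r_squared(model, xs, ys):
    """Toy stand-in for the performance measure (C-index, slope, ...)."""
    intercept, slope = model
    my = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def bootstrap_optimism(xs, ys, fit, metric, n_boot=200, seed=1):
    """Mean of (performance on bootstrap sample - performance on original data)."""
    rng = random.Random(seed)
    n = len(xs)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        model = fit(bx, by)  # all modelling steps are repeated on each resample
        diffs.append(metric(model, bx, by) - metric(model, xs, ys))
    return mean(diffs)

def optimism_corrected(apparent, optimism):
    """Corrected estimate = apparent estimate - estimated optimism."""
    return apparent - optimism

# Simulated toy data (hypothetical, not the study data).
data_rng = random.Random(0)
xs = [data_rng.gauss(0, 1) for _ in range(60)]
ys = [0.3 * x + data_rng.gauss(0, 1) for x in xs]

apparent = r_squared(fit_line(xs, ys), xs, ys)
optimism = bootstrap_optimism(xs, ys, fit_line, r_squared)
corrected = optimism_corrected(apparent, optimism)

# The same final subtraction applied to the figures reported above.
c_index_corr = optimism_corrected(0.66, 0.15)
slope_corr = optimism_corrected(1.73, 1.77)
```

In a real analysis `fit` would have to include the penalty selection as well, so that each bootstrap sample repeats every data-driven step.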

I understand that the estimate of the calibration slope has such a large optimism because Lasso-Cox is a biased estimator. So, how would you interpret this result? Is it sensible to assess calibration in this case? Would you re-calibrate the model?

Furthermore, a larger optimism-correction was observed in a random survival forest for both performance measures using the same data. This led me to this additional question: is optimism-correction valid for machine learning methods?

Could it be that I am fitting machine learning methods to a sample that is small relative to the number of features, which produces unstable models (see, for example, van der Ploeg, T., Austin, P.C. & Steyerberg, E.W. BMC Med Res Methodol (2014) 14: 137)?

Thanks,
Deborah Agbedjro

Deborah,

Lasso and elastic net are penalized maximum likelihood methods and regression models, not machine learning. More about that picky point here. This bears directly on whether the van der Ploeg et al. paper is relevant here: that paper deals with unpenalized regression and non-regression machine learning algorithms.

A slightly extreme but correct position is that penalized regression is used so that you don't have to worry about overfitting. If you penalize correctly, the model is underfitted by the amount you expect it to be overfitted. So the apparent calibration slope that is > 1 is not of much interest. The cross-validated slope (say 100 repeats of 10-fold cross-validation with the penalty parameter recomputed and coefficients estimated 1000 times) is what is important, and you expect that to come out to nearly 1.0. That being said, I have less experience with extreme cases where the predictive information in the data may be close to zero. But before going further I recommend checking that the bootstrap worked in your case, by running the above-mentioned cross-validation procedure. See also this.
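As a rough illustration of that cross-validation procedure, here is a sketch in plain Python. It uses a one-predictor ridge model instead of a Lasso-Cox fit, and the names `ridge_fit`, `choose_lambda` and `repeated_cv_slope`, the penalty grid and the simulated data are all assumptions for illustration. What it does show is the structure: the penalty is re-chosen inside every training fold, and the calibration slope is computed from out-of-fold predictions only.

```python
import random
from statistics import mean

def ridge_fit(xs, ys, lam):
    """One-predictor ridge: slope shrunk by lam, intercept unpenalized."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam)
    return my - slope * mx, slope

def predict(model, xs):
    intercept, slope = model
    return [intercept + slope * x for x in xs]

def kfold(n, k, rng):
    """Random partition of range(n) into k folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def split(xs, ys, fold):
    """Training data = everything outside the held-out fold."""
    held = set(fold)
    train = [i for i in range(len(xs)) if i not in held]
    return [xs[i] for i in train], [ys[i] for i in train]

def choose_lambda(xs, ys, grid, k, rng):
    """Inner CV: pick the penalty with the smallest out-of-fold squared error."""
    folds = kfold(len(xs), k, rng)
    def cv_error(lam):
        err = 0.0
        for fold in folds:
            tx, ty = split(xs, ys, fold)
            preds = predict(ridge_fit(tx, ty, lam), [xs[i] for i in fold])
            err += sum((ys[i] - p) ** 2 for i, p in zip(fold, preds))
        return err
    return min(grid, key=cv_error)

def repeated_cv_slope(xs, ys, grid, repeats=5, k=5, seed=1):
    """Mean calibration slope over repeats, plus the number of outer fits."""
    rng = random.Random(seed)
    n, slopes, n_fits = len(xs), [], 0
    for _ in range(repeats):
        oof = [0.0] * n  # out-of-fold predictions for this repeat
        for fold in kfold(n, k, rng):
            tx, ty = split(xs, ys, fold)
            lam = choose_lambda(tx, ty, grid, k, rng)  # penalty recomputed
            model = ridge_fit(tx, ty, lam)             # coefficients re-estimated
            n_fits += 1
            for i, p in zip(fold, predict(model, [xs[i] for i in fold])):
                oof[i] = p
        # Calibration slope: regress observed outcome on out-of-fold predictions.
        slopes.append(ridge_fit(oof, ys, 0.0)[1])
    return mean(slopes), n_fits

# Simulated toy data (hypothetical).
data_rng = random.Random(0)
xs = [data_rng.gauss(0, 1) for _ in range(80)]
ys = [0.5 * x + data_rng.gauss(0, 1) for x in xs]
cv_slope, n_fits = repeated_cv_slope(xs, ys, grid=[0.0, 1.0, 5.0, 20.0])
```

The demo runs 5 repeats of 5-fold CV (25 outer fits) to keep it quick; `repeats=100, k=10` would give the 1000 fits mentioned above.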

In parallel with all this I recommend testing whether there is a predictive signal present by computing the first 5 principal components of the predictors, putting them in an unpenalized model, and getting a chunk test with 5 d.f.
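The last step of such a chunk test can be sketched as a likelihood ratio test on 5 d.f., comparing the model containing the 5 principal components with the null model. The sketch below covers only that final comparison, using the closed-form upper tail of a chi-square distribution with 5 d.f. The two log-likelihood values are made-up placeholders; in practice they would come from whatever software fits the Cox models. The names `chi2_sf_5df` and `chunk_test` are hypothetical.

```python
import math

def chi2_sf_5df(x):
    """Upper-tail probability of a chi-square variable with 5 d.f.

    Closed form for odd degrees of freedom, specialised to 5 d.f.
    """
    if x <= 0:
        return 1.0
    root = math.sqrt(x)
    tail = math.erfc(math.sqrt(x / 2.0))
    series = math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * (root + root ** 3 / 3.0)
    return tail + series

def chunk_test(loglik_null, loglik_full):
    """Likelihood ratio statistic and p-value for 5 jointly added terms."""
    lr = 2.0 * (loglik_full - loglik_null)
    return lr, chi2_sf_5df(lr)

# Hypothetical log partial likelihoods, for illustration only.
lr_stat, p_value = chunk_test(loglik_null=-1250.0, loglik_full=-1242.0)
```

A small p-value here would suggest that the predictors carry at least some signal, without any variable selection having been involved.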

Another issue is that I expect lasso to be unreliable in terms of selecting the "right" predictors, and a more fruitful approach will be data reduction followed by either unpenalized or penalized Cox regression (quadratic penalty, i.e. ridge regression). This will be more stable and interpretable and will operate within the confines of your data richness. With 181 events you might reduce the problem using data reduction (unsupervised learning) down to 10 dimensions/summary scores.

Thank you very much Frank,

I have also done optimism correction through repeated cross-validation as you suggested; however, the correction was much smaller. Suppose, in an extreme case, that you correct through bootstrap and through repeated cross-validation and obtain two corrected estimates for the C-index of 0.65 and 0.75 respectively. Which estimate would you trust?
What do you think about nested cross-validation as an internal validation method? Would this method suit penalized regression and random forests best?

Furthermore, I realized that when the sample size is smaller, optimism-correction through bootstrap estimates a smaller optimism than repeated cross-validation, as I would expect. Would it be correct to prefer using bootstrap optimism correction when your sample size is small and to say that estimates of optimism should be equivalent if the sample size is large?

Finally, do you have any reference for the method you proposed which combines principal components analysis and ridge regression?

Kind regards,
Deborah

Hi Deborah - I have a double teaching load this month and unfortunately don't have time to delve into those excellent questions. I would just add briefly that I'd trust 100 repeats of 10-fold CV the most; just make sure that all supervised learning steps are repeated for each of the 1000 model fits. But the bootstrap is better for exposing the arbitrariness of any feature selection that is also being used.

Dear Frank,

When you can, I would like to know your opinion about the following:

1. What do you think about nested cross-validation as an internal validation method? Would this method suit penalized regression and random forests better than repeated cross validation optimism correction? I suppose they give similar results, but please correct me if I'm wrong.

2. How would you interpret a negative optimism-corrected calibration slope? For example, for a logistic lasso model, the corrected AUC is 0.68 and the corrected calibration slope is -1.79. Would you correct the lasso coefficients of the selected variables? Is it right to say that before re-calibration the model returns inverse predictions?

3. Can the apparent AUC change slightly after recalibration? Looking at the maths, it should not be the case, but with my model this is happening: it changes from 0.79 to 0.78. Also, does it make sense to report the AUC confidence intervals when the model is regularised regression?

4. Finally, do you have any reference for the method you proposed which combines principal components analysis and ridge regression?
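On question 3, the maths can be checked with a small sketch: a linear recalibration of the linear predictor with a positive slope keeps the ordering of the predictions, so a rank-based AUC is unchanged, while a negative slope reverses the ordering and turns the AUC into 1 - AUC. The toy data and the names `auc` and `recalibrate` below are assumptions for illustration, not my actual model.

```python
def auc(scores, labels):
    """Probability that a random event outranks a random non-event (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def recalibrate(lp, intercept, slope):
    """Linear recalibration of the linear predictor."""
    return [intercept + slope * v for v in lp]

# Hypothetical linear predictors and binary outcomes.
lp = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
y = [0, 0, 1, 0, 1, 1]

auc_original = auc(lp, y)
auc_shrunk = auc(recalibrate(lp, intercept=0.2, slope=0.5), y)    # same ranking
auc_flipped = auc(recalibrate(lp, intercept=0.2, slope=-0.5), y)  # reversed ranking
```

So if the recalibration is only an intercept and a positive slope, a change from 0.79 to 0.78 would have to come from somewhere else, for example ties introduced by rounding or some refitting step; that is a guess on my part.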

I thank you very much for your valuable support and for your time.

Kind regards,
Deborah

I'm not sure what you mean by "repeated cross validation optimism correction". I'm only aware of one way of doing C-V. Concerning the negative slope, which validation method yielded that? I wouldn't worry about 0.79 to 0.78.

A reason for doing penalized regression is so that you don't have to validate. But this depends on a wise choice of the penalty parameter(s). How did you choose the lasso penalty?

Sorry I don't have time for more.

Dear Frank,