Optimism Correction after LASSO in clinical prediction models

I am developing and internally validating a clinical prediction model. I want to combine LASSO and bootstrapping to get coefficients adjusted for both overfitting and optimism. However, I can’t find references doing both processes in the same workflow.

During development, I used LASSO to select variables and shrink coefficients against overfitting, yielding a set of coefficients \beta_{LASSO}. After shrinkage, I recalibrated the intercept (\alpha_{LASSO}).

After development, during internal validation, I used the bootstrap to estimate optimism and the optimism-corrected calibration slope (a universal shrinkage factor, S). I then multiplied \beta_{LASSO} by S to obtain coefficients adjusted for both optimism and overfitting (\beta_{LASSO + Optimism}). Finally, I recalibrated \alpha_{LASSO} to get a final recalibrated intercept (\alpha_{LASSO + Optimism}).

My final model formula could be represented by:

Y = \alpha_{LASSO + Optimism} + X \beta_{LASSO + Optimism}
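For concreteness, here is a rough sketch of what I mean, using scikit-learn on simulated data. The sample size, penalty strength (C), number of bootstrap replicates, and the unpenalized calibration fit are all illustrative assumptions, not my actual analysis (intercept recalibration is omitted for brevity):

```python
# Hypothetical sketch: L1-penalized logistic regression, then a bootstrap
# estimate of the calibration slope used as a uniform shrinkage factor S.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Simulated development data (assumption: 500 patients, 8 candidate predictors)
n, p = 500, 8
X = rng.normal(size=(n, p))
true_beta = np.array([1.0, -0.8, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0])
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ true_beta))))

# Step 1: LASSO (L1-penalized logistic regression) for selection + shrinkage
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
lasso.fit(X, y)
beta_lasso = lasso.coef_.ravel()

def calibration_slope(y_true, lin_pred):
    """Slope of the outcome on the linear predictor (near-unpenalized fit)."""
    cal = LogisticRegression(C=1e6)  # huge C ~ effectively no penalty
    cal.fit(lin_pred.reshape(-1, 1), y_true)
    return cal.coef_[0, 0]

# Step 2: bootstrap the calibration slope of the refitted model
# evaluated on the original sample; its average estimates S
B = 50
slopes = []
for _ in range(B):
    idx = rng.integers(0, n, n)
    boot = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
    boot.fit(X[idx], y[idx])
    lp_orig = X @ boot.coef_.ravel() + boot.intercept_[0]
    slopes.append(calibration_slope(y, lp_orig))

S = float(np.mean(slopes))  # "universal shrinkage factor"

# Step 3: apply S on top of the LASSO coefficients (the step I am unsure about)
beta_final = S * beta_lasso
print(f"S = {S:.3f}")
```

After this, I recalibrate the intercept by refitting an intercept-only model with the shrunken linear predictor held fixed as an offset.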

Is this strategy statistically sound? Am I missing any detail?

Thanks


Sounds like an interesting approach, but my perspective is that you just cannot really rely on purely data-driven feature selection.



Before LASSO, there was an expert panel to select the most important variables. Sorry, I forgot to mention it.

Our group these days generally uses the steps outlined here. A manuscript with a practical application is forthcoming; I will post the link here once it is published so we can see where we align or diverge.

Your strategy appears generally consistent with these recommendations, particularly regarding penalization and internal validation. The main point of potential debate, as far as I can see, is whether applying a bootstrap “universal” shrinkage factor (S) on top of LASSO-shrunk coefficients constitutes “double shrinkage.” You want the bootstrap to give you optimism-corrected performance metrics, not additional shrinkage of already-penalized coefficients, which can move you away from the sweet spot of the bias-variance trade-off.


Is “double shrinkage” bad?

I would avoid it, particularly in small datasets and when using such a post-hoc uniform shrinkage factor. While the LASSO shrinks coefficients differentially, the post-hoc uniform shrinkage factor (if I understood your approach correctly) would treat all coefficients the same way.
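A toy numerical illustration of that distinction, with made-up coefficient values chosen purely to show the pattern:

```python
import numpy as np

# Made-up numbers: hypothetical unpenalized vs LASSO estimates
beta_mle   = np.array([1.20, 0.60, 0.15])  # unpenalized estimates
beta_lasso = np.array([1.02, 0.44, 0.01])  # LASSO: small effects shrunk hardest
S = 0.9                                    # hypothetical uniform bootstrap factor

ratio_lasso   = beta_lasso / beta_mle          # differential shrinkage
ratio_uniform = (S * beta_lasso) / beta_lasso  # identical extra factor for all

print(np.round(ratio_lasso, 2))  # [0.85 0.73 0.07] -> varies by coefficient
print(ratio_uniform)             # [0.9 0.9 0.9]    -> one-size-fits-all
```

The LASSO step removes different fractions of each coefficient; multiplying by S afterwards removes the same extra fraction from every surviving coefficient regardless of how much shrinkage it already received.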


Interesting. Would you say that one should not perform optimism evaluation through bootstrapping whenever applying LASSO for variable selection?

You definitely should perform optimism evaluation. That is consistent with the steps linked above. The issue is post-hoc correcting already-penalized coefficients by multiplying by a further shrinkage factor.


Consistent with Pavlos’ reply, you would only use double shrinkage if you were forced to use lasso and hated lasso. You use lasso because it builds in shrinkage. If you don’t like the shrinkage it does, don’t use lasso.

Expert knowledge should dominate model specification. You used this approach but maybe didn’t take it far enough. Instead of feature selection I would use more unsupervised learning, e.g., sparse PCA as exemplified in two chapters in RMS.

Specifying Bayesian priors on all effects (ridge regression is an empirical version of this) is perhaps a more cogent approach.

Don’t trust lasso for feature selection, as discussed in links from here. It’s just a prediction tool (which I think is the way you are using it).
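As a sketch of the ridge alternative mentioned above (ridge being the empirical counterpart of Gaussian priors on all coefficients), one might tune the penalty by cross-validation; the data here are simulated placeholders, and the grid size and scoring rule are my assumptions:

```python
# Ridge (L2) logistic regression with the penalty chosen by CV over a
# grid of C values; each C corresponds to a different Gaussian prior width.
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(1)
n, p = 400, 10
X = rng.normal(size=(n, p))
beta = np.concatenate([[1.0, -0.7, 0.4], np.zeros(p - 3)])
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ beta))))

ridge = LogisticRegressionCV(Cs=10, penalty="l2", cv=5, scoring="neg_log_loss")
ridge.fit(X, y)
print(ridge.C_[0])          # selected penalty level
print(ridge.coef_.ravel())  # all predictors kept, all shrunk toward zero
```

Unlike LASSO, ridge keeps every predictor in the model, which avoids the instability of data-driven selection while still controlling overfitting.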


Thank you all for the useful insight. I still believe LASSO is useful in my case, but I will definitely not apply another shrinkage factor on top of it.

And remember the large sample size needed to correctly choose the amount of penalization.
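A small simulated illustration of that caution (sample sizes and effects are made up): with a small development set, the cross-validated LASSO penalty jumps around from draw to draw, so the amount of shrinkage itself is poorly identified.

```python
# Refit LassoCV on several small simulated datasets and record the
# cross-validation-selected penalty each time.
import numpy as np
from sklearn.linear_model import LassoCV

chosen = []
for seed in range(5):
    rng = np.random.default_rng(seed)
    n, p = 60, 10  # deliberately small development set
    X = rng.normal(size=(n, p))
    y = X @ np.array([1.0, -0.5] + [0.0] * 8) + rng.normal(size=n)
    chosen.append(LassoCV(cv=5).fit(X, y).alpha_)

print(np.round(chosen, 3))  # selected penalties vary across datasets
```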


I am following the recommendations by Martin et al., 2021. Very interesting paper.
