Fitting unrestricted model after LASSO to avoid over-shrinkage?



Hi everyone - I love the new site!

I’ve been experimenting with LASSO lately to develop a model in a relatively small dataset. I came across something in Hastie, Tibshirani, and Friedman’s 2009 book The Elements of Statistical Learning that I think contradicts my interpretation of Frank’s treatment of shrinkage in his RMS book, and I wondered about your interpretation:

" Section 3.8.5 - Further properties of the Lasso

Regarding the coefficients themselves, the lasso shrinkage causes the estimates of the non-zero coefficients to be biased towards zero, and in general they are not consistent. One approach for reducing this bias is to run the lasso to identify the set of non-zero coefficients, and then fit an un-restricted linear model to the selected set of features. "

I recognize there is evidence of over-shrinkage with lasso, but I worry about losing the benefits of coefficient shrinkage with the above suggestion. The benefits seem to outweigh the risk of over-shrinkage in most simulations I’ve seen.

Thanks for considering,


You can only make the statement “fit an un-restricted linear model” if you don’t care whatsoever about demonstrating absolute accuracy of the resulting model through an unbiased high-resolution calibration curve. The sole reason for lasso is to penalize selected variables’ effects for the biased selection process. If we were going to trust unpenalized estimates we never needed the lasso to select variables for us.


That was my understanding. Thanks!


You may want to take a look at the relaxed lasso, a two step procedure with lasso in the first step and a touch of shrinkage in the second step. The procedure described in your post is a special case of the relaxed lasso with the second-step shrinkage set to zero.
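
To make the two steps concrete, here is a minimal sketch in Python with numpy only. It assumes one common formulation of the relaxed lasso, in which the final coefficients are a blend `gamma * lasso + (1 - gamma) * unpenalized refit` on the selected variables, so `gamma = 0` reproduces the procedure from the ESL quote and `gamma = 1` is the plain lasso. The function names (`lasso_cd`, `relaxed_lasso`), the simulated data, and the values of `lam` and `gamma` are all made up for illustration; in practice both tuning parameters would be chosen by cross-validation.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=300):
    """Coordinate descent for (1/(2n))*||y - X b||^2 + lam*||b||_1.
    X and y are assumed to be centred already (no intercept term)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]      # partial residual without feature j
            rho = X[:, j] @ resid / n
            # soft-thresholding update for coordinate j
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_ss[j]
            resid -= X[:, j] * beta[j]
    return beta

def relaxed_lasso(X, y, lam, gamma):
    """Step 1: lasso selects variables. Step 2: blend the lasso coefficients
    with an unpenalized least-squares refit on the selected variables."""
    beta_lasso = lasso_cd(X, y, lam)
    active = np.flatnonzero(beta_lasso)
    beta_ols = np.zeros_like(beta_lasso)
    if active.size:
        beta_ols[active] = np.linalg.lstsq(X[:, active], y, rcond=None)[0]
    blend = gamma * beta_lasso + (1.0 - gamma) * beta_ols
    return blend, beta_lasso, beta_ols

# Simulated data (hypothetical): 3 real signals, 7 noise variables.
rng = np.random.default_rng(0)
n, p = 60, 10
X = rng.standard_normal((n, p))
true_beta = np.array([2.0, -1.5, 1.0] + [0.0] * (p - 3))
y = X @ true_beta + rng.standard_normal(n)
X = X - X.mean(axis=0)
y = y - y.mean()

blend, b_lasso, b_ols = relaxed_lasso(X, y, lam=0.3, gamma=0.5)
print("lasso      :", np.round(b_lasso, 2))
print("refit (OLS):", np.round(b_ols, 2))
print("relaxed    :", np.round(blend, 2))
```

The refit coefficients are never smaller in total absolute size than the lasso ones, which is exactly the shrinkage that the second step gives back. Any variable the lasso selected by chance gets an unpenalized coefficient at `gamma = 0`.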

Definitely agree with Frank that you will still want to generate an optimism-corrected calibration curve.


Unfortunately, I haven’t seen an easy way to do this (like Frank’s calibrate function in rms). I can program the bootstrap myself. Maybe I’ll adapt Frank’s calibrate code.
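
A rough sketch of the kind of bootstrap I have in mind, in Python with numpy only. This is not the rms calibrate code and it is much cruder: it estimates an optimism-corrected calibration slope rather than a full calibration curve, and it keeps the penalty `lam` fixed, whereas a proper version would re-tune it inside every resample. The key point is that the whole procedure (lasso selection plus unpenalized refit) is repeated in each bootstrap sample. All function names and the simulated data are hypothetical.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate-descent lasso on centred X, y (same objective as usual:
    (1/(2n))*||y - X b||^2 + lam*||b||_1)."""
    n, p = X.shape
    beta, resid = np.zeros(p), y.astype(float).copy()
    col_ss = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]
            rho = X[:, j] @ resid / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_ss[j]
            resid -= X[:, j] * beta[j]
    return beta

def fit_lasso_then_ols(X, y, lam):
    """Whole modelling procedure: centre, select with lasso, refit by OLS."""
    xm, ym = X.mean(axis=0), y.mean()
    Xc, yc = X - xm, y - ym
    active = np.flatnonzero(lasso_cd(Xc, yc, lam))
    beta = np.zeros(X.shape[1])
    if active.size:
        beta[active] = np.linalg.lstsq(Xc[:, active], yc, rcond=None)[0]
    return ym - xm @ beta, beta             # intercept, slopes

def calibration_slope(y, pred):
    """Slope from regressing observed y on predictions (1 = well calibrated)."""
    pc = pred - pred.mean()
    return float(pc @ (y - y.mean()) / (pc @ pc))

def optimism_corrected_slope(X, y, lam, n_boot=100, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    a0, b0 = fit_lasso_then_ols(X, y, lam)
    apparent = calibration_slope(y, a0 + X @ b0)
    optimism = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)          # resample rows with replacement
        a, b = fit_lasso_then_ols(X[idx], y[idx], lam)
        if not b.any():                      # nothing selected: slope undefined
            continue
        slope_boot = calibration_slope(y[idx], a + X[idx] @ b)
        slope_orig = calibration_slope(y, a + X @ b)
        optimism.append(slope_boot - slope_orig)
    return apparent, apparent - float(np.mean(optimism))

# Simulated small dataset (hypothetical): weak signals, noisy outcome.
rng = np.random.default_rng(1)
n, p = 60, 10
X = rng.standard_normal((n, p))
y = X @ np.array([1.0, -0.5, 0.5] + [0.0] * (p - 3)) + 2.0 * rng.standard_normal(n)

apparent, corrected = optimism_corrected_slope(X, y, lam=0.2)
print(f"apparent slope {apparent:.3f}, optimism-corrected slope {corrected:.3f}")
```

The apparent slope of a least-squares refit is 1 by construction, so it says nothing about over-fitting. The corrected slope is what shows how much shrinkage the unpenalized second step is missing.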


What does the second stage do? I fear it lacks sufficient shrinkage, and begs the question of what the first stage was supposed to do.