Nomogram for Lasso-Cox regression

deb_agbe · November 28, 2019, 1:25am

Dear Frank,

I’m writing to seek your advice on plotting a nomogram for a Cox-Lasso model and on recalibrating Cox models in general.

We plotted the nomogram with the nomogram() function from the ‘rms’ package, passing a classical Cox regression cph() object as the model argument. This Cox model was obtained by running cph() on the variables selected by Lasso-Cox (glmnet). Before passing it to nomogram(), we replaced its Cox coefficients and normalized linear predictors with the Lasso coefficients and normalized linear predictors.

As a result, the survival probabilities read from the Cox-Lasso nomogram at a given time point were very different from the survival probabilities computed by ‘predictProb.coxnet()’ in R (package c060) at the same time point.

What do you think is happening? To estimate the survival probabilities we need an estimate of the baseline hazard. Perhaps the baseline hazard estimated by Lasso-Cox differs from the one used by the nomogram() function, because the cph() model passed to nomogram() is actually an ordinary Cox regression in which we manually replaced the estimated Cox coefficients and linear predictors with the Lasso coefficients and linear predictors.
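
For reference, the dependence of the survival probabilities on the baseline hazard can be written out explicitly. This is only a sketch: it assumes a Breslow-type estimator of the cumulative baseline hazard, and the estimators actually used inside cph()/nomogram() and predictProb.coxnet() may differ in detail. The symbols (beta-hat, H0-hat, d_i, R(t_i)) are introduced here purely for illustration.

```latex
% Predicted survival at time t for covariate vector x
\hat S(t \mid x) = \exp\!\left\{ -\hat H_0(t)\, \exp\!\left( x^\top \hat\beta \right) \right\}

% Breslow-type estimate of the cumulative baseline hazard:
% t_i = distinct event times, d_i = number of events at t_i,
% R(t_i) = set of subjects still at risk just before t_i
\hat H_0(t) = \sum_{i:\, t_i \le t}
  \frac{d_i}{\sum_{j \in R(t_i)} \exp\!\left( x_j^\top \hat\beta \right)}
```

Under this assumption the baseline estimate is itself a function of the coefficients. Pairing Lasso coefficients with a baseline computed from the unpenalized coefficients would therefore generally give different survival probabilities than computing both from the Lasso coefficients. Any centering of the linear predictor would also have to be the same in both places.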

It would be wonderful if you could share your thoughts on this.

I went through the source code for the nomogram() function online, but some lines were incomplete (e.g. rmsArgs(substitute(list(…))) …). Would it be possible for you to share the source code, so that I can check how the survival probabilities are computed and compare this with the predictProb.coxnet() function?

My other question concerns recalibration of Cox models. As I understand it, we cannot fully recalibrate a semiparametric model like Cox: because the baseline hazard is not parametric, calibration-in-the-large cannot be estimated, only the calibration slope.
If we plotted the nomogram using the recalibrated coefficients (rescaled by the calibration slope only), would the nomogram be wrong? Why?
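
To make the recalibration question concrete, slope-only recalibration can be sketched as below. This assumes the slope (called gamma here, a name introduced only for illustration) is estimated by fitting a Cox model with the original linear predictor as the sole covariate. The starred baseline denotes the baseline of that recalibration fit, which is not part of the original model.

```latex
% Recalibration model: original linear predictor x^T beta-hat as the only covariate
h(t \mid x) = h_0^{*}(t)\, \exp\!\left( \gamma\, x^\top \hat\beta \right)

% Survival probability implied by the recalibrated model
\hat S_{\mathrm{recal}}(t \mid x) =
  \exp\!\left\{ -\hat H_0^{*}(t)\, \exp\!\left( \hat\gamma\, x^\top \hat\beta \right) \right\}
```

Under these assumptions, rescaling the coefficients by the estimated slope changes only the linear predictor. The survival probabilities also depend on the starred cumulative baseline hazard, so a nomogram that pairs the rescaled coefficients with the original baseline estimate may show survival axes that do not match the recalibrated model.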

Thank you very much for your availability and time!

Best regards,
Deborah

f2harrell · November 28, 2019, 3:23pm

This overall approach is not necessarily recommended since the coefficients are only part of the model fit, and it is hard to insert underlying survival (or hazard) estimates from another function into the rms fit object. Outside of the lasso context there may be some things you can do.