Hi there,
I have a question regarding the use of the partial likelihood ratio statistic to assess variable importance in a Cox Proportional Hazards model.
Specifically, is it appropriate to apply the procedure to the full model—including all predictors and spline terms—even though the full model may be prone to overfitting?
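To make the question concrete, here is a minimal sketch of the per-variable partial likelihood ratio statistic on simulated data. It is not any particular package's implementation: it uses a bare-bones Cox partial likelihood (Breslow form, assuming untied event times) maximized with `scipy.optimize.minimize`, and compares the full model against each drop-one model with a 1-df chi-square.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def cox_loglik(X, time, event):
    """Maximized Cox log partial likelihood (Breslow form, assumes untied times)."""
    order = np.argsort(time)
    Xs, es = X[order], event[order]
    def nll(beta):
        eta = Xs @ beta
        # log of the risk-set sums: reverse cumulative log-sum-exp
        rev = np.logaddexp.accumulate(eta[::-1])[::-1]
        return -np.sum(es * (eta - rev))
    return -minimize(nll, np.zeros(X.shape[1]), method="BFGS").fun

# simulated data: x0 has a strong effect, x1 a weak one, x2 none
rng = np.random.default_rng(0)
n = 300
X = rng.normal(size=(n, 3))
lam = np.exp(1.0 * X[:, 0] + 0.3 * X[:, 1])
time = rng.exponential(1.0 / lam)
event = np.ones(n, dtype=bool)          # no censoring, for simplicity

ll_full = cox_loglik(X, time, event)
lr_stats = []
for j in range(X.shape[1]):
    ll_red = cox_loglik(np.delete(X, j, axis=1), time, event)
    lr = 2.0 * (ll_full - ll_red)       # 1-df partial LR chi-square for x_j
    lr_stats.append(lr)
    print(f"x{j}: LR chi2 = {lr:.1f}, p = {chi2.sf(lr, df=1):.3g}")
```

Because the reduced models are nested in the full model, each statistic is nonnegative (up to optimizer tolerance); the worry in the question is whether these statistics remain meaningful when the full model overfits.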
In my case, the final model I intend to use is a more reduced version, with transformed variables (using AVAS, additivity and variance stabilization) and sparse PCA applied. In that scenario, however, assessing the importance of individual variables becomes much more complex.
Any insights or recommended practices on how to approach this would be greatly appreciated.
Importance measures will be biased if you remove parameters observed to be unimportant from the model. $R^{2}_\text{adj}$ accounts for overfitting in an in-sample way.
When model building involves other steps such as principal components, variable importance may need to be assessed more manually. For example, you can delete one variable at a time from the entire analysis (PCs and all) and see how much the deviance suffers.
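A sketch of that drop-one-from-the-whole-pipeline idea, under assumed simplifications: ordinary PCA via SVD stands in for the actual reduction step, and a minimal Cox partial likelihood fitter (Breslow form, untied times) stands in for the real model. For each variable, the entire pipeline (standardize, PCA, Cox fit on component scores) is rerun without it, and the resulting deviance penalty is recorded. Since the rebuilt models are not nested in the original one, the penalty is a descriptive importance measure, not a formal chi-square, and can even be slightly negative.

```python
import numpy as np
from scipy.optimize import minimize

def cox_loglik(X, time, event):
    """Maximized Cox log partial likelihood (Breslow form, assumes untied times)."""
    order = np.argsort(time)
    Xs, es = X[order], event[order]
    def nll(beta):
        eta = Xs @ beta
        rev = np.logaddexp.accumulate(eta[::-1])[::-1]
        return -np.sum(es * (eta - rev))
    return -minimize(nll, np.zeros(X.shape[1]), method="BFGS").fun

def pca_scores(X, k):
    """Scores on the first k principal components of the standardized matrix."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k] * s[:k]

# simulated data: x1 is a noisy copy of x0; only x0 drives the hazard
rng = np.random.default_rng(1)
n, p, k = 300, 5, 2
X = rng.normal(size=(n, p))
X[:, 1] = 0.8 * X[:, 0] + 0.6 * rng.normal(size=n)
lam = np.exp(0.9 * X[:, 0])
time = rng.exponential(1.0 / lam)
event = np.ones(n, dtype=bool)

ll_full = cox_loglik(pca_scores(X, k), time, event)
drop = {}
for j in range(p):
    # rerun the whole pipeline (PCA included) without variable j
    ll_j = cox_loglik(pca_scores(np.delete(X, j, axis=1), k), time, event)
    drop[j] = 2.0 * (ll_full - ll_j)   # descriptive deviance penalty; can be < 0
print(drop)
```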
I wonder if some of the ideas from this paper would be helpful for you? Specifically, how they build a reference model using a PCA-style reduction procedure (different from yours) but then do projection predictive feature selection/variable importance on the original features?
That is a key reference. A P.S. to my earlier response: if you use sparse PCA, you can compute the importance of clusters of variables. That solves the problem of individual collinear variables competing with each other.
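The cluster-level idea can be illustrated without the sparse PCA machinery itself: once a cluster of collinear variables is identified (here, hard-coded rather than extracted from sparse loadings), its importance is a single joint likelihood ratio test on the whole block. This sketch reuses the same minimal Cox fitter as above and simulates two variables that are noisy readings of one latent signal, so that drop-one tests dilute each other while the 2-df cluster test does not.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def cox_loglik(X, time, event):
    """Maximized Cox log partial likelihood (Breslow form, assumes untied times)."""
    order = np.argsort(time)
    Xs, es = X[order], event[order]
    def nll(beta):
        eta = Xs @ beta
        rev = np.logaddexp.accumulate(eta[::-1])[::-1]
        return -np.sum(es * (eta - rev))
    return -minimize(nll, np.zeros(X.shape[1]), method="BFGS").fun

# x0 and x1 are two noisy readings of the same latent signal z; x2 is noise
rng = np.random.default_rng(2)
n = 400
z = rng.normal(size=n)
X = np.column_stack([z + 0.3 * rng.normal(size=n),
                     z + 0.3 * rng.normal(size=n),
                     rng.normal(size=n)])
time = rng.exponential(1.0 / np.exp(0.7 * z))
event = np.ones(n, dtype=bool)

ll_full = cox_loglik(X, time, event)
# drop-one tests: x0 and x1 proxy for each other, so each 1-df LR is diluted
lr_each = [2.0 * (ll_full - cox_loglik(np.delete(X, j, axis=1), time, event))
           for j in (0, 1)]
# cluster test: remove both members at once and use a single 2-df chi-square
lr_cluster = 2.0 * (ll_full - cox_loglik(X[:, [2]], time, event))
p_cluster = chi2.sf(lr_cluster, df=2)
print(lr_each, lr_cluster, p_cluster)
```

The cluster statistic exceeds either individual statistic because the two collinear variables can no longer substitute for each other once both are removed.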
Thank you for your kind responses. Both suggestions are useful for tackling the problem. Assessing variable importance is always challenging, and it becomes even more difficult (at least with my limited knowledge) when multiple transformations are involved. I will try both approaches, to keep potential collinearities from biasing the results.