I have 4 variables which I am investigating in relation to an outcome. I first fitted a restricted cubic spline (RCS) regression for each one. To test for non-linearity, I compared each RCS model to the nested linear model with a likelihood ratio test. Three of the four variables are nonlinearly associated with the outcome (after adjusting for age and gender).
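For concreteness, here is a minimal sketch of that linearity check in R on simulated data. It assumes a continuous outcome, uses `splines::ns()` (natural cubic splines) as a stand-in for the restricted cubic splines, and the variable names (`x`, `age`, `gender`, `y`) are made up for illustration.

```r
library(splines)

# Simulated data: all names and effect sizes here are illustrative assumptions.
set.seed(1)
n <- 300
dat <- data.frame(
  age    = rnorm(n, mean = 50, sd = 10),
  gender = factor(sample(c("F", "M"), n, replace = TRUE)),
  x      = runif(n, min = 0, max = 10)
)
dat$y <- sin(dat$x / 2) + 0.02 * dat$age + rnorm(n, sd = 0.5)

# Linear model and spline model, both adjusted for age and gender.
# The linear model is nested in the spline model.
fit_lin <- glm(y ~ x + age + gender, data = dat)
fit_spl <- glm(y ~ ns(x, df = 4) + age + gender, data = dat)

# Likelihood ratio test for non-linearity in x.
lr_stat <- as.numeric(2 * (logLik(fit_spl) - logLik(fit_lin)))
df_diff <- attr(logLik(fit_spl), "df") - attr(logLik(fit_lin), "df")
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)

print(c(lr_stat = lr_stat, df = df_diff, p_value = p_value))
```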
Now I would like to compare which variable is most predictive of the outcome when all four variables are included in the same model. Since some of them are correlated, I thought of using elastic net to account for collinearity.
Q1. Do you think this is an appropriate way to assess that, or can you think of a better method?
Q2. Would it be right to use elastic net even if most of the variables are nonlinearly associated with the outcome?
Tibshirani and colleagues have extended the lasso to include nonlinear effects. I'm not sure whether elastic net has been generalized in the same way.
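One pragmatic workaround, which is not the lasso extension mentioned above, is to expand each predictor into a spline basis yourself and pass the expanded design matrix to an ordinary elastic net. A rough sketch with `glmnet` on simulated data follows; the data, the `df = 3` basis size, and `alpha = 0.5` are all assumptions chosen for illustration.

```r
library(splines)
library(glmnet)

# Simulated predictors a, b, c, d, with a and b correlated (illustrative only).
set.seed(2)
n <- 400
dat <- data.frame(a = rnorm(n), c = rnorm(n), d = rnorm(n))
dat$b <- 0.7 * dat$a + rnorm(n, sd = 0.7)
dat$y <- dat$a^2 + 0.5 * dat$b + rnorm(n)

# Expand every predictor into a natural cubic spline basis (3 columns each),
# then drop the intercept column because glmnet adds its own.
X <- model.matrix(
  ~ ns(a, df = 3) + ns(b, df = 3) + ns(c, df = 3) + ns(d, df = 3),
  data = dat
)[, -1]

# Elastic net (alpha = 0.5 mixes ridge and lasso penalties),
# with lambda chosen by cross-validation.
cv_fit <- cv.glmnet(X, dat$y, alpha = 0.5)

# Coefficients at the lambda with the lowest cross-validated error.
print(coef(cv_fit, s = "lambda.min"))
```

Note that the penalty here acts on each basis column separately, so a predictor can end up with only some of its spline terms retained. That makes "which variable matters most" harder to read off than in the linear case.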
I can think of a potentially easier method with a few advantages over elastic net. You could fit the model using a Bayesian method, using informative priors to reduce overfitting and the impact of multicollinearity.
For instance, in the R package ‘brms’ you could fit the model like this:
Y ~ s(a) + s(b) + s(c) + s(d)
Where a, b, c, d are your predictors. s() uses penalised thin plate splines by default (I’m not sure whether restricted cubic splines are available). Using priors that pull parameters towards contextually reasonable ranges will go some way towards reducing overfitting problems.
An advantage of the Bayesian approach is that it readily provides standard errors/intervals for parameters and predictions (not at all straightforward with elastic net).
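To make this a bit more concrete, here is a rough sketch of such a fit on simulated data. The data, the Gaussian family, and the specific prior choices (`normal(0, 1)` on the linear parts of the smooths, `student_t(3, 0, 1)` on the smooth standard deviations) are assumptions for illustration only; sensible priors depend on the scale of your own variables.

```r
library(brms)

# Simulated data with predictors a, b, c, d (illustrative assumptions only).
set.seed(3)
n <- 200
dat <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n), d = rnorm(n))
dat$Y <- sin(dat$a) + 0.5 * dat$b + rnorm(n, sd = 0.5)

# Weakly informative priors: shrink the unpenalised (linear) parts of the
# smooths and the wiggliness (sds) of each smooth towards zero.
priors <- c(
  prior(normal(0, 1), class = "b"),
  prior(student_t(3, 0, 1), class = "sds")
)

# Additive model with penalised smooths for all four predictors.
# Short chains keep the sketch quick; use more iterations in practice.
fit <- brm(
  Y ~ s(a) + s(b) + s(c) + s(d),
  data   = dat,
  family = gaussian(),
  prior  = priors,
  chains = 2,
  iter   = 1000,
  seed   = 3
)

# Posterior summaries with credible intervals for all parameters.
print(summary(fit))

# Expected outcome with a 95% interval at a new set of predictor values.
new_dat <- data.frame(a = 0, b = 0, c = 0, d = 0)
pred <- fitted(fit, newdata = new_dat)
print(pred)
```

`conditional_smooths(fit)` will then plot each estimated smooth with its credible band, which gives a visual sense of how strongly each predictor is related to the outcome.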