Diagnostics for regression models with splines

I have some regression models in which I used splines. Are there any diagnostics I can run on them, and what assumptions are those diagnostics based on? I’m assuming spline models do not carry the same assumptions as multiple linear regression models.

Any literature I can read on this would be much appreciated.

There are at least two modes of analysis:

  1. Use AIC, R^2, and other measures to assess fit while trying different spline fits or numbers of knots. This curve-fitting mode can yield good estimates, but it sacrifices the ability to easily do statistical inference or compute valid confidence intervals, because the usual formulas do not account for the search over candidate fits.
  2. Decide how many parameters the sample size and prior knowledge about the complexity of the relationship can support, and stick with that fit. If it’s underfit, you can’t do much about it (given the sample size limitation). If it’s overfit, the confidence intervals will be appropriately wide, and it may be too late to do anything about it unless you decide to penalize the whole model.
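
To make mode 1 concrete, here is a minimal sketch that compares AIC across different numbers of knots. Everything in it is an assumption made for illustration: the simulated data, the truncated power basis (a restricted cubic spline or B-spline basis would be a more usual choice in practice), the knot placement at quantiles of X, and the helper names `cubic_spline_basis` and `gaussian_aic`. The AIC is the Gaussian least-squares version, up to an additive constant.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Truncated power basis: 1, x, x^2, x^3, plus (x - k)_+^3 for each knot k."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def gaussian_aic(y, fitted, n_coef):
    """AIC (up to a constant) for a least-squares fit with Gaussian errors."""
    n = len(y)
    rss = np.sum((y - fitted) ** 2)
    return n * np.log(rss / n) + 2 * (n_coef + 1)  # +1 for the error variance

# Simulated data, purely for illustration
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 200))
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

aic_by_knots = {}
for n_knots in range(1, 8):
    # Place knots at equally spaced interior quantiles of x
    probs = np.linspace(0, 1, n_knots + 2)[1:-1]
    knots = np.quantile(x, probs)
    X = cubic_spline_basis(x, knots)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    aic_by_knots[n_knots] = gaussian_aic(y, X @ beta, X.shape[1])

best = min(aic_by_knots, key=aic_by_knots.get)
print("AIC by number of knots:", aic_by_knots)
print("Number of knots with the lowest AIC:", best)
```

In mode 2 you would skip the loop entirely: fix `n_knots` in advance from the sample size and subject-matter knowledge, fit once, and report that fit. Standard errors computed after picking `best` in the loop above do not reflect the selection step.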

One of the most important kinds of lack of fit to worry about is a sudden change in the shape of the X vs. Y relationship at a value of X that has no knot nearby. In that case the spline fit will round out the sharp change. It’s hard to know what to do about that. Hopefully you will have specified enough knots (when the sample size supports this) to avoid underfitting too much.
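
The rounding-out effect can be seen in a small simulation. This is only a sketch under stated assumptions: the data are noise-free so that the error shown is pure approximation error, the true curve is flat and then turns upward sharply at X = 7, and the knot locations and helper names (`cubic_spline_basis`, `fit_and_predict`) are made up for the example. Note that the second fit adds knots around the change point, so it also spends more parameters.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Truncated power basis: 1, x, x^2, x^3, plus (x - k)_+^3 for each knot k."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def fit_and_predict(x, y, knots):
    """Least-squares cubic spline fit; returns fitted values at x."""
    X = cubic_spline_basis(x, knots)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ beta

# Noise-free truth: flat up to the change point, then slope 2
x = np.linspace(0, 10, 401)
change_point = 7.0
y = 2.0 * np.clip(x - change_point, 0, None)

sparse_knots = [2.5, 5.0]                  # no knot near the change point
dense_knots = [2.5, 5.0, 6.5, 7.0, 7.5]    # same knots plus three near it

fit_sparse = fit_and_predict(x, y, sparse_knots)
fit_dense = fit_and_predict(x, y, dense_knots)

rss_sparse = float(np.sum((y - fit_sparse) ** 2))
rss_dense = float(np.sum((y - fit_dense) ** 2))

# Largest absolute error within one unit of the change point
near = np.abs(x - change_point) <= 1.0
max_err_sparse = float(np.max(np.abs(y - fit_sparse)[near]))
max_err_dense = float(np.max(np.abs(y - fit_dense)[near]))

print(f"RSS, no knot near change point:   {rss_sparse:.4f}")
print(f"RSS, knots around change point:   {rss_dense:.4f}")
print(f"Max error near change, sparse:    {max_err_sparse:.4f}")
print(f"Max error near change, dense:     {max_err_dense:.4f}")
```

Plotting `fit_sparse` against `y` around X = 7 shows the smooth curve cutting the corner. With real, noisy data the same pattern would show up as a run of same-signed residuals around the change point.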
