How do we count the number of variables to comply with the 15:1 EPV (events per variable) rule?

Hi Professor Harrell (and other colleagues)

I am new to survival analysis. I have been lucky: I previously arrived at the models I wanted without using rcs()… Suppose I use four knots as an option of rcs(). Does the variable count as one variable or four variables in this case?

Any possible help is very much appreciated.

Best wishes


Four knots in a restricted cubic spline give you 2 nonlinear terms, so there are 3 terms in all. In terms of overfitting, this counts almost the same as including 3 (linear) variables when you compare against the effective sample size (the number of observed events).
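To see where the 3 terms come from, here is a minimal sketch in plain Python of the truncated power basis for a restricted cubic spline. It is not the rms implementation; the function name rcs_basis and the knot locations are made up for illustration, and the scaling by the squared knot range is just one common normalization.

```python
def rcs_basis(x, knots):
    """Return the restricted cubic spline terms for a single value x.

    With k knots the result has k - 1 entries: the linear term
    followed by k - 2 nonlinear terms.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")

    def pos_cube(u):
        # Truncated cubic: u^3 when u > 0, otherwise 0.
        return max(u, 0.0) ** 3

    last_gap = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2  # assumed normalization, keeps terms on the scale of x

    terms = [x]  # the linear term
    for j in range(k - 2):
        # The two correction terms cancel the cubic and quadratic parts
        # beyond the last knot, so the fit is linear in the tails.
        value = (
            pos_cube(x - t[j])
            - pos_cube(x - t[-2]) * (t[-1] - t[j]) / last_gap
            + pos_cube(x - t[-1]) * (t[-2] - t[j]) / last_gap
        )
        terms.append(value / scale)
    return terms


knots = [5.0, 20.0, 40.0, 70.0]  # hypothetical knot locations
row = rcs_basis(33.0, knots)
print(len(row))  # 3 terms: 1 linear + 2 nonlinear
```

Each of those terms gets its own regression coefficient, which is why the spline is charged 3 parameters rather than 1 or 4.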


Many thanks Professor Harrell.

In this case, if I form an interaction term between a continuous variable and a categorical variable in the formula, then the additional variables are counted as the number of levels - 1, as with rcs(). Is this logic right?

Suppose only the interaction terms of some categories are statistically significant. Do we keep all the interaction terms of all the categories?

Thanks in advance.


BTW, some people believe that survival analysis is no different from linear regression. The fact that the effective sample size is determined by the number of events is sufficient to make it different from linear regression. Moreover, the event is not a continuous variable.

Many thanks for your help and discussion again.


You should keep all interaction terms or not have any. Yes, you count one parameter per level of a categorical variable (excluding the first, reference level) and k-1 parameters for an rcs with k knots.
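The counting can be sketched as a small tally. This is only an illustration under stated assumptions: the helper names, the variables (age, stage) and the 150 events are hypothetical, and the interaction count uses the usual product rule for crossing two terms, which reduces to levels - 1 when the continuous variable enters linearly.

```python
def spline_df(knots):
    # An rcs with k knots contributes k - 1 parameters.
    return knots - 1


def categorical_df(levels):
    # One parameter per level after the reference level.
    return levels - 1


def interaction_df(df_a, df_b):
    # Crossing two terms multiplies their parameter counts.
    return df_a * df_b


def parameter_budget(events, events_per_parameter=15):
    # Rough 15:1 rule: about 15 observed events per candidate parameter.
    return events // events_per_parameter


age = spline_df(4)                         # 3 parameters
stage = categorical_df(3)                  # 2 parameters
age_by_stage = interaction_df(age, stage)  # 6 parameters, kept or dropped as a block

total = age + stage + age_by_stage
budget = parameter_budget(150)             # hypothetical number of observed events

print(total, budget, total <= budget)      # 11 10 False
```

In this made-up example the model asks for 11 parameters against a budget of 10, so something would have to be simplified before fitting, for example fewer knots or no interaction.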