Random effect model for multi center prognostic markers

trumanfrancis · May 28, 2023, 12:36pm

In their Medical Risk Prediction text, Gerds and Kattan state: "Including the cohort variable as a random effect in a logistic or Cox regression model would make it possible to predict new patients from other centers. However, doing this is not well motivated. There are two problems. The first is that adding a random effect corresponds to conditioning on another predictor variable (the random effect), although this variable has not been observed. Hence, if the random effect turns out to be important this means that an important predictor variable is not available and hence the predicted risk may be systematically too high or too low. The second problem is related to the non-collapsability of logistic regression models and Cox regression models."

If this is accurate, what is the appropriate method for building prognostic factor models with data from multiple centers?
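
The two problems in the quoted passage can be illustrated numerically. The following is a minimal sketch, not taken from the book: it assumes a logistic model with a single binary marker and a normally distributed random center intercept, and the values of `alpha`, `beta`, and `sigma` are hypothetical. It compares the risk at a "typical" center (random effect set to zero) with the risk averaged over centers, and the conditional odds ratio with the marginal one.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))

def odds(p):
    return p / (1.0 - p)

def marginal_risk(linear_predictor, sigma, n_nodes=40):
    """Average the conditional risk over a center effect u ~ N(0, sigma^2)
    using Gauss-Hermite quadrature (probabilists' version)."""
    nodes, weights = hermegauss(n_nodes)
    weights = weights / weights.sum()  # normalize to probability weights
    return float(np.sum(weights * expit(linear_predictor + sigma * nodes)))

# Hypothetical values: intercept, marker log odds ratio, between-center SD
alpha, beta, sigma = -2.0, 1.0, 2.0

# Risk for a marker-positive patient at a center with u = 0
risk_typical_center = expit(alpha + beta)
# Risk for the same patient averaged over the distribution of centers
risk_averaged = marginal_risk(alpha + beta, sigma)

# Odds ratio within a center versus odds ratio after averaging over centers
conditional_or = float(np.exp(beta))
marginal_or = odds(marginal_risk(alpha + beta, sigma)) / odds(marginal_risk(alpha, sigma))

print(f"risk at typical center: {risk_typical_center:.3f}")
print(f"risk averaged over centers: {risk_averaged:.3f}")
print(f"conditional OR: {conditional_or:.2f}, marginal OR: {marginal_or:.2f}")
```

Under these assumptions the two risks differ, so plugging in a random effect of zero for a new center does not give the center-averaged risk, and the marginal odds ratio is closer to 1 than the conditional one.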

f2harrell · May 28, 2023, 3:41pm

Here’s my understanding.

- Estimation of random effects requires a lot of data from a lot of centers
- Unless using a Bayesian nonparametric random effects distribution, hierarchical models tend to make restrictive normality and single-variance assumptions about random effects, and violations of these assumptions can hurt overall inference and invalidate predictions for future centers
- Random effects models assume centers are exchangeable
- Sometimes it is more meaningful, useful, and extrapolatable to model center characteristics than the actual center attended
- Prediction of outcomes at future or non-sampled centers is highly dependent on the centers being exchangeable when the method in the previous bullet point is not used
- When there is large unexplainable variation in outcomes across centers, it’s often better to include random effects than to exclude them
- Don’t ever make the mistake of paying a lot of attention to centers while refusing to model all-important patient-specific baseline characteristics such as age and extent of disease
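
The first point, that random effects need many centers, can be made concrete with a small simulation. This is only a sketch under optimistic assumptions that are not from the thread: center effects are drawn from a normal distribution with a hypothetical SD of 1 and are treated as if they were observed without error, which real data never allow. Even then, the estimated between-center SD is very unstable with few centers.

```python
import numpy as np

def sd_estimate_interval(n_centers, true_sd=1.0, n_rep=5000, seed=0):
    """Simulate n_rep studies with n_centers centers each and return the
    2.5th and 97.5th percentiles of the estimated between-center SD."""
    rng = np.random.default_rng(seed)
    # Hypothetical true center effects, one row per simulated study
    effects = rng.normal(0.0, true_sd, size=(n_rep, n_centers))
    estimated_sd = effects.std(axis=1, ddof=1)
    lower, upper = np.percentile(estimated_sd, [2.5, 97.5])
    return float(lower), float(upper)

few_lower, few_upper = sd_estimate_interval(n_centers=5)
many_lower, many_upper = sd_estimate_interval(n_centers=50)

print(f"5 centers:  estimated SD mostly in [{few_lower:.2f}, {few_upper:.2f}]")
print(f"50 centers: estimated SD mostly in [{many_lower:.2f}, {many_upper:.2f}]")
```

In practice the center effects are themselves estimated from patient-level outcomes, so the uncertainty in the variance component is larger than this sketch suggests.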

trumanfrancis · May 29, 2023, 12:39am

Thank you. How would you determine if the prognostic model or prognostic factors apply to the average individual or to the average center?

f2harrell · May 29, 2023, 12:53pm

May be best to give an example of what you are referring to.

trumanfrancis · May 29, 2023, 4:13pm

I don’t have a specific example, but I just looked at this paper and was wondering whether defining the estimand is essential when using random effects models.

f2harrell · May 30, 2023, 2:00am

Cluster randomized trials require random effects to get the right correlation structure. That’s somewhat different.