I am making a prediction model using individual participant data from 14 different studies. We have selected Age, BMI, and two clinical variables (hormone level and ultrasound result) as predictors for our binary outcome.
We used the psfmi_mm function from the psfmi package in R (https://rdrr.io/cran/psfmi/src/R/psfmi_mm.R), which is described as a “Pooling and Predictor selection function for multilevel models in multiply imputed datasets”, with random effects for our variable “study”.
Now we want to center around the mean, but we do not fully understand how to do this or how it works in a prediction model.
As far as I understand, you first subtract the grand mean (the mean of the total database) from each person's value. For instance, if the grand mean age is 33 years, and patient A in study 1 and patient B in study 2 are both 36 years old, the new variable AgeGM will be 3 for both of them.
Then you calculate the new average per study. For instance, if the average age in study 1 (a study with younger women) was 31 years, the centered study mean will be -2 for study 1. The same goes for study 2, with older women: if the average was 37 years, the centered study mean is 4.
Now I want to calculate the new variable per patient per study, but I am not sure whether I have to subtract or add the study mean. For instance, adding would give 3 + (-2) = 1 for woman A and 3 + 4 = 7 for woman B. Is this correct?
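To make the arithmetic concrete, here is a minimal base R sketch of the two candidate calculations for one 36-year-old woman in each study. The variable names are made up for illustration, and it assumes a study 2 mean age of 37, which gives the centered study mean of 4 used above. It only shows what adding and subtracting produce; nothing here is taken from psfmi.

```r
# Hypothetical toy values; none of this comes from the real data.
grand_mean <- 33
d <- data.frame(
  patient    = c("A", "B"),
  study      = c(1, 2),
  age        = c(36, 36),
  study_mean = c(31, 37)   # assumed mean age per study
)

# Step 1: center each person's age on the grand mean.
d$AgeGM <- d$age - grand_mean

# Step 2: center each study's mean age on the grand mean.
d$StudyGM <- d$study_mean - grand_mean

# Candidate 1: add the centered study mean.
d$added <- d$AgeGM + d$StudyGM

# Candidate 2: subtract the centered study mean.
# This is identical to age minus the mean age of the person's own study.
d$subtracted <- d$AgeGM - d$StudyGM

print(d)
```

Adding reproduces the 1 and 7 from my example. Subtracting gives 5 for woman A and -1 for woman B, which is simply each woman's age minus the mean age of her own study.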
Second, if I want to make a prediction model, I have read that I should include two new variables: the new value per person (so 1 for person A and 7 for person B) and a variable with the new average per study (so -2 for person A and 4 for person B, and likewise 4 for everyone else in study 2). However, if we want to apply the model in a new center, we do not know that center's average. How do we handle this?
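The problem with a new center can be sketched as follows. The coefficients are placeholders I made up purely for illustration, not estimates from our model, and `predict_risk` is a hypothetical helper. The sketch assumes the person-level variable is age minus the center's mean age, used together with the centered center mean, and it shows that the predicted risk for the same 36-year-old changes with whatever center mean is assumed.

```r
# Placeholder coefficients, invented for illustration only.
b0         <- -1.0   # intercept
b_within   <- 0.10   # effect of age relative to the center's mean age
b_between  <- 0.03   # effect of the center's mean age relative to the grand mean
grand_mean <- 33

# Hypothetical helper: predicted probability for one patient,
# given an assumed mean age for the patient's center.
predict_risk <- function(age, center_mean) {
  age_within <- age - center_mean          # person-level variable
  center_gm  <- center_mean - grand_mean   # center-level variable
  lp <- b0 + b_within * age_within + b_between * center_gm
  plogis(lp)                               # inverse logit
}

# Same 36-year-old patient, three different assumed center means.
risks <- sapply(c(31, 33, 37), function(m) predict_risk(36, m))
print(round(risks, 3))
```

As far as I can tell, if the two age coefficients happened to be equal, the center mean would cancel out and only age minus the grand mean would matter. Otherwise some value for the new center's mean has to be supplied, and that is exactly what we do not have.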