Subgroup Interaction

Long-time reader, first-time poster! I am a clinician trying to improve my statistical knowledge, so please forgive me if this is a simplistic question.

I was hoping for some guidance on assessing interactions between groups. Clinical trials and related subgroup analyses often present comparisons between subgroups. For categorical variables such as sex, the comparison is made between the groups and a test of interaction is performed. The same approach is often seen in the literature for continuous variables, which are categorised and then tested for interaction. There is an extensive literature on the problems with dichotomisation/categorisation, even when the categorisation follows a clinically used score or staging system.

My first question is: to maximise the power of the data, would the more appropriate statistical approach be to keep the variable continuous and then look at the overall interaction of the variable with the trial intervention? If this is the better approach, then presumably the variable would need to be transformed to a normal distribution?




Absolutely. We should outlaw the creation of false grouping variables by molesting continuous baseline variables. This is an abominable statistical practice. My BBR notes have lots of details on this.
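To make the power argument concrete, here is a rough simulation sketch, not a recommended analysis. It assumes a two-arm trial in which the treatment effect changes linearly with a baseline variable `x`, and compares an interaction test that keeps `x` continuous with one that splits `x` at the median. Every number in it (sample size, effect sizes, the linear form, the normal distribution of `x`) is made up for illustration, and the function names are hypothetical. The interaction is estimated as the difference between the within-arm slopes, which matches the treatment-by-covariate coefficient in a model with both main effects and their product; the standard error uses per-arm residual variances rather than a pooled one.

```python
import math
import random
import statistics


def slope_and_se(xs, ys):
    """Least-squares slope of ys on xs and its standard error."""
    n = len(xs)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return slope, math.sqrt(rss / (n - 2) / sxx)


def interaction_z(rows, covariate):
    """z-statistic for the difference in covariate slopes between arms."""
    fits = []
    for arm in (0, 1):
        xs = [covariate(x) for t, x, y in rows if t == arm]
        ys = [y for t, x, y in rows if t == arm]
        fits.append(slope_and_se(xs, ys))
    (b0, se0), (b1, se1) = fits
    return (b1 - b0) / math.sqrt(se0 ** 2 + se1 ** 2)


def simulate_trial(rng, n=200):
    """One simulated trial; all coefficients are assumed, not estimated."""
    rows = []
    for i in range(n):
        t = i % 2  # alternate allocation keeps the arms balanced
        x = rng.gauss(0, 1)  # baseline variable, normal only for convenience
        y = 0.5 * t + 0.3 * x + 0.4 * t * x + rng.gauss(0, 1)
        rows.append((t, x, y))
    return rows


def estimate_power(n_sims=500, seed=1):
    """Share of simulated trials with |z| > 1.96 for each analysis."""
    rng = random.Random(seed)
    hits_cont = hits_cat = 0
    for _ in range(n_sims):
        rows = simulate_trial(rng)
        cut = statistics.median([x for _, x, _ in rows])
        z_cont = interaction_z(rows, lambda x: x)
        # With a 0/1 covariate the slope is the high-minus-low mean difference.
        z_cat = interaction_z(rows, lambda x: 1.0 if x > cut else 0.0)
        hits_cont += abs(z_cont) > 1.96
        hits_cat += abs(z_cat) > 1.96
    return hits_cont / n_sims, hits_cat / n_sims


power_continuous, power_dichotomised = estimate_power()
print("power, x kept continuous:", power_continuous)
print("power, x split at median:", power_dichotomised)
```

Under these assumed settings the continuous analysis should detect the interaction more often than the median-split one; the exact figures depend on the seed and on the effect sizes chosen. A real analysis would fit the interaction in a regression model and need not assume it is linear.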


I think this is an interesting question, because what is done in practice is capricious and what is statistically efficient is put to one side. For example, I’ve seen a bit of the literature on outcomes for individuals born preterm. Papers do both: the p-values may come from a model with linear and quadratic terms, while the HRs are presented for gestational age categories. Occasionally you see a plot of the HR against gestational age (continuous). It really demands a figure: I’ve seen someone dump beta estimates in a table, and the reader can’t digest that easily. What about repeated measures? If the time points are pre-planned visits, does treating time as categorical in a covariance pattern model make sense, rather than using a random coefficients model? In either case we will look at the time-by-treatment interaction, but I guess in the latter we have a greater interest in the time effect …
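For concreteness, one common way of writing the two repeated-measures models being contrasted is sketched below. The notation is my own assumption, not taken from any particular paper: $y_{ij}$ is the outcome for subject $i$ at visit $j$, $z_i$ is the treatment indicator, and $t_{ij}$ is the visit time treated as continuous.

Covariance pattern model, time categorical:

$$
y_{ij} = \mu_j + \delta_j z_i + \varepsilon_{ij}, \qquad \boldsymbol{\varepsilon}_i \sim N(\mathbf{0}, \boldsymbol{\Sigma})
$$

Random coefficients model, time continuous:

$$
y_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij} + \beta_2 z_i + \beta_3 z_i t_{ij} + \varepsilon_{ij}, \qquad (b_{0i}, b_{1i})^\top \sim N(\mathbf{0}, \mathbf{G}), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
$$

In the first model, the time-by-treatment interaction is the set of contrasts $\delta_j - \delta_1$, one per visit after the first. In the second it is the single slope difference $\beta_3$, which is why the shape of the time effect itself matters more there.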