# Pre/post and Longitudinal model when there is no expectation of equal means at baseline

Bench biology is full of pre-post and especially longitudinal designs where the treatment is randomized prior to baseline, so there is no a priori expectation of equal means at baseline. Often the treatment is a knockout or similar construct. An example is a glucose tolerance test compared between control (WT) and knockout (KO) groups. In this kind of experiment, the Y-variable (glucose level) is already a response to treatment at baseline (we wouldn’t think of the baseline Y-variable as a “response” if treatment were randomized at baseline). An extreme example is Figure 3F here.

What is the appropriate longitudinal model for this?

1. I would think that if the groups at baseline are sampled from different populations (regardless of the measured difference), that is, if treatment is randomized prior to baseline, the correct model would be a longitudinal model with treatment*time fixed effects that also models the correlated error due to subject. So baseline is not included as a covariate but as the first time point in the response.
2. I would not think any ANCOVA-like longitudinal model (y ~ baseline + time + treatment:time) would be the correct specification, because this would inflate the effects estimated by the interaction coefficients. Again, see Fig. 3F in the linked paper, which shows almost parallel behavior in glucose level over the whole time period (from 0 to 120 minutes), yet an ANCOVA longitudinal model estimates what I think are overly large treatment effects (the interaction coefficients) at post-baseline time points.
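The concern in point 2 can be illustrated with a small simulation. This is only a sketch under invented assumptions, not data from the linked paper: two groups (labelled WT and KO) differ in mean at baseline, both shift by the same amount at a single later time point (so trajectories are parallel by construction), and each measurement carries independent within-subject noise. All parameter values and function names are hypothetical. The ANCOVA coefficient for the model y1 ~ y0 + group is computed in closed form as the post-baseline gap minus the pooled within-group slope times the baseline gap.

```python
import random
from statistics import mean


def simulate(n_per_group=2000, baseline_gap=3.0, time_effect=2.0,
             sd_between=1.0, sd_within=1.0, seed=1):
    """Return {group: [(y0, y1), ...]}; trajectories are parallel by construction."""
    rng = random.Random(seed)
    data = {}
    for group, mu in (("WT", 0.0), ("KO", baseline_gap)):
        rows = []
        for _ in range(n_per_group):
            u = rng.gauss(mu, sd_between)        # stable subject-level value
            y0 = u + rng.gauss(0.0, sd_within)   # baseline measurement
            # the same time effect is added in both groups
            y1 = u + time_effect + rng.gauss(0.0, sd_within)
            rows.append((y0, y1))
        data[group] = rows
    return data


def interaction_contrast(data):
    """(KO change from baseline) minus (WT change from baseline)."""
    change = {g: mean(y1 - y0 for y0, y1 in rows) for g, rows in data.items()}
    return change["KO"] - change["WT"]


def ancova_adjusted_effect(data):
    """Group coefficient from y1 ~ y0 + group (common slope), in closed form."""
    sxy = sxx = 0.0
    means = {}
    for g, rows in data.items():
        m0 = mean(y0 for y0, _ in rows)
        m1 = mean(y1 for _, y1 in rows)
        means[g] = (m0, m1)
        sxy += sum((y0 - m0) * (y1 - m1) for y0, y1 in rows)
        sxx += sum((y0 - m0) ** 2 for y0, _ in rows)
    slope = sxy / sxx  # pooled within-group slope of y1 on y0
    gap0 = means["KO"][0] - means["WT"][0]
    gap1 = means["KO"][1] - means["WT"][1]
    return gap1 - slope * gap0


data = simulate()
print("treatment*time contrast:", round(interaction_contrast(data), 3))
print("ANCOVA-adjusted effect: ", round(ancova_adjusted_effect(data), 3))
```

Under these assumptions the treatment*time contrast should sit near zero, while the baseline-adjusted effect should sit near baseline_gap × (1 − slope), which is 1.5 in expectation for the default settings. The within-group slope is below 1 because of measurement noise, so adjusting for baseline removes only part of a pre-existing group difference and attributes the rest to treatment.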

Can someone point to any texts/tutorials on this? examples with R code?

I would have thought the motivation for drastic interventions like knocking out genes in experimental animals is to put bench science in the realm of Platt’s Strong Inference, where statistical inference is beside the point altogether.

One would think… but the field has gone in the opposite direction: a typical bench bio paper has hundreds of null-hypothesis statistical tests. Regardless, I’m really asking about estimating the effects (coefficients of the model or contrasts) with a CI (so, not looking for brms/Stan models, at least at this point).

Panel f in Fig. 3 of the Furuyama et al. (2019) paper you linked does not seem to show what you meant to illustrate. Wrong link? Posting a screenshot might help clarify your question.

Given these situations, isn’t a more likely rmANCOVA model y ~ baseline + treatment:baseline + time + treatment + treatment:time? A classic case where the model specified in point 2 leads to spurious results occurs with milk yields in dairy cows, where the slope associated with the covariate is not constant between primiparous and multiparous animals. Failure to accommodate this can easily lead to inversion of responses. It could also magnify treatment effects, depending on the slopes within each treatment. ANCOVA is a great tool, but so is a flamethrower.
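To make the unequal-slopes point concrete, here is a rough sketch with made-up numbers (not dairy data, and not from the linked paper). Fitting a separate line per group gives the same fitted means as including the treatment:baseline term. When the two slopes differ, the baseline-adjusted difference between groups depends on the baseline value at which it is evaluated and can change sign. The group labels, intercepts, slopes, and helper names below are all hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares (intercept, slope) for y ~ x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


def simulate_group(rng, n, intercept, slope, noise_sd=1.0):
    """Baseline x ~ N(10, 2); response is linear in x plus noise."""
    xs = [rng.gauss(10.0, 2.0) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def adjusted_difference(fits, baseline):
    """Treated minus control predicted response at a given baseline value."""
    (a_c, b_c), (a_t, b_t) = fits["control"], fits["treated"]
    return (a_t + b_t * baseline) - (a_c + b_c * baseline)


rng = random.Random(7)
fits = {
    # the true lines cross at baseline = 10 by construction
    "control": fit_line(*simulate_group(rng, 500, intercept=10.0, slope=0.3)),
    "treated": fit_line(*simulate_group(rng, 500, intercept=4.0, slope=0.9)),
}

for baseline in (6.0, 10.0, 14.0):
    print(baseline, round(adjusted_difference(fits, baseline), 2))
```

In this setup the adjusted difference should be negative at low baseline values and positive at high ones. A common-slope ANCOVA collapses that into a single number whose sign and size depend on where the baseline distributions happen to sit.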