I have a binomial outcome measured pre and post intervention (number of successes in a task; the number of trials is constant). I’d like to control for the baseline in my model.
It seems to me that using just the number of correct answers at baseline as a predictor is unnatural, as I can hardly expect linearity here. The more natural thing would be to transform the baseline measurement to log-odds and use that as a predictor, so that a coefficient of 1 (with zero intercept) would correspond to a perfect match between the baseline and post measurements on the log-odds scale.
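To make that concrete, here is roughly the model I have in mind, sketched with statsmodels (my choice of tool; the data and variable names are made up):

```python
import numpy as np
import statsmodels.api as sm

n_trials = 15
pre = np.array([3, 8, 12, 7, 14])    # baseline successes (toy numbers)
post = np.array([5, 9, 13, 6, 15])   # post-intervention successes (toy numbers)

# baseline on the log-odds scale -- breaks down for 0 or 15 successes, see below
baseline_logit = np.log(pre / (n_trials - pre))

# binomial GLM: response given as (successes, failures), logit link by default
endog = np.column_stack([post, n_trials - post])
exog = sm.add_constant(baseline_logit)
fit = sm.GLM(endog, exog, family=sm.families.Binomial()).fit()
print(fit.summary())
```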
This however leads to a problem with handling 0% / 100% success at baseline. A simple fix would be to replace those with 0.5 successes / 0.5 failures, so e.g. with 15 trials I’d treat zero successes as a 0.5/15 ≈ 0.0333 success rate. But that seems somewhat arbitrary and inelegant.
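Spelled out, the ad-hoc fix would be something like (continuing the toy example above):

```python
import numpy as np

n_trials = 15
pre = np.array([0, 8, 15, 7, 14])   # toy counts including the boundary cases

# only the boundary cases get nudged: 0/15 -> 0.5/15 (~0.0333), 15/15 -> 14.5/15
pre_adj = np.clip(pre, 0.5, n_trials - 0.5)
baseline_logit = np.log(pre_adj / (n_trials - pre_adj))
```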
A more elegant way could be to handle this as a measurement error / Berkson-style problem, i.e. treat the true log-odds at baseline as unknown but informed by the baseline measurement. That seems almost equivalent to having a fixed effect for each subject, which feels less than great. Notably, similar measurement-error considerations should apply to any model controlling for a baseline, and I haven’t seen this handled with a measurement error model anywhere, so I assume there is a reason not to do that?
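For completeness, roughly what I have in mind, sketched in PyMC (again my choice of tool; the priors and names are placeholders, not a worked-out model):

```python
import numpy as np
import pymc as pm

n_trials = 15
pre = np.array([0, 8, 15, 7, 14])    # toy baseline counts
post = np.array([5, 9, 13, 6, 15])   # toy post-intervention counts

with pm.Model():
    # latent "true" baseline log-odds, partially pooled across subjects
    mu = pm.Normal("mu", 0, 1.5)
    sigma = pm.HalfNormal("sigma", 1.5)
    theta = pm.Normal("theta", mu, sigma, shape=len(pre))

    # the baseline counts only inform theta through a binomial likelihood
    pm.Binomial("y_pre", n=n_trials, p=pm.math.invlogit(theta), observed=pre)

    # post counts regress on the latent baseline log-odds
    alpha = pm.Normal("alpha", 0, 1.5)
    beta = pm.Normal("beta", 0, 1)
    pm.Binomial("y_post", n=n_trials,
                p=pm.math.invlogit(alpha + beta * theta), observed=post)

    idata = pm.sample()
```

The partial pooling via mu/sigma is what keeps this from being literally a fixed effect per subject, but it still feels heavier than what people usually do when controlling for a baseline.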
Thanks for any suggestions/literature pointers. The closest thing I could find is that transforming baseline measurements to the model scale was considered beneficial in negative binomial models (https://doi.org/10.1002/bimj.201700103), but nothing about a logistic/binomial response.