A/B Testing for Information Gain

Johannes_Schwenke · April 2, 2026, 5:29pm

I’d appreciate some thoughts / input on the following:

Our institution is changing the software used for routine collection of quality-of-life data, which also changes the data collection schedule.

I would like to use this change in data collection to not just implement one schedule, but design a small RCT that evaluates different schedules. E.g., if we invite patients to complete a questionnaire every 2, 3, or 6 weeks, which schedule is optimal?

Ideally, we would find a schedule that maximizes the information and minimizes the patient burden. I haven’t found any literature on this. The data will inter alia be used for trials nested in the routine data collection. I worry for example that if we ask patients very frequently, the serial correlation will increase and the effective sample size decrease, while patient burden goes up.
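To make the worry about serial correlation concrete, here is a minimal sketch of the effective sample size per patient for an equally weighted mean of the repeated measurements. It assumes an AR(1)-type correlation that decays as `rho_per_week ** lag`, a 12-week follow-up, and a value of 0.9 per week; these numbers and the function name `ess_per_patient` are made up for illustration, not taken from any data.

```python
import numpy as np

def ess_per_patient(spacing_weeks, followup_weeks=12, rho_per_week=0.9):
    """Effective number of independent measurements for a simple mean.

    Assumption: correlation between two measurements `lag` weeks apart
    is rho_per_week ** lag (hypothetical AR(1)-type structure).
    """
    times = np.arange(0, followup_weeks + 1, spacing_weeks)
    lags = np.abs(times[:, None] - times[None, :])
    corr = rho_per_week ** lags
    m = len(times)
    # Var(mean) = sigma^2 * sum(corr) / m^2, so ESS = m^2 / sum(corr)
    return m, m**2 / corr.sum()

for spacing in (2, 3, 6):
    m, ess = ess_per_patient(spacing)
    print(f"every {spacing} weeks: {m} questionnaires, ESS per patient = {ess:.2f}")
```

Under these assumptions, the number of questionnaires (a crude proxy for burden) can be compared directly with the ESS they buy.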

I’ve seen that @f2harrell has done something similar with the original VIOLET data here, in the context of Markov ordinal transition models. That is, using the variance of the treatment parameter to compare, for example, an analysis of all available days with one that uses fewer days. I think the effective sample size and relative efficiency are intuitive.

However, in our case we don’t have a treatment, just different spacings of questionnaires. So I thought about

1. using a baseline variable that we are very sure influences the patient-reported outcomes and using the variance of its coefficient to calculate ESS and relative efficiency (one model per arm, i.e., one model per schedule).
2. fitting one model per arm without a treatment indicator, then using the fitted parameters to simulate new data and introducing a treatment parameter with a known effect (e.g., log(0.8)) in the simulated data. Fit a model with a treatment indicator on these simulated datasets → calculate the variance ratio and ESS that way over many simulations.
3. thinking of something else entirely / simpler?
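The simulation idea (simulate with a known treatment effect, re-estimate, compare variances across schedules) could be sketched roughly as follows. This is a deliberately simplified stand-in: it uses a continuous outcome with AR(1)-type correlation and a difference in patient means instead of an ordinal Markov model, and every number (`rho_per_week`, `effect`, `n_patients`, 12 weeks of follow-up) is a hypothetical placeholder that would be replaced by parameters fitted per arm.

```python
import numpy as np

def simulate_effect_estimates(spacing_weeks, n_patients=200, followup_weeks=12,
                              rho_per_week=0.8, effect=-0.2, n_sims=500, seed=0):
    """Simulated treatment-effect estimates for one questionnaire schedule.

    Assumptions (all hypothetical): unit-variance continuous outcome,
    correlation rho_per_week ** lag, constant mean shift `effect`.
    """
    rng = np.random.default_rng(seed)
    times = np.arange(spacing_weeks, followup_weeks + 1, spacing_weeks)
    lags = np.abs(times[:, None] - times[None, :])
    chol = np.linalg.cholesky(rho_per_week ** lags)
    treat = np.arange(n_patients) % 2  # alternate 0/1 for a balanced design
    estimates = np.empty(n_sims)
    for i in range(n_sims):
        # Correlated noise per patient, plus the known treatment effect
        noise = rng.standard_normal((n_patients, len(times))) @ chol.T
        y = noise + effect * treat[:, None]
        patient_means = y.mean(axis=1)
        estimates[i] = (patient_means[treat == 1].mean()
                        - patient_means[treat == 0].mean())
    return estimates

variances = {s: simulate_effect_estimates(s).var(ddof=1) for s in (2, 3, 6)}
reference = variances[6]
for spacing, var in variances.items():
    print(f"every {spacing} weeks: var = {var:.5f}, "
          f"relative efficiency vs 6-weekly = {reference / var:.2f}")
```

The relative efficiency here is the ratio of the variances of the estimated effect; multiplying it by the actual sample size gives an ESS-style number. For the real application, the estimation step would be replaced by the model actually intended for the nested trials.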

A further challenge is of course that there are multiple questions (~10): one on quality of life, one on nausea, one on sleep, etc. I am not sure whether it’s best to fit a model for each of these and compute 10 ESS / efficiency ratios, or to just compute a sum score.

I’ve not yet started thinking about how to combine the resulting efficiency gain with a model for patient burden and then make a final decision… Very new territory for me. Any tips would be appreciated very much. Thanks!

f2harrell · April 2, 2026, 8:55pm

If the response variable is semi-continuous and roughly normally distributed, you could fit an unstructured covariance matrix and try to fit several correlation structures to it, then use the fitted structure to estimate efficiencies of certain contrasts. This paper may help: http://dx.doi.org/10.1198/tast.2009.08196
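As a rough illustration of the last step only (not the covariance fitting itself), the sketch below takes a correlation structure as given and computes the relative efficiency of sparser schedules for the simplest contrast, a constant mean level estimated by generalized least squares. The structures, the value 0.8, the weekly 12-week grid, and the helper names are all assumptions for illustration; in practice the structure and its parameters would come from the fitted model.

```python
import numpy as np

def ar1(times, rho):
    """AR(1)-type correlation: rho ** |time difference|."""
    lags = np.abs(np.subtract.outer(times, times))
    return rho ** lags

def compound_symmetry(times, rho):
    """Equal correlation rho between any two time points."""
    m = len(times)
    return np.full((m, m), rho) + (1 - rho) * np.eye(m)

def gls_information(corr):
    """Information 1' R^{-1} 1 for a constant mean (inverse of the GLS variance)."""
    ones = np.ones(corr.shape[0])
    return ones @ np.linalg.solve(corr, ones)

weeks = np.arange(1, 13)  # hypothetical weekly grid as the reference schedule
for name, structure in (("AR(1)", ar1), ("compound symmetry", compound_symmetry)):
    full = gls_information(structure(weeks, 0.8))
    for spacing in (2, 3, 6):
        kept = weeks[weeks % spacing == 0]
        rel = gls_information(structure(kept, 0.8)) / full
        print(f"{name}, every {spacing} weeks: relative efficiency = {rel:.2f}")
```

Other contrasts (e.g., a slope or a difference between two time points) would need the corresponding design matrix in place of the vector of ones.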