Good evening, datamethods world! I’ve been twisting myself in knots over this one and could use some advice to straighten me out. Sometimes when you sit with a problem for too long you just need another set of eyes looking at your work.
Background/setup: A two-arm, 1:1 randomized trial of 200 individuals, with outcome measurements planned for 30 days and 90 days. The goal is to understand the treatment effect, defined as a difference in mean outcome between groups at 30 days and at 90 days.
The challenge: In reality, there is a good deal of variation in the actual follow-up times. What was planned for 30 days ended up ranging between 15 and 60 days. What was planned for 90 days ended up ranging between 60 and 120 days.
Standard (?) approach: Usually what I see people do is trim out the observations that are “too far” outside the visit window and just treat the time points as visit numbers (i.e., follow-up visit 1 vs. follow-up visit 2).
Possible (?) alternative: To avoid grouping people into buckets that don’t adequately capture their true time since randomization, one possibility seems to be an adjusted GEE model with an indicator for treatment, a restricted cubic spline on time, and a spline-by-treatment interaction. From this model I would, in theory, be able to characterize the treatment effect across continuous time, with the most precision at time points where people tended to visit. The only purpose of this model, though, is to form point and interval estimates at 30 and 90 days, as pre-specified. At first blush, this seems to be a reasonable way to deal with variable follow-up times.
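To make the proposal concrete, here is a sketch of the mean model I have in mind. The notation is my own shorthand, not from any protocol: A_i is the treatment indicator for subject i, t_ij is the actual day of visit j, X_i holds the adjustment covariates, and B_1, ..., B_K are restricted cubic spline basis functions of time with knots still to be chosen.

```latex
\begin{align*}
E[Y_{ij} \mid A_i, t_{ij}, X_i]
  &= \beta_0 + \beta_1 A_i
   + \sum_{k=1}^{K} \gamma_k B_k(t_{ij})
   + \sum_{k=1}^{K} \delta_k A_i B_k(t_{ij})
   + X_i^\top \alpha \\
\Delta(t)
  &= E[Y \mid A = 1, t, X] - E[Y \mid A = 0, t, X]
   = \beta_1 + \sum_{k=1}^{K} \delta_k B_k(t)
\end{align*}
```

The pre-specified estimands would then be Δ(30) and Δ(90), each a linear combination of the fitted coefficients, with interval estimates from the robust (sandwich) covariance.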
Question: On reflection, I see some possible cause for concern. Does the proposed alternative compromise randomization? My worry is that this formulation may require follow-up time to be non-informative, an assumption that is not testable in practice. If the kind of subject who follows up at 25 days is just fundamentally different from the kind of subject who follows up at 60 days, it feels as though this ends up creating more problems than it solves. If the time of follow-up is not random, then it seems as if we are estimating the (adjusted) difference in mean outcome between groups among folks who happen to follow up at 30 days. Because visit timing is determined after randomization, conditioning on it may induce systematic differences between the two randomization groups being compared. If the follow-up times are truly random, then my sense is that there’s no cause for concern.
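To check that I am not imagining the problem, here is a toy simulation of the worry. Everything in it is an assumption made up for illustration: a latent prognosis `u` drives the outcome, there is no treatment effect at all, control visits are uniformly spread over days 15 to 60, and in the "informative" scenario treated subjects with worse prognosis tend to show up earlier. It uses a crude 25-to-35-day window rather than the spline model, only to show what conditioning on visit time can do.

```python
import random
import statistics


def simulate_arm(n, informative, rng):
    """Simulate (visit_day, outcome) pairs for one arm; the true treatment effect is zero."""
    rows = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)      # latent prognosis (hypothetical)
        y = u + rng.gauss(0.0, 1.0)  # outcome depends on prognosis only, not on arm or time
        if informative:
            # Assumed mechanism: higher u (worse prognosis) -> earlier visit
            day = 37.5 - 8.0 * u + rng.gauss(0.0, 5.0)
        else:
            # Visit day unrelated to prognosis
            day = rng.uniform(15.0, 60.0)
        rows.append((day, y))
    return rows


def mean_near_day_30(rows, lo=25.0, hi=35.0):
    """Mean outcome among subjects whose visit fell close to day 30."""
    return statistics.fmean([y for day, y in rows if lo <= day <= hi])


def window_difference(treated_informative, n=50_000, seed=1):
    """Treated-minus-control difference in mean outcome among visits near day 30."""
    rng = random.Random(seed)
    control = simulate_arm(n, informative=False, rng=rng)
    treated = simulate_arm(n, informative=treated_informative, rng=rng)
    return mean_near_day_30(treated) - mean_near_day_30(control)


diff_random_timing = window_difference(treated_informative=False)
diff_informative_timing = window_difference(treated_informative=True)

print(f"random visit timing:      {diff_random_timing:+.3f}")
print(f"informative visit timing: {diff_informative_timing:+.3f}")
```

With random timing the difference near day 30 sits close to zero, as it should. With informative timing in the treated arm it is clearly positive even though treatment does nothing, because the treated subjects seen near day 30 are a selected subset. Whether real visit patterns look anything like this is exactly what I can’t verify.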
Do you see the tension? What are your thoughts? Am I worried about nothing? Should I abandon this idea, delete this post, and never make mention of it again?
Looking forward to discussing.