Hi, I just learned about interrupted time series (ITS) and I’m wondering how one would control for confounders.
Say I have a dataset of individual patients and I want to compare their monthly rates of getting a certain lab test before and after a policy change.
Questions:
I’d first aggregate the patients to the population level by computing the overall rate per month, but how does one then control for covariates such as age, sex, or weight? Tutorials never seem to mention adjusting for covariates the way one would in a linear or logistic regression.
Am I correct in thinking that ITS can only be used on aggregated data, and not on individual-level data directly?
One thing I’ve read is to compare the covariates before and after the interruption to see whether they change. What does that mean in this case? Do I look at the covariates of the individual patients who contributed to the pre and post periods, or do I aggregate the covariates as I do with the outcome?
What if I wanted to see if the slopes varied based on subgroups (such as insurance type of the patient)? Would I simply rerun the model for each subgroup separately, or can I add an interaction term?
I’m interested to hear what others say, but in my opinion if you have individual level data there’s no reason to aggregate up just so it fits into the “interrupted time series” framework. The main feature of an ITS regression, namely the potential change in level and/or slope at a particular point in time, can be included in an individual level analysis. You can then easily adjust for individual level confounders, or allow for additional heterogeneity across subgroups of interest.
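To make that concrete, here is one way such an individual-level model could be written down. This is only a sketch under assumptions that are not stated above: a binary monthly outcome $Y_{it}$ for patient $i$ in month $t$, a logit link, linear trends, a policy change at month $T_0$, individual covariates $x_i$, and a binary subgroup indicator $g_i$ (all symbols are illustrative).

```latex
\begin{aligned}
D_t &= \mathbf{1}[t \ge T_0] \\
\operatorname{logit}\Pr(Y_{it} = 1)
  &= \beta_0 + \beta_1 t + \beta_2 D_t + \beta_3 (t - T_0) D_t
     + \gamma^\top x_i \\
  &\quad + g_i \bigl[ \delta_0 + \delta_1 t + \delta_2 D_t
     + \delta_3 (t - T_0) D_t \bigr]
\end{aligned}
```

Here $\beta_2$ is the change in level and $\beta_3$ the change in slope at the policy change, $\gamma$ adjusts for individual-level confounders, and the $\delta$ terms let the level and slopes differ by subgroup. Dropping the last line gives the model without subgroup heterogeneity.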
This got me thinking. In my example I am trying to look at the rate of patients getting a lab test per month. Since the data can stay at the individual level, would I simply label each month of observation for each patient with a 1 if they had a test that month and a 0 if they did not?
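A minimal sketch of that person-month layout, assuming each patient has a known eligibility window and a set of months in which a test occurred. The function name, the integer month coding, and the toy patients are all hypothetical, made up for illustration.

```python
from typing import Dict, List, Set, Tuple


def person_month_rows(
    eligibility: Dict[str, Tuple[int, int]],
    test_months: Dict[str, Set[int]],
) -> List[Tuple[str, int, int]]:
    """Expand patients into one row per eligible month.

    eligibility maps patient id -> (first_month, last_month), inclusive,
    with months coded as integers (e.g. months since study start).
    test_months maps patient id -> set of months with at least one test.
    Returns (patient_id, month, tested) rows, where tested is 1 or 0.
    """
    rows = []
    for pid, (first, last) in sorted(eligibility.items()):
        had_test = test_months.get(pid, set())
        for month in range(first, last + 1):
            rows.append((pid, month, 1 if month in had_test else 0))
    return rows


# Hypothetical toy input: two patients, months indexed from 0.
eligibility = {"A": (0, 3), "B": (2, 4)}
test_months = {"A": {1, 3}, "B": {4}}
rows = person_month_rows(eligibility, test_months)
for row in rows:
    print(row)
```

Only months in which the patient was actually eligible produce a row, so a 0 means "eligible but not tested" rather than "not observed".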
There are up to 3 things to parameterize in the model: a discontinuity at the intervention point, smooth before- and after-trends, and seasonal variation. An example analysis is found here.
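As a rough illustration of those three pieces, the sketch below builds the corresponding model terms for a single month. It makes simplifying assumptions that are not from the post above: the trends are linear rather than flexible smooth functions such as splines, seasonality is a single 12-month sine/cosine pair, and the function and column names are invented for the example.

```python
import math
from typing import Dict


def its_terms(month: int, change_month: int, period: int = 12) -> Dict[str, float]:
    """Return ITS design terms for one month (months coded as integers)."""
    post = 1.0 if month >= change_month else 0.0
    angle = 2.0 * math.pi * month / period
    return {
        "time": float(month),                        # underlying trend
        "post": post,                                # discontinuity (level change)
        "time_since": post * (month - change_month), # change in slope afterwards
        "season_sin": math.sin(angle),               # seasonal variation
        "season_cos": math.cos(angle),
    }


# Hypothetical example: policy change at month 12.
print(its_terms(10, 12))
print(its_terms(15, 12))
```

These terms can be used as predictors whether the outcome is an aggregated monthly rate or an individual-level indicator.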
Without knowing too much about your particular study, that sounds reasonable to me. Each month you have people who are eligible to do this lab test, and you can model the probability that an individual gets the test that month as a function of time relative to the policy change (modeled using the components @f2harrell mentioned) plus other patient characteristics. Your model can include components to capture other aspects of your data as well, such as repeated measurements on individuals or clustering by testing location, etc.
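For what it's worth, here is a small sketch of fitting such a model. Everything in it is an assumption for illustration: the data are simulated (not from any real study), the covariates are a standardized age and a sex indicator, trends are linear, and the logistic regression is fitted with a hand-written Newton-Raphson loop in NumPy. It gives point estimates only and ignores the repeated measurements on each patient, so for real inference one would want something like GEE, a mixed model, or cluster-robust standard errors.

```python
import numpy as np


def fit_logistic(X: np.ndarray, y: np.ndarray, n_iter: int = 25) -> np.ndarray:
    """Fit a logistic regression by Newton-Raphson; returns coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)              # score vector
        hess = X.T @ (X * w[:, None])     # observed information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta


# Simulated person-month data (illustrative values only).
rng = np.random.default_rng(0)
n_patients, n_months, change_month = 400, 24, 12
month = np.tile(np.arange(n_months), n_patients)
age = np.repeat(rng.normal(0.0, 1.0, n_patients), n_months)   # standardized age
female = np.repeat(rng.integers(0, 2, n_patients), n_months)  # 0/1 indicator
post = (month >= change_month).astype(float)                  # level change
time_since = post * (month - change_month)                    # slope change

X = np.column_stack([np.ones(month.size), month, post, time_since, age, female])
true_beta = np.array([-1.5, 0.01, 0.8, 0.02, 0.3, -0.2])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

beta_hat = fit_logistic(X, y)
names = ["intercept", "month", "post", "time_since", "age", "female"]
for name, est in zip(names, beta_hat):
    print(f"{name:>10s}: {est: .3f}")
```

The coefficients on `post` and `time_since` are the adjusted level and slope changes on the log-odds scale, conditional on the patient covariates in the model.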
Can you do an interrupted time series or difference-in-differences (DiD) analysis for a binary outcome at the individual level? I would have thought not, since individuals who have the outcome in the pre-intervention period aren’t observed in the post-intervention period. This paper https://pubmed.ncbi.nlm.nih.gov/38147093/ uses DiD at the individual hospitalization level for outcomes that are essentially binary, but making within-subject comparisons pre vs. post-intervention doesn’t seem possible to me in this situation.
I think that binary outcomes that represent a “repair and reuse” process, i.e., cases where the outcome is not an absorbing or a permanently health-altering state, are OK to use.
Thanks for your response! So using a pure binary outcome (an outcome that is an absorbing or permanently health-altering state) is not possible for DiD at the individual level? I find the JAMA article I shared above confusing, since this seems to be what they did, yet JAMA published it anyway.