Can confounders be controlled for in an interrupted time series?

Hi, I just learned about interrupted time series (ITS) and I’m wondering how one would control for confounders.

Say I have a dataset of individual patients and I want to compare their monthly rates of getting a certain lab test before and after a policy change.


  1. Of course I’d first aggregate the patients to the population level by computing the overall rate per month, but how does one then control for covariates such as age, sex, or weight? Tutorials never seem to mention adjusting for covariates the way one would in a linear or logistic regression.

  2. Am I correct in thinking that ITS can only be used on aggregated data, and not directly on individual-level data?

  3. One thing I’ve read is that you can compare the covariates before and after the interruption to see whether they change. What does that mean in this case? Do I look at the covariates of the individual patients who contributed to the pre or post periods, or am I supposed to aggregate the covariates as I do with the outcome?

  4. What if I wanted to see whether the slopes vary by subgroup (such as the patient’s insurance type)? Would I simply rerun the model for each subgroup separately, or can I add an interaction term?

I’m interested to hear what others say, but in my opinion, if you have individual-level data there’s no reason to aggregate up just so it fits into the “interrupted time series” framework. The main feature of an ITS regression, namely the potential change in level and/or slope at a particular point in time, can be included in an individual-level analysis. You can then easily adjust for individual-level confounders, or allow for additional heterogeneity across subgroups of interest.
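To make that concrete, here is a minimal sketch in Python (NumPy only) of how the ITS terms could sit next to patient-level covariates and subgroup interactions in a single design matrix. Everything specific in it is an assumption for illustration: the `its_terms` helper, the `age` and `medicaid` columns, the change month, and the coefficient values. The outcome is noise-free synthetic data, used only to check that the coding recovers the level and slope changes.

```python
import numpy as np

def its_terms(month, change_month):
    """Return (time, post, months_since) for an array of month indices."""
    month = np.asarray(month, dtype=float)
    post = (month >= change_month).astype(float)   # 1 from the policy change onward: level shift
    months_since = post * (month - change_month)   # 0 before the change, then 0, 1, 2, ...: slope change
    return month, post, months_since

# Hypothetical person-month rows: 4 patients, 12 months each, policy change at month 6
month = np.tile(np.arange(12), 4)
age = np.repeat([40.0, 55.0, 62.0, 71.0], 12)      # patient-level covariate
medicaid = np.repeat([0.0, 1.0, 0.0, 1.0], 12)     # hypothetical subgroup flag

time, post, since = its_terms(month, change_month=6)
X = np.column_stack([
    np.ones_like(time),  # intercept
    time,                # pre-existing trend (shared by both subgroups in this sketch)
    post,                # level change at the policy date
    since,               # slope change after the policy date
    age,                 # individual-level confounder
    medicaid,            # subgroup main effect
    medicaid * post,     # subgroup-specific level change
    medicaid * since,    # subgroup-specific slope change
])

# Made-up coefficients and a noise-free outcome, just to verify the coding
true_beta = np.array([0.10, 0.01, 0.20, 0.02, 0.001, 0.05, -0.10, 0.01])
y = X @ true_beta
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The interaction columns answer the subgroup question inside one model rather than by refitting per subgroup. With a binary outcome you would normally swap the least-squares fit for a logistic regression on the same columns.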


Hi, thank you for your reply! I found this paper (Time After Time: Difference-in-Differences and Interrupted Time Series Models in SAS) that supports what you say about using individual-level data. It does seem this is possible with interrupted time series models, and aggregation isn’t necessary!

This got me thinking: in my example I am trying to look at the rate of patients getting a lab test per month. Since the data can be at the individual level, would I simply label each month of observation for each patient with a 1 if they had a test that month and a 0 if they did not?
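A rough sketch of that person-month layout in plain Python, in case it helps to see it written out. The `person_months` helper, the patient IDs, and the month indices are hypothetical; the only idea taken from the question is one row per patient per observed month with a 0/1 outcome.

```python
def person_months(patient_id, first_month, last_month, test_months):
    """Expand one patient's observation window into one row per month.

    `tested` is 1 if the patient had at least one test in that month, else 0.
    """
    tested = set(test_months)
    return [
        {"patient_id": patient_id, "month": m, "tested": int(m in tested)}
        for m in range(first_month, last_month + 1)
    ]

# Hypothetical patients: A is observed for months 0-5, B only for months 2-5
rows = (
    person_months("A", 0, 5, test_months=[1, 4])
    + person_months("B", 2, 5, test_months=[])
)
```

Months in which a patient was not eligible or not under observation are simply left out, so each patient contributes only the months they could actually have been tested.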

There are up to 3 things to parameterize in the model: a discontinuity at the intervention point, smooth before- and after-trends, and seasonal variation. An example analysis is found here.
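As a simplified illustration of those three pieces, the sketch below builds one design matrix with a jump at the intervention, linear trends before and after it, and a single annual sine/cosine pair for seasonality. The linear trends and the single harmonic are assumptions made to keep the example short (smoother trends, for instance splines, would need more columns), and the helper name, change month, and coefficients are made up.

```python
import numpy as np

def its_seasonal_design(month, change_month, period=12):
    """Columns: intercept, trend, level change, slope change, sin and cos seasonal terms."""
    month = np.asarray(month, dtype=float)
    post = (month >= change_month).astype(float)   # discontinuity at the intervention
    since = post * (month - change_month)          # extra slope after the intervention
    angle = 2.0 * np.pi * month / period           # one cycle per `period` months
    return np.column_stack([
        np.ones_like(month), month, post, since, np.sin(angle), np.cos(angle),
    ])

# Hypothetical series: 48 months with the intervention at month 24
month = np.arange(48)
X = its_seasonal_design(month, change_month=24)

# Made-up coefficients and a noise-free outcome, only to check the parameterization
true_beta = np.array([0.30, 0.002, 0.10, 0.004, 0.05, -0.03])
y = X @ true_beta
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The same columns can be used for monthly aggregates or repeated on every person-month row of an individual-level dataset.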

Without knowing too much about your particular study, that sounds reasonable to me. Each month you have people who are eligible to do this lab test, and you can model the probability that an individual gets the test that month as a function of time relative to the policy change (modeled using the components @f2harrell mentioned) plus other patient characteristics. Your model can include components to capture other aspects of your data as well, such as repeated measurements on individuals or clustering by testing location.
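Here is a hedged sketch of such a model: a logistic regression on simulated person-month data, fitted with a small hand-written Newton-Raphson routine so that it only needs NumPy. The sample sizes, the standardized `age_std` covariate, and the coefficient values are all invented. The simulation has no patient-level random effect, so the fit deliberately ignores within-patient correlation; with real repeated measures you would more likely use a GEE or mixed-effects logistic model (statsmodels, for example, provides a `GEE` class).

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, tol=1e-8):
    """Fit a logistic regression by Newton-Raphson and return the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted probabilities
        w = p * (1.0 - p)                       # Bernoulli variance weights
        grad = X.T @ (y - p)                    # score vector
        hess = X.T @ (X * w[:, None])           # observed information matrix
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

rng = np.random.default_rng(0)
n_patients, n_months, change_month = 2000, 24, 12

# Time is measured relative to the policy change: -12, ..., -1, 0, ..., 11
month = (np.tile(np.arange(n_months), n_patients) - change_month).astype(float)
post = (month >= 0).astype(float)     # level change
since = month * post                  # slope change
age_std = np.repeat(rng.normal(0.0, 1.0, n_patients), n_months)  # patient-level covariate

X = np.column_stack([np.ones_like(month), month, post, since, age_std])
true_beta = np.array([-1.0, 0.02, 1.0, 0.03, 0.3])   # made-up log-odds coefficients

# Simulate whether each patient is tested in each month
prob = 1.0 / (1.0 + np.exp(-(X @ true_beta)))
tested = rng.binomial(1, prob).astype(float)

beta_hat = fit_logistic(X, tested)
# beta_hat[2] estimates the log-odds jump at the policy change, adjusted for age
```

Because time is centered at the policy date, the coefficient on `post` is read directly as the change in log-odds of testing at the moment of the change.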