I am a PhD student working with population-based healthcare data. The main analysis for my thesis is an interrupted time series analysis (ITSA) using segmented (or piecewise) regression. The interruption is the date when COVID-19-related physical distancing measures were implemented in hospitals, and I am looking at various healthcare-related outcomes. Typically, I will have patient demographics and clinical characteristics that I need to adjust for in the model.
I recently took a course on single-group ITSA and ITSA with a control group, but I am having a hard time coding an ITSA with multiple covariates (much of this struggle is because I am relatively new to R). As an example, I am sharing an excerpt of a SEER-based dataset that I have been practicing on. In this dataset, age, race, and income are the covariates, and tumor size is the outcome. The time variables are month and year, running from April 2008 to March 2010, i.e., 24 months. For this example, let’s suppose the interruption occurs in March 2009.
I would greatly appreciate any help with the coding, or any other suggestions on how to approach this problem. Here is the Google Docs link to the practice dataset.
I am interested in interrupted time series models and am learning them too. I understand your study question to some extent, but I would like to understand your study design. Are you planning to do an ITS with a control group? If so, I think you could use a tool like propensity scores to match the two groups, study and control, on the variables that you want to adjust for. The ITS model then does not need any covariates: the two groups, matched on covariates, are compared using ITS to see whether their slopes differ before and after the intervention.
This is what I think can be done. Would welcome ideas from other researchers.
Professor Rob Hyndman has excellent resources on time series in R.
He also has a book which is available free on his webpage.
All the best,
Thank you so much for taking your time to comment! That’s a great idea, but, unfortunately, I do not have a control group.
The actual study is a retrospective population-based cohort study (using linked data) of patients with cancer. All patients will be followed from diagnosis to death, i.e., it will be a decedent cohort. The start date of the COVID-19-related physical distancing measures will be treated as the “interruption”, and various outcomes will be compared pre- and post-interruption.
My doctoral committee has suggested an ITS using segmented regression. I recently took a course on ITS by Professor Calvin Law (UBC), and it was great! However, it did not cover the situation where you have covariates.
I was wondering whether it would be possible to first run a conventional adjusted linear regression and then use its output in the ITS analysis. Do you think that makes any sense?
It is important to embed the ITS analysis into the overall statistical model rather than using an ad hoc two-stage procedure. See the example here, where an interrupted time series is modeled in a way that also includes long-term and seasonal trends.
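To make the single-model idea concrete, here is a minimal sketch of a segmented regression in which the covariate sits in the same design matrix as the ITS terms (pre-interruption trend, level change, slope change). It is only an illustration under stated assumptions: the data are synthetic and noise-free, the month coding (April 2008 = 1, so March 2009 = 12) and all variable names are made up, only one numeric covariate (age) is included, and it ignores autocorrelation and seasonality. Categorical covariates such as race or income bands would enter as dummy columns in the same way. The sketch uses plain Python to stay dependency-free; in R the same structure would correspond to something like `lm(tumor_size ~ time + post + time_after + age + race + income, data = df)`, with `time`, `post`, and `time_after` being hypothetical column names you would create yourself.

```python
# Segmented regression with a patient-level covariate, fitted by ordinary
# least squares. All data below are synthetic and purely illustrative.

INTERRUPTION = 12  # assumption: April 2008 = month 1, so March 2009 = month 12


def design_row(month, age):
    """Build one row: intercept, trend, level change, slope change, covariate."""
    post = 1.0 if month >= INTERRUPTION else 0.0
    months_after = float(max(0, month - INTERRUPTION))
    return [1.0, float(month), post, months_after, float(age)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic cohort: 5 patients diagnosed in each of the 24 months,
# with deterministic made-up ages that vary within each month.
TRUE_BETA = [30.0, 0.10, 2.0, 0.30, 0.05]  # made-up coefficients
months = [mo for mo in range(1, 25) for _ in range(5)]
patients = [(mo, 50 + (7 * i) % 23) for i, mo in enumerate(months)]
rows = [design_row(mo, age) for mo, age in patients]
# Noise-free outcome, so the fit should recover TRUE_BETA up to rounding.
tumor_size = [sum(b * x for b, x in zip(TRUE_BETA, r)) for r in rows]

beta = ols(rows, tumor_size)
names = ["intercept", "pre_trend", "level_change", "slope_change", "age"]
for name, value in zip(names, beta):
    print(f"{name}: {value:.3f}")
```

The `level_change` and `slope_change` coefficients are the ITS effects of interest, and because `age` is estimated in the same fit, they are adjusted for it directly rather than through a separate first-stage regression.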