Interim Analysis Plan for Observational Studies

Our team is in the process of writing a protocol for an observational, longitudinal study of early diabetes. Our primary goal is to examine the association of an exposure with longitudinal hemoglobin A1c (HbA1c) over the course of two years of followup. HbA1c will be measured at 6 followup times, and we are enrolling, to the extent possible, equal numbers of exposed and unexposed. The estimand will be the difference between longitudinal trajectories and will include an exposure by (flexible function of) time interaction. In RCTs an interim analysis plan is standard, and I am seeking feedback regarding the utility of an interim analysis plan for this observational study. Some things to note:

  1. There is no intervention and no randomization to the exposure. To estimate the association we will use longitudinal regression models while adjusting for potential confounders.
  2. Participant followup is 2 years and the funding period is 4 years. This means that by the time we might observe a strong association, we effectively will have enrolled the entire cohort.
  3. Because of 1) and 2), I do not think it is realistic to expect that we would terminate the study or modify the design based on an observed association.

With all that said, I would be interested in feedback regarding the potential utility of an interim analysis plan for this type of study. Are there compelling reasons to have an interim analysis plan (e.g., quality control, precision analysis, other?)?

Thanks for any feedback.



An initial question:

What is the intent of the study?

Is it for publication (e.g. peer reviewed manuscript, poster, podium presentation), or for a Real World Evidence (RWE) regulatory submission, or for another intention?

I ask because, in some cases, depending upon the target journal/meeting in the first case, they may want to see a more detailed SAP (separate from the statistical methods defined in the protocol), including planned interim analyses and adjustments for multiple testing, missing data handling, etc. even for an observational study. There are some that do not require such detail beyond what is in the protocol.

In the second case, you would almost certainly need a separate, detailed SAP for an RWE regulatory submission, including the details of planned interim analyses.

That all being said, generally speaking, there is great debate about the need for a pre-defined, stand-alone, detailed SAP for observational studies that includes the details of planned interim analyses, as compared to the expectations for an RCT, and people come down on both sides of the issue.

In the case of a formally powered RCT, one needs to pre-specify key analyses and methods for defined endpoints, and in the case of interim analyses, how you are going to handle adjustments for multiple testing and the pre-specification of interim stopping rules, if any. Your a priori power/sample size calculations need to take the timing and number of interim analyses into account, since the sample size will be inflated at some level to account for the sequential testing method intended, and you need to deal with the adjustments to the relevant alpha decision thresholds along the way, including for the final analysis.
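To make the alpha adjustment concrete, here is a minimal sketch, assuming Lan-DeMets alpha spending functions (O'Brien-Fleming-type and Pocock-type), which is only one of several sequential testing approaches and is not something specified above. The function names and the look times are hypothetical. It shows only the cumulative alpha spent by each look; converting that into actual stopping boundaries and an inflated sample size requires numerical integration that dedicated group sequential software handles.

```python
from math import e, log, sqrt
from statistics import NormalDist
from typing import Callable, List, Sequence, Tuple

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def _check_fraction(info_frac: float) -> None:
    if not 0.0 < info_frac <= 1.0:
        raise ValueError("information fraction must be in (0, 1]")


def obf_type_spending(info_frac: float, alpha: float = 0.05) -> float:
    """Cumulative two-sided alpha spent, O'Brien-Fleming-type (Lan-DeMets)."""
    _check_fraction(info_frac)
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * (1.0 - _STD_NORMAL.cdf(z / sqrt(info_frac)))


def pocock_type_spending(info_frac: float, alpha: float = 0.05) -> float:
    """Cumulative alpha spent, Pocock-type (Lan-DeMets)."""
    _check_fraction(info_frac)
    return alpha * log(1.0 + (e - 1.0) * info_frac)


def spending_schedule(
    info_fracs: Sequence[float],
    spend: Callable[[float, float], float],
    alpha: float = 0.05,
) -> List[Tuple[float, float, float]]:
    """Return (information fraction, cumulative alpha, incremental alpha) per look."""
    rows = []
    previous = 0.0
    for t in info_fracs:
        cumulative = spend(t, alpha)
        rows.append((t, cumulative, cumulative - previous))
        previous = cumulative
    return rows


if __name__ == "__main__":
    looks = [0.5, 0.75, 1.0]  # hypothetical look times, for illustration only
    for name, fn in [("OBF-type", obf_type_spending), ("Pocock-type", pocock_type_spending)]:
        print(name)
        for t, cum, inc in spending_schedule(looks, fn):
            print(f"  t={t:.2f}  cumulative={cum:.4f}  incremental={inc:.4f}")
```

Both functions spend the full alpha by the final analysis. The O'Brien-Fleming-type function spends very little at early looks, so the final threshold stays close to the unadjusted one, while the Pocock-type function spends alpha more evenly across looks.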

In many cases, the content of the statistical methods section of the observational study protocol can suffice to provide the details of analyses that are consistent with your primary/secondary goals for the study, etc. Beyond those, it is common for the protocol to include language that refers to “preliminary, exploratory analyses” that may be performed from “time to time” while the study is underway (e.g. interim analyses), to support publications, etc., along with other relevant details as appropriate.

One key is to clearly note that there will be no decisions made to stop or modify the study based upon the interim analyses, albeit, there may be other motivations to do so (e.g. data quality, new external information, safety, etc.).

Another key is to understand and describe the relevant limitations of the study as designed, including the interpretation of the interim and inter-group comparative analyses as they may be presented/published. Since this is not an RCT, as you note, the impacts of varying confounding factors (collected and uncollected), including patient selection bias, are important to consider.

For general comments, since you will be conducting longitudinal analyses with serial measurements at varying timepoints, consider using time as a continuous variable as opposed to a categorical one, as Frank has noted on several occasions here, due to the variability of the exact timing of the follow up contacts around your intended contact windows. You can then generate model based estimates of your parameters of interest at clinically relevant, fixed time points, as needed. Also consider how you are going to deal with the inevitable decline in sample size over time, especially if it differs between the groups, as you will lose patients through loss to follow up, early discontinuation and so forth, leading to unbalanced data. Mixed effects models can be helpful here, and there are some threads in the forum that cover some of the considerations for their use.
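As a rough illustration of treating time as continuous and then reading off estimates at fixed time points, here is a minimal sketch. It assumes a linear spline in time (restricted cubic splines are a common alternative), omits confounders and random effects, and takes the fixed-effect coefficients as given, as if they came from an already fitted model. The knots, the coefficient values and all function names are hypothetical and are not estimates from any study.

```python
from typing import List, Sequence


def time_basis(t_months: float, knots: Sequence[float]) -> List[float]:
    """Linear spline basis: t itself plus one hinge term max(t - k, 0) per knot."""
    return [t_months] + [max(t_months - k, 0.0) for k in knots]


def design_row(exposed: int, t_months: float, knots: Sequence[float]) -> List[float]:
    """Fixed-effects row: intercept, exposure, time basis, exposure-by-time basis."""
    basis = time_basis(t_months, knots)
    return [1.0, float(exposed)] + basis + [exposed * b for b in basis]


def exposure_contrast(
    t_months: float, knots: Sequence[float], coefs: Sequence[float]
) -> float:
    """Model-based exposed-minus-unexposed difference in mean outcome at time t."""
    row_exposed = design_row(1, t_months, knots)
    row_unexposed = design_row(0, t_months, knots)
    if len(coefs) != len(row_exposed):
        raise ValueError("coefficient vector does not match the design row length")
    return sum(c * (a - b) for c, a, b in zip(coefs, row_exposed, row_unexposed))


if __name__ == "__main__":
    knots = [6.0, 12.0]  # hypothetical knot locations, in months
    # Hypothetical coefficients, in design_row order:
    # intercept, exposure, t, hinge(6), hinge(12), exp*t, exp*hinge(6), exp*hinge(12)
    coefs = [6.5, 0.2, -0.01, 0.0, 0.0, 0.03, -0.01, 0.0]
    for month in (0, 6, 12, 24):
        diff = exposure_contrast(month, knots, coefs)
        print(f"month {month:>2}: exposed - unexposed = {diff:.2f}")
```

Because the contrast is computed from the model rather than from visit-specific means, it does not matter that participants attend at slightly different times; the comparison can be made at any time point of clinical interest.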


Thank you. A few things to note in response to your thoughtful response.

  1. The intent of the study is to examine the association of a recent COVID-19 infection with the early course of type II (and, separately, type I) diabetes. This is for research that will eventually be published in peer reviewed journals. No regulatory submissions.

  2. In addition to the statistical analysis section in the protocol, we will also provide a more detailed SAP that will describe all pre-specified primary and secondary analyses, as well as how we will handle other analytical challenges, e.g. incomplete followup, missing confounder data, etc.

  3. We will not be stopping the study early or modifying the design based on any interim analyses because the main study analyses are pre-specified.

  4. I believe we will conduct other analyses of these data that will address tertiary and quaternary questions. For example, I assume we will write some “interim” papers whose analyses estimate other associations just using baseline data (as questions from the study team arise). These analyses will also be pre-specified, but have not yet been identified.

  5. Indeed we will use a flexible function of continuous time since participants will not show up to visits at precisely the scheduled follow-up time.

Related to 4), my thinking has been that an Interim Analysis Plan section is one that details early looks at primary study analyses that inform early stoppage and/or adaptations to the study design. Am I being too restrictive in my thinking? What other analyses might be detailed in an Interim Analysis section of a protocol or SAP?



Thanks for the follow up information.

A few additional comments to consider:

  1. With respect to any content in the protocol itself pertaining to interim or other exploratory analyses, be cautious as to how much, if any, detail you pre-specify there, as you want to avoid a situation where a protocol amendment is needed because something changes that affects the nature and scope of those analyses once the study is underway. Protocol amendments have time impacts, IRB requirements, cost impacts and implementation impacts, especially if this might be a multi-site study. Given those considerations, you want to minimize the need to go down that path unless there are core changes to the study design itself at some point, and not changes to analyses that today, you think you might do at some point but which, in reality, may become problematic due to unexpected circumstances.

  2. With respect to a stand alone SAP, I would also be cautious there about details beyond your primary and secondary endpoints, ITT/mITT cohort criteria, safety, and perhaps some key exploratory analyses, if you know what those will be now.

  3. In the context of your question about 4, the notions of interim analyses and exploratory analyses can be interchangeable in the setting of an observational study. As you note, you are not going to modify or stop the study based upon those analyses, but you will conduct analyses while the study is underway. Those analyses are thus exploratory in nature and interim in timing, and may or may not depend upon having specific proportions of patients reach a common study timepoint. At the same time, once the study is closed, you will also conduct other analyses, planned or as yet unplanned, which would be exploratory but no longer interim. There are studies that I am involved with where we are still going back, on a post hoc basis, to analyze data years after the study actually closed. Those analyses would not have been anticipated previously, and thus would not be in the protocol, nor in an SAP if one was created.

  4. I would consider providing a high level framework for the conduct of any exploratory analyses that will occur from time to time during and after the study. It should be flexible enough to cover future analyses that are almost certain to occur over the life of your study, but which are not yet known and may only become known once you actually have data to review, initial results in hand, and new questions arising as a result. Thus, focus on describing the likely methods that might be used (univariate, bivariate, Cox regression, Kaplan-Meier, mixed effects models, logistic regression, etc.), alpha levels, confidence intervals and related details, but don’t go down the rabbit hole of trying to exhaustively pre-specify every possible analysis that you can envision, and don’t lock yourself in to a workflow that can be overly constraining.

  5. I would avoid enabling the situation where the SAP becomes a “living document” once you have the initial version in place. That is, where every time somebody raises the need for a new analysis, perhaps motivated by reviews of the existing data, the desire to create a new publication/poster, new external information, etc., there is a desire to go back and update the SAP with that newly requested content and the associated detail. Depending upon your funding source, and how any time for those tasks will be paid for out of your operating budget, constantly going back to update the SAP can become an expensive proposition for time, resources and costs over a multi-year time frame, where those funds are better allocated to other tasks.
