Does excluding early incident cases help eliminate reverse causality?

In epidemiological cohort studies, it is common to exclude individuals who develop the outcome disease early in follow-up.

For example, in a cohort study examining the association between migraines and the risk of stroke over the next decade, many researchers prefer to exclude individuals who develop stroke within the first one or two years after baseline. This is because early-onset cases may have developed migraines as a result of a stroke that was already developing (stroke → migraine), while the study aims to explore the relationship between migraines and subsequent stroke (migraine → stroke).

Another perspective is that excluding early-onset cases allows for the analysis of long-term relationships (conversely, if the focus is on short-term relationships, long-term cases would be excluded). Some people also claim that excluding early-onset cases can lead to more robust results.

I would like to know how statisticians view this approach. Is it appropriate, or does it depend on the hypothesis? If someone strongly believes that reverse causality is a possibility, is it acceptable to take this approach?

Hi Jiaqi

What is the goal of this observational study?


Hi @ESMD, thank you for your comment.
The purpose is to evaluate whether the baseline exposure is a risk factor for future disease, or, as it is more commonly phrased, whether there is an association between the exposure and future disease.

The example I gave earlier was hypothetical. For an actual paper, see, for example, this study:
Perceived Level of Life Enjoyment and Risks of Cardiovascular Disease Incidence and Mortality | Circulation

The paper describes it like this:

Furthermore, to evaluate reverse causation, the mortality data excluding deaths occurring 1 to 6 years (before the median of follow-up) from baseline were also analyzed.
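
In code terms, the kind of sensitivity analysis described there might look roughly like the minimal sketch below. Everything in it is hypothetical (the file, the column names, the 1-year cutoff), and I'm assuming a Cox fit with the lifelines package:

```python
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical cohort file with columns: time (years), event (0/1),
# exposure (0/1), plus baseline covariates such as age and sex.
df = pd.read_csv("cohort.csv")

# Primary analysis: all follow-up time.
cph_all = CoxPHFitter()
cph_all.fit(df, duration_col="time", event_col="event")
cph_all.print_summary()

# Sensitivity analysis: drop participants whose event occurred within
# the first year after baseline (excluded entirely, not just censored).
early_case = (df["event"] == 1) & (df["time"] <= 1.0)
cph_late = CoxPHFitter()
cph_late.fit(df[~early_case], duration_col="time", event_col="event")
cph_late.print_summary()
```

Comparing the two hazard ratios for the exposure is the whole point of the exercise.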

Hi Jiaqi

Thanks for clarifying. I won’t comment on epidemiologic methods themselves since I’m not qualified to do that. But I will say that I’m very skeptical in general about observational studies that try to establish chicken/egg relationships when certain types of clinical events are involved, especially if researchers are relying on administrative databases.

The accuracy of a lot of observational studies hinges on the accuracy, reliability, and completeness of diagnostic codes entered by doctors into a billing program after they see a patient in the clinic. But if you were to poll 100 doctors and ask them whether they would recommend that a researcher use their diagnostic codes to generate high quality research, most would probably laugh hysterically.

Here’s how a family physician in my country (Canada) would code a typical patient appointment:

The patient brings a list of 6 separate health concerns to a 15 minute appointment (this is NOT unusual in family medicine). Those concerns include:

  1. Follow-up of congestive heart failure;
  2. Follow-up of type 2 diabetes;
  3. New urinary frequency;
  4. Back pain that has been occurring intermittently for the past 15 years;
  5. Forgetfulness noticed for the first time about 3 years ago by a spouse, but never mentioned to the doctor until now, since the spouse thinks it’s getting worse;
  6. Headaches that started about 2 years ago, but which had been manageable with Tylenol until the past 3-4 months.

At the end of this visit, the physician will pick ONE diagnostic code to apply to the whole visit. In Canada, the doctor doesn’t get any financial bonus for coding all 6 issues addressed during the appointment. The doctor gets paid about $37 by the government whether he/she addresses 2 issues or 8 issues in a visit (sad, I know…). Since there’s no extra pay for adding more diagnostic codes to the billing, we just randomly pick one of the six problems discussed during the visit when entering the diagnostic code.

As you can see, diagnostic codes (at least in my country) are only a very crude reflection of the issues actually discussed during a patient’s appointment. One physician might choose to enter the “diabetes” diagnostic code; another physician might choose the “congestive heart failure” diagnostic code; another physician might choose the “headache” diagnostic code… You get the picture.

So if you’re a researcher, combing through an administrative database in Canada, and you somehow find out that the patient above has a diagnostic code for “stroke” on a certain date, and now you want to find out whether that patient also had a history of migraine headache at some point prior to the stroke, well, good luck with that… Not only might a patient with migraines never actually have a diagnostic code for migraine on record (if they were in the habit of raising multiple issues in each visit, the doctor might have ended up always coding for one of the non-migraine issues), but on the off chance that there is a recorded diagnostic code for migraine, the date of that code in the database will often be a very poor indicator of when the migraines actually started (people often wait months or even years before reporting health problems to their doctor).

As an aside, I think a lot of physicians’ skepticism (?cynicism) about observational research stems from a view that people who do this type of research often aren’t really in a position to understand the limitations of the data they’re working with, simply because they don’t work in the health system itself. If you’re the one entering diagnostic codes and you know you do a piss poor job at it, then you’re not going to trust a study whose conclusions hinge on analysis of those codes…


Thank you for your reply. I completely agree with your views on observational studies. I know that many people approach observational studies with caution. In fact, for reasons similar to those you mentioned, I also do not have much trust in observational studies.

However, my personal understanding is that, in some sense, we still need observational studies. The key lies in good research design and in correctly interpreting the limitations. As for the diagnostic-code issue you mentioned, my impression is that many people evade it, look the other way, and then blur the explanation by claiming that large samples will eliminate the bias (since the limitation itself cannot be resolved). In fact, many observational studies I have come across use very vaguely described methods (often the authors themselves cannot explain why they used a method; they simply follow other studies). Some people focus on using “popular statistical methods” to add appeal to their papers. I think it’s important to face these issues head-on. That is why I prefer to report research methods in detail, to guide readers toward a reasonable understanding of the conclusions, and why I asked this question.

I myself am also involved in observational research. Whether I like it or not, observational studies are very popular because they are much easier to implement than interventional studies. My feeling is that there are very few studies without flaws, and all I can do is try my best to do the work properly.


I think viewing reverse causation as confounding by preclinical/early disease is helpful. When there is only a short interval between exposure measurement and outcome occurrence (e.g., diagnosis in the first year of follow-up), you would expect that confounding to be strong. As you say, preclinical stroke could cause migraine. But as the time between exposure measurement and the outcome grows, that confounding would get progressively weaker. You would not expect preclinical stroke 5 years before diagnosis to have as strong an effect on migraine.

If that’s your presumed model of reverse causation, one way to assess it would be to plot the hazard function over time in each group. If there’s a large gap (i.e., a large hazard ratio) early, which progressively shrinks over follow-up, that pattern would be highly compatible with the reverse causation hypothesis.

Looking at the hazard function seems more informative to me than excluding cases before some threshold.
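
To make that concrete, here is a rough sketch of such a check, using a crude piecewise-constant hazard per year of follow-up (the data file and column names are hypothetical):

```python
import pandas as pd

# Hypothetical cohort file with columns: time (years), event (0/1), migraine (0/1).
df = pd.read_csv("cohort.csv")

def yearly_hazard(d, max_year=10):
    """Crude piecewise-constant hazard: events per person-year in each yearly bin."""
    rows = []
    for start in range(max_year):
        end = start + 1
        at_risk = d[d["time"] > start]  # still under follow-up when the bin starts
        person_years = (at_risk["time"].clip(upper=end) - start).sum()
        events = ((at_risk["event"] == 1) & (at_risk["time"] <= end)).sum()
        rows.append({"year": end, "hazard": events / person_years})
    return pd.DataFrame(rows).set_index("year")

h_ratio = (yearly_hazard(df[df["migraine"] == 1])["hazard"]
           / yearly_hazard(df[df["migraine"] == 0])["hazard"])
print(h_ratio)  # a large early hazard ratio that shrinks over follow-up
                # is the pattern compatible with reverse causation
```

A smoothed nonparametric hazard estimate (e.g., lifelines’ NelsonAalenFitter with its smoothed_hazard_ method) would be a less crude way of doing the same thing.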


One problem that comes to mind (though I’m not sure to what extent this extends to cohort studies) is that excluding participants who experience an early event in an RCT can typically produce some pretty bad imbalances between the two arms.

For example, suppose you had a study with no element of reverse causality whatsoever because you were sure that no one had (or was developing) a stroke at baseline. Suppose also that the 2 groups were perfectly “balanced” at baseline with respect to all risk factors for stroke. For simplicity, I’ll refer to migraine as a “treatment”:

  1. Suppose you start with 100 patients in each of arms A (no migraine) and B (migraine), who are otherwise similar in every respect.
  2. And the “treatment” in arm B (migraine) substantially increases the risk of stroke (say by 5x at 1 year).
  3. Then, by the first year of the study, if 10 patients experienced a stroke in arm A, 50 (5 × 10) would have experienced one in arm B.
  4. If you exclude all individuals who experience early events, you’d be left with 90 patients in arm A and 50 in arm B.
  5. The problem is that you now have immense selection bias. The 50 patients left in arm B are very unlike the 90 patients left in arm A, because they’re far more “resilient” and likely have a number of protective factors that kept them from developing a stroke despite being exposed to a powerful risk factor (migraine).
  6. If you start your analysis with this new set of survivors, you will have introduced a considerable degree of confounding that was not there to begin with (at step 1); the toy simulation below illustrates this.
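
For anyone who wants to see this concretely, here is a toy simulation of steps 1 through 6 (all numbers and the single protective factor u are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000  # per arm; large, so what remains is bias rather than noise

# A protective "resilience" factor, identically distributed in both arms at baseline.
u_a = rng.normal(size=n)  # arm A: no migraine
u_b = rng.normal(size=n)  # arm B: migraine

# First-year stroke risk: 10% (arm A) vs 50% (arm B) for an average patient
# (u = 0), and lower for more "resilient" patients (higher u).
risk_a = np.clip(0.10 * np.exp(-u_a), 0.0, 1.0)
risk_b = np.clip(0.50 * np.exp(-u_b), 0.0, 1.0)
stroke_a = rng.random(n) < risk_a
stroke_b = rng.random(n) < risk_b

# u is balanced at baseline, but not among those who remain after
# excluding everyone with an early event.
print(f"mean u at baseline:     A = {u_a.mean():+.3f}, B = {u_b.mean():+.3f}")
print(f"mean u among survivors: A = {u_a[~stroke_a].mean():+.3f}, "
      f"B = {u_b[~stroke_b].mean():+.3f}")
```

The survivors of arm B end up with a visibly higher mean u than the survivors of arm A; the exclusion itself has manufactured confounding that wasn’t there at baseline.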

A useful commentary on a similar issue, using the Women’s Health Study, can be found here: https://pmc.ncbi.nlm.nih.gov/articles/PMC3653612/#R5


Thank you for your wonderful suggestions, @Ahmed_Sayed, @lachlan, @ESMD.
They reminded me that I once read this Women’s Health Study paper. From it, I learned that plotting survival curves is indeed necessary. However, much as the paper discusses, I was taught that observational studies should prioritize multivariable hazard ratios (even if I had plotted survival curves, they would have been dismissed as “useless”). At that time, I didn’t understand complex model construction (I’m still learning now), so I seem to have overlooked the discussion of survival curves adjusted for confounding.

In summary, plotting survival curves (adjusted for confounding in observational studies) is necessary, and doing so also helps in judging whether reverse causation is present. Whether to exclude early cases needs to be treated with caution.
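
As a starting point, unadjusted curves are easy to produce (adjusted curves require a model, as the paper discusses). A minimal unadjusted sketch, assuming a hypothetical file and column names and Kaplan-Meier via lifelines:

```python
import matplotlib.pyplot as plt
import pandas as pd
from lifelines import KaplanMeierFitter

# Hypothetical cohort file with columns: time (years), event (0/1), migraine (0/1).
df = pd.read_csv("cohort.csv")

ax = plt.subplot(111)
for value, label in [(0, "no migraine"), (1, "migraine")]:
    grp = df[df["migraine"] == value]
    kmf = KaplanMeierFitter()
    kmf.fit(grp["time"], event_observed=grp["event"], label=label)
    kmf.plot_survival_function(ax=ax)  # curves that separate early and then run
                                       # roughly in parallel hint at reverse causation
ax.set_xlabel("years of follow-up")
ax.set_ylabel("stroke-free survival")
plt.show()
```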