In epidemiological cohort studies, it is common to exclude individuals who experience early onset of the disease.
For example, in a cohort study examining the association between migraines and the risk of stroke over the next decade, many researchers prefer to exclude individuals who have a stroke within the first one or two years after baseline. The reasoning is that in these early-onset cases the migraines may themselves be a symptom of a stroke that was already developing at baseline (stroke → migraine), whereas the study aims to estimate the effect of migraines on subsequent stroke (migraine → stroke).
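To make concrete what I mean by "excluding early-onset cases", here is a minimal sketch in plain Python. The `Participant` fields, the two-year lag, and the five records are all made up for illustration and do not come from any real study. It contrasts the exclusion as I usually see it described (drop only the early cases) with what I understand to be the landmark variant (drop everyone no longer at risk at the lag time, whether through an event or censoring, and restart follow-up there).

```python
from dataclasses import dataclass


@dataclass
class Participant:
    pid: int
    migraine: bool         # exposure status at baseline
    followup_years: float  # time from baseline to stroke or censoring
    stroke: bool           # True if follow-up ended with a stroke


def exclude_early_cases(cohort, lag_years=2.0):
    """Drop only participants who had a stroke within lag_years of baseline."""
    return [
        p for p in cohort
        if not (p.stroke and p.followup_years <= lag_years)
    ]


def landmark_cohort(cohort, lag_years=2.0):
    """Keep only those still at risk at lag_years and restart the clock there."""
    return [
        Participant(p.pid, p.migraine, p.followup_years - lag_years, p.stroke)
        for p in cohort
        if p.followup_years > lag_years
    ]


# Hypothetical toy cohort, purely illustrative.
cohort = [
    Participant(1, True, 0.8, True),    # stroke within the lag window
    Participant(2, True, 6.5, True),    # stroke after the lag window
    Participant(3, False, 1.5, False),  # censored within the lag window
    Participant(4, False, 10.0, False),
    Participant(5, True, 10.0, False),
]

print([p.pid for p in exclude_early_cases(cohort)])  # [2, 3, 4, 5]
print([p.pid for p in landmark_cohort(cohort)])      # [2, 4, 5]
```

The two versions differ in how they treat participant 3, who is censored before the lag time: the first keeps that person (with follow-up still counted from baseline), the second drops them. Part of my question is whether that difference matters.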
Another perspective is that excluding early-onset cases restricts the analysis to long-term relationships (conversely, if the focus is on short-term relationships, the later cases would be excluded). Some people also claim that excluding individuals with early onset leads to more robust results.
I would like to know how statisticians view this approach. Is it appropriate, or does it depend on the hypothesis? If someone strongly believes that reverse causality is a possibility, is it acceptable to take this approach?