This is the methodology used to analyze risk factors in a large dataset in veterinary medicine. The authors are well-published epidemiologists. If you critique the analysis, you are met with disdain. What is the best way to point out the inadequacies of such an approach?
A multivariable logistic regression model was constructed to investigate associations between potential risk factors and fatal MSI. R version 4.4.0 (R Foundation for Statistical Computing) and the Tidyverse package [47] were used for all data processing and modelling. During the first stage of model building, univariable logistic regression was applied to each potential risk factor in turn. Factors found to be associated with fatal MSI with a statistical significance at the 80% level (p < 0.2) became candidates for the final multivariable model. Risk factors in continuous form were assessed in both continuous and transformed form, including categorical forms such as quartiles, to identify the form that produced the best model fit, according to the Akaike Information Criterion (AIC) [48, 49]. Table S1 describes the chosen categorisation for each risk factor.
The final multivariable logistic regression model was built using a stepwise backwards-removing process; the model with the lowest AIC value was selected at each iteration. Model validation included assessing biologically plausible interaction terms and testing for confounding between variables rejected at any stage and those retained in the final model [49]. To assess for potential clustering, three mixed-effects logistic regression models were tested, consisting of the final fixed-effects model with horse, trainer and track included as random effects. Goodness of fit of the final model was assessed using the Hosmer–Lemeshow test [50]. Post hoc power calculations showed that for continuous variables, the final model had at least 80% power to detect odds ratios of 1.04 or above with 95% confidence. For binary categorical variables, the odds ratio threshold was 1.08.
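As a sketch of what that backward-removal loop does, here is a toy Python reconstruction, not the authors' R/Tidyverse pipeline; the data, effect sizes and variable names (x1, x2, x3) are invented. For simplicity each candidate model is scored with a saturated cell-proportion fit over the kept binary predictors, a stand-in for a logistic MLE; the stepwise logic (drop whichever variable most lowers AIC, stop when no removal helps) is the same as described above.

```python
import math
import random

random.seed(1)

def make_data(n=5000):
    """Synthetic horses: outcome depends on x1 (strongly) and x2 (weakly);
    x3 is pure noise."""
    rows = []
    for _ in range(n):
        x1, x2, x3 = (random.random() < 0.5 for _ in range(3))
        p = 0.05 + 0.20 * x1 + 0.05 * x2   # x3 plays no role in the outcome
        rows.append(((x1, x2, x3), random.random() < p))
    return rows

def aic(rows, keep):
    """AIC of a saturated cell-proportion fit on the kept predictor indices."""
    cells = {}
    for x, y in rows:
        key = tuple(x[i] for i in keep)
        n1, n0 = cells.get(key, (0, 0))
        cells[key] = (n1 + y, n0 + (not y))
    loglik = 0.0
    for n1, n0 in cells.values():
        n = n1 + n0
        if n1 and n0:                      # pure cells contribute zero
            loglik += n1 * math.log(n1 / n) + n0 * math.log(n0 / n)
    return -2 * loglik + 2 * len(cells)   # penalty: one parameter per cell

def backward_stepwise(rows, names):
    """Drop the variable whose removal most lowers AIC; stop when none helps."""
    keep = list(range(len(names)))
    best = aic(rows, keep)
    while len(keep) > 1:
        cand, j = min((aic(rows, [i for i in keep if i != j]), j) for j in keep)
        if cand >= best:
            break
        best, keep = cand, [i for i in keep if i != j]
    return [names[i] for i in keep], best

rows = make_data()
selected, final_aic = backward_stepwise(rows, ["x1", "x2", "x3"])
print("retained:", selected)
```

On this synthetic data the pure-noise predictor is typically removed in the first pass, because the 2k penalty for its extra parameters outweighs the negligible likelihood it adds, while the informative predictors survive. The discussion that follows is about whether a model selected this way supports the interpretations placed on it.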
Okay, I’ll take a stab at your initial question.
Disdain is not how I’d characterize the openness to advocate perspectives nowadays, though it often started out that way not very long ago. At conferences, I’d start with sincere appreciation for the investigators’ accomplishments.
Where I had questions about a concept, a study method, an eligibility requirement, etc., I’d frame it as a question, asked with the intent to learn and to reconcile the answer with my current understanding, and also with the intent to pass on what I’d learned to the primary stakeholders I represent.
The goal is to get the conversation started and to form a mutually respectful relationship (not to be confused with subservience). Finally, as much as possible, I’d use plain language when framing the question so that people outside the field can follow the implications of the different approaches to the study. I would not call them “inadequacies,” for example, because that assumes my view is correct when it has not yet been debated.
I read the cited piece “Novel risk factors associated with fatal musculoskeletal injury in Thoroughbreds in North American racing (2009–2023).” How would you explain to the owners what a different methodology might show in order to reduce the mortality of these creatures? Is there a way to explain the competing methodologies by way of analogy?
One of many issues… they claim that horses that begin racing at 2 have less chance of a breakdown than horses that make their first start at 3.
Selection bias creates the illusion of causality:
- We only observe horses that survive to race (conditioning on the collider)
- Horses with early musculoskeletal disease (EMD) are less likely to survive to race at all
- Those with more severe EMD that do survive might start later (at age 3)
The hidden mechanism:
- Horses healthy enough to start at age 2 are likely those with minimal EMD
- Horses that can only start at age 3 may include those that needed extra time to recover from mild-to-moderate EMD
- EMD directly affects both career longevity and injury risk (shown by the direct arrow from EMD to Outcomes)
The spurious association:
- When we look at the data, we see that horses starting at 2 have better outcomes
- But this isn’t because starting at 2 causes better outcomes
- It’s because horses with less EMD are more likely to both start at 2 AND have better outcomes
This is a classic example of “healthy worker bias” or “healthy starter bias”: the horses that start earlier are systematically different (healthier) from those that start later. The apparent protective effect of early racing actually just reflects the underlying health status that allowed early racing in the first place.
The DAG makes this clear by showing that EMD affects both the likelihood of early racing and the outcomes, creating a spurious association between start age and outcomes when we condition on survival to race.
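The mechanism above is easy to check with a toy simulation (all numbers are hypothetical: the latent EMD score, the survival cut-off and the risk slope are invented for illustration). Start age is given no causal effect at all on fatal MSI, yet among the horses we actually observe racing, the age-2 starters look safer:

```python
import random

random.seed(42)

def simulate(n=100_000):
    """Simulate horses where start age has NO causal effect on fatal MSI.
    A latent EMD severity drives survival to racing, debut age, and risk."""
    start2, start3 = [], []                # fatal-MSI indicators by debut age
    for _ in range(n):
        emd = random.random()              # latent EMD severity in (0, 1)
        if random.random() < 0.4 * emd:    # severe EMD: never races at all,
            continue                       # so we condition on survival to race
        starts_at_2 = random.random() < 1.0 - 0.8 * emd  # healthy -> early debut
        fatal = random.random() < 0.01 + 0.05 * emd      # risk depends ONLY on EMD
        (start2 if starts_at_2 else start3).append(fatal)
    return sum(start2) / len(start2), sum(start3) / len(start3)

rate2, rate3 = simulate()
print(f"fatal-MSI rate, debut at 2: {rate2:.4f}")
print(f"fatal-MSI rate, debut at 3: {rate3:.4f}")
# the debut-at-2 group shows a lower rate despite age having no causal role
```

Because healthier horses both debut earlier and break down less, a naive comparison of debut ages reproduces the “protective” association without any causal effect of start age.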
Good narrative explanation - your argument is clear. I’m not sure about the DAG, though; specifically, the “collider” definition (?) Doesn’t a collider in a DAG usually have two arrows leading into it, one from the “exposure” and the other from the “outcome”? Maybe an epi person can chime in (?)
I also don’t think survival is a collider, and I’m not sure you need to include survival in the DAG at all. It seems like the main issue is that EMD is a confounder, influencing both start age and outcomes among racing horses, and you’ve made a strong case for why ignoring EMD could bias results.
The paper itself is somewhat of a “causal salad”: numerous variables selected from a large database and mixed together, keeping the ingredients that pass a statistical threshold. The example I gave is one of many potential examples. The problem is that even if they don’t specifically call the variables causal, the public and press interpret them as such. Would you consider this an example of the Table 2 fallacy?
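On the Table 2 fallacy point: the fallacy is reading every mutually adjusted coefficient in a regression table as a total effect, when an adjusted coefficient is at best a direct effect conditional on the other covariates. A toy linear example (hypothetical variables and coefficients, pure Python, nothing from the horse data) shows how far apart the two can be when one covariate mediates another:

```python
import random

random.seed(7)

# Toy data: exposure x affects outcome y directly (coefficient 2) and
# through a mediator m (x -> m with coefficient 1, m -> y with coefficient 1),
# so the TOTAL effect of x on y is 2 + 1 = 3.
n = 20_000
xs = [random.gauss(0, 1) for _ in range(n)]
ms = [x + random.gauss(0, 1) for x in xs]
ys = [2 * x + m + random.gauss(0, 1) for x, m in zip(xs, ms)]

def cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)

# Unadjusted slope of y on x recovers the total effect (about 3).
b_total = cov(xs, ys) / cov(xs, xs)

# Mutually adjusted slope of x (y ~ x + m) is the direct effect (about 2),
# solved from the 2x2 normal equations on centred data.
sxx, sxm, smm = cov(xs, xs), cov(xs, ms), cov(ms, ms)
sxy, smy = cov(xs, ys), cov(ms, ys)
det = sxx * smm - sxm * sxm
b_adj = (smm * sxy - sxm * smy) / det     # coefficient of x given m

print(f"total effect of x:          {b_total:.2f}")  # near 3
print(f"adjusted coefficient of x:  {b_adj:.2f}")    # near 2
```

A reader who treats the adjusted row for x as “the effect of x” understates it by a third here; with many mutually adjusted covariates, as in a large risk-factor table, each row can carry a different, unstated interpretation.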
Arguably, the most effective way to mitigate the risk of fatal musculoskeletal injury in horses would be to stop forcing them to race…
Moving on… Input from epidemiologists on the high-level conceptual goals/limitations of studies like this would be great. The paper feels confusing since the authors aren’t explicit enough about their goals (i.e., description? prediction? explanation? some combination of these?). They observed a decreased incidence of fatal musculoskeletal injury starting in the mid-2010s, around the time that multiple regulatory changes around horse racing were implemented. They would like to explain which factors might have contributed most strongly to this reduced incidence. But they also hope to be able to develop a way to predict which horses will be at increased risk, given each horse’s particular constellation of risk factors. Unfortunately, it’s often very challenging (or impossible) in medicine to predict rare outcomes that have multifactorial etiologies (e.g., suicide).
In short, it seems important, as a first step, to clarify what the authors of the horse study are trying to achieve. They seem to be aiming to identify “risk factors” which might allow risk prediction for individual horses. Presumably, the ultimate hope is that certain mitigation measures can be implemented at the level of individual horses (e.g., deciding not to race).
I may be wrong here, but I do not think that the authors were interested in causality. They were simply building a predictive model based on risk factors that predict fatal MSI. They say in the discussion that “These results allow a refinement of possible risk profiles for individual horses, potentially enabling those horses who are at greatest risk of fatal MSI to be identified in advance.” So I would conclude that they were building models with the most parsimonious set of independent predictors of fatal MSI, and therefore confounding, selection bias and DAGs might not be relevant to discussion of this paper. If that is their goal, why do we need to critique the paper? Yes, they could have gone further and created a risk score or similar, but that is up to them.
This seems like the crux of the problem. If a professor of epidemiology isn’t confident he understands what the authors were trying to achieve, the authors can, at the very least, be faulted for muddled messaging. And, while muddled messaging doesn’t necessarily reflect muddled thinking, it acts as a bit of a red flag…
It’s not just about methodology; when I say I may be wrong, I mean I am second-guessing their intentions. But clearly that paper does not contain any strong suggestion that they had a causal intent (again, I may be wrong).
I am not sure this follows: if I stage a person with cancer as stage I and another as stage IV, the latter is identified as more likely to die from progression than the former, but staging involves predictive, not causal, modeling.