This paper, which predicts the risk of cruciate ligament rupture (CR) from the time of neutering in Labrador Retrievers, presents the following result:
"Risk of CR was increased in dogs neutered before 12 months of age (OR = 11.38; P = .01). Neutering before 6 months of age was not a significant factor (P = .17), nor was neutering between 6 and 12 months of age (OR = 3.11; P = .23). Overall neutering was also not a risk factor (OR = 1.8; P = .27)."
It makes no sense that the risk is increased in dogs neutered before 12 months of age but not in dogs neutered before 6 months or between 6 and 12 months. Is this an artifact of dichotomizing the age at neutering? Unfortunately, the paper provides very little data with which to evaluate this.
cruciaterupture.pdf (648.8 KB)
This is a good example of misleading subgroup statistics after arbitrary categorization. Every analysis should start with (1) a high-resolution histogram of the data (here, age) to check regions of support, and (2) a smooth, non-overfitted relationship with uncertainty bands (using splines, nonparametric smoothers, fractional polynomials, etc.). To safeguard interpretations the uncertainty bands should be simultaneous compatibility (confidence) intervals.
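As a rough illustration of those two steps, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the ages and outcomes are simulated (none of it comes from the paper), `true_risk` is a made-up risk curve, and a Gaussian kernel smoother stands in for the spline or fractional polynomial fit one would normally use. It does not compute uncertainty bands, which a real analysis would need.

```python
import math
import random
from collections import Counter

random.seed(0)

# Hypothetical data: age at neutering (months) and a 0/1 outcome.
# Simulated for illustration only; these are NOT the paper's data.
n = 400
ages = [random.uniform(2, 36) for _ in range(n)]


def true_risk(age):
    # Assumed smooth risk curve that declines with later neutering.
    return 0.05 + 0.25 * math.exp(-age / 8.0)


events = [1 if random.random() < true_risk(a) else 0 for a in ages]


def histogram(values, width=1.0):
    """Count observations in bins of the given width (in months)."""
    counts = Counter(int(v // width) for v in values)
    return {b * width: counts[b] for b in sorted(counts)}


def kernel_smooth(x, y, grid, bandwidth=3.0):
    """Nadaraya-Watson estimate of P(y = 1 | x) using a Gaussian kernel."""
    fitted = []
    for g in grid:
        weights = [math.exp(-0.5 * ((xi - g) / bandwidth) ** 2) for xi in x]
        fitted.append(sum(w * yi for w, yi in zip(weights, y)) / sum(weights))
    return fitted


# Step 1: high-resolution histogram of age to see where the data have support.
hist = histogram(ages)
for start, count in hist.items():
    print(f"{start:4.0f}-{start + 1:<4.0f} {'#' * count}")

# Step 2: smooth estimate of risk as a function of age, with no cutpoints.
grid = list(range(2, 37, 2))
smooth = kernel_smooth(ages, events, grid)
for g, p in zip(grid, smooth):
    print(f"age {g:2d} months: estimated risk {p:.3f}")
```

The point of the sketch is only that the risk is estimated as a continuous function of age, so no conclusion hinges on where a 6- or 12-month cutpoint happens to fall.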
Can categorizing a continuous variable create Simpson's Paradox?
Not sure, but I think it's possible that a form of the paradox could happen. The issue with the so-called paradox is the failure to condition on other relevant variables, and forcing something to be linear is similar to having omitted variables.
That is pretty weird. I don't see a Data Availability statement in the paper, but what's the ethos in the veterinary research community? Can you reach out to them to request data?
I think I might, but it doesn't seem to be routine.
I would encourage that! Just thinking about this in terms of dummy variables, intuitively it doesn't seem possible that the OR for the sum of 2 dummies (≤6 mos + 6–12 mos = ≤12 mos) wouldn't be some kind of weighted average of the ORs for the individual dummies.
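To make that intuition concrete, here is a small sketch with made-up counts. It assumes crude (unadjusted) odds ratios and a single common reference group for all three comparisons. Whether the paper's models were set up that way is not known, so this checks the arithmetic only and does not reconstruct their analysis. The counts and the `odds_ratio` helper are hypothetical.

```python
def odds_ratio(exp_cases, exp_controls, ref_cases, ref_controls):
    """Crude odds ratio of an exposed group versus a reference group."""
    return (exp_cases / exp_controls) / (ref_cases / ref_controls)


# Hypothetical (cases, controls) counts, for illustration only.
early = (6, 10)       # neutered before 6 months
middle = (9, 30)      # neutered between 6 and 12 months
reference = (10, 60)  # common reference group (assumed the same throughout)

# Pooling the two dummies simply adds their cases and their controls.
pooled = (early[0] + middle[0], early[1] + middle[1])

or_early = odds_ratio(*early, *reference)
or_middle = odds_ratio(*middle, *reference)
or_pooled = odds_ratio(*pooled, *reference)

print(f"OR, <6 months:   {or_early:.2f}")
print(f"OR, 6-12 months: {or_middle:.2f}")
print(f"OR, <12 months:  {or_pooled:.2f}")

# The pooled odds (a1 + b1) / (a0 + b0) always lie between a1/a0 and b1/b0,
# so with a shared reference group the pooled OR lies between the two ORs.
assert min(or_early, or_middle) <= or_pooled <= max(or_early, or_middle)
```

Under these assumptions a pooled OR of 11.38 could not sit above both component ORs, so the reported pattern suggests that something else changed between the comparisons, such as the reference group or the covariate adjustment.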