On Twitter, @f2harrell referenced an Andrew Gelman post on this research synthesis of statins:
Evidence-based medicine eats itself in real time (link)
Such a provocative title got me interested. Fortunately, Gelman has a link to the original paper. A quick skim shows they violated most of the statistical guidelines in this thread:
What they did:
- No attempt at meta-analysis.
> Because our systematic review involved 3 different drug classes and several different patient populations, we intentionally did not perform a meta-analysis.
That is unfortunate, because with 35 studies they had enough data for Bayesian or empirical Bayes modelling, in which information is partially pooled across heterogeneous groups so that each group's estimate informs the others.
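As a sketch of what pooling across studies could look like, here is a simple frequentist cousin of the empirical Bayes idea: DerSimonian-Laird random-effects pooling of log risk ratios. The study estimates and standard errors below are made up for illustration, not taken from the paper.

```python
import numpy as np

def dersimonian_laird(y, se):
    """Pool study effects y (log scale) with standard errors se."""
    w = 1.0 / se**2                       # fixed-effect weights
    mu_fe = np.sum(w * y) / np.sum(w)     # fixed-effect mean
    q = np.sum(w * (y - mu_fe) ** 2)      # Cochran's Q heterogeneity statistic
    k = len(y)
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)    # between-study variance (truncated at 0)
    w_re = 1.0 / (se**2 + tau2)           # random-effects weights
    mu_re = np.sum(w_re * y) / np.sum(w_re)
    se_re = np.sqrt(1.0 / np.sum(w_re))
    return mu_re, se_re, tau2

# Hypothetical log risk ratios and SEs from five small studies
y = np.log(np.array([0.85, 0.92, 0.78, 1.05, 0.88]))
se = np.array([0.20, 0.25, 0.30, 0.22, 0.18])
mu, se_mu, tau2 = dersimonian_laird(y, se)
print(f"pooled RR = {np.exp(mu):.2f}, 95% CI "
      f"({np.exp(mu - 1.96*se_mu):.2f}, {np.exp(mu + 1.96*se_mu):.2f})")
```

A full Bayesian version would put a prior on the between-study variance instead of plugging in a point estimate, but the shrinkage logic is the same.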
- Categorization of continuous variables. This is another example of the "Threshold Science" that @llynn has posted about on numerous occasions, though the authors probably had little choice given what the primary studies reported.
- Absence-of-evidence fallacy: mortality and cardiovascular disease benefits were classified as "no" if the CI included an RR, OR, or HR of 1. A quick scan of the CIs in Table 1 shows intervals that are either very wide (uninformative) or skewed toward benefit, with no attempt to transform them to a common scale.
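Transforming reported intervals to a common scale is straightforward: assuming a reported ratio CI is symmetric on the log scale, the log point estimate and its standard error can be recovered from the two limits. The interval below is hypothetical, chosen to show how a CI that crosses 1 can still lean heavily toward benefit.

```python
import math

def ci_to_log_effect(lo, hi, z=1.96):
    """Recover the log-scale point estimate and SE from a reported ratio CI
    (RR, OR, or HR), assuming symmetry on the log scale."""
    log_lo, log_hi = math.log(lo), math.log(hi)
    est = (log_lo + log_hi) / 2.0
    se = (log_hi - log_lo) / (2.0 * z)
    return est, se

# Hypothetical reported interval: 95% CI 0.45 to 1.08.
# The CI crosses 1, so a vote count calls it "no benefit", yet the
# point estimate sits well on the benefit side.
est, se = ci_to_log_effect(0.45, 1.08)
print(f"RR = {math.exp(est):.2f}, SE(log RR) = {se:.2f}")
```

Once every study is expressed as (log effect, SE), the estimates can be pooled or modelled rather than vote-counted.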
They did a vote count that treated "significant" results as "positive" and "not significant" results as "negative." Such a vote-counting procedure was shown to be incorrect in 1986 by Hedges and Olkin in the classic text Statistical Methods for Meta-Analysis (page 3):
> Intuitively, if a large proportion of studies obtain statistically significant results, then this should be evidence that the effect is nonzero. Conversely, if few studies find statistically significant results, the combined evidence would seem to be weak. Contrary to intuition … studies of the properties of [improper] vote-counting procedures can be shown to be strongly biased toward the conclusion that the treatment has no effect.
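The Hedges-Olkin point is easy to demonstrate by simulation. Under an assumed true log risk ratio of -0.2 and per-study standard errors typical of modestly powered trials (both numbers invented here), most individual studies come out non-significant, so a vote count concludes "no effect", while pooling the same estimates recovers the benefit easily.

```python
import numpy as np

rng = np.random.default_rng(0)
true_effect = -0.2          # assumed true log risk ratio (a real benefit)
se_study = 0.15             # assumed per-study SE (roughly 27% power)
n_studies = 35

est = rng.normal(true_effect, se_study, n_studies)  # simulated study estimates
z = est / se_study
significant = np.abs(z) > 1.96

# Vote count: a minority of studies reach significance,
# so this procedure declares "no effect"...
print(f"significant: {significant.sum()}/{n_studies}")

# ...while pooling the same 35 estimates yields a decisive z statistic.
pooled = est.mean()
pooled_se = se_study / np.sqrt(n_studies)
print(f"pooled z = {pooled / pooled_se:.1f}")
```

The bias gets worse, not better, as more underpowered studies accumulate, because each one adds another likely "negative" vote.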
It will be interesting to explore the data in this paper and post some ideas on how to examine it plausibly, from both frequentist and Bayesian points of view.