Meta-analyses are rated as the highest level of evidence, for example, in the guidelines of the European Society of Cardiology (Level of evidence A). I am currently trying to figure out the scientific rationale for this high level of trust in meta-analyses for guiding medical treatment.
What I found so far is rather critical, for example:
LeLorier J et al. Discrepancies between meta-analyses and subsequent large randomized, controlled trials. N Engl J Med. 1997 Aug 21;337(8):536-42.
Packer M. Are Meta-Analyses a Form of Medical Fake News? Thoughts About How They Should Contribute to Medical Science and Practice. Circulation. 2017 Nov 28;136(22):2097-2099.
What are your thoughts on this? I am also very interested in papers on this issue; please recommend whatever you think is informative. A review of the history of meta-analysis would also be nice.
Agree. The EBM people, and books like Ben Goldacre's, have made a generation of new docs reflexively susceptible to meta-analyses, I believe. The meta-analysis evidence cannot transcend the studies it is based on, so how can it sit atop the evidence hierarchy? Especially when it mixes good-quality evidence with bad. I'd ignore the Packer paper; those journals publish anything by the "renowned" MD, e.g. you have MDs telling us about Bayes factors as though it's a fancy new idea. The discussion re meta-analysis in medicine has been running for over 20 years, i.e. you can see the same things said in papers back then, when the consensus was: the large RCT is supreme and meta-analysis is a hodge-podge and thus doesn't do what it intends to do, i.e. eradicate ambiguity. I think the magnesium trials were a good example, or the homeopathy meta-analysis which appeared in the Lancet. That was in 2000. These old discussions about p-values and meta-analysis are constantly renewed; the consensus shifts slightly without anything new being added to the discussion. I'd just ignore the latest fashionable opinion delivered from on high …
edit: very important to distinguish between meta-analysis done in industry and that done in academia. In industry it can be prospective, they have access to patient-level data, the data quality is very high, and they know all relevant trials (because they likely funded them). This is quite a different thing from, e.g., a Cochrane meta-analysis of a couple of studies of questionable quality.
If we look at this from a strict, decision theoretic POV, a credible meta-analysis will save us the cost of doing further experiments. This is most likely to be true when:
We can confidently claim that the samples from each of the individual studies are sufficiently homogeneous to allow valid statistical combination. Heterogeneity is a large problem with meta-analyses.
The effect sizes reported in the individual studies are uncontroversial, can be calculated from access to the individual data, or can be obtained via theoretically justifiable transformations of the reported effect sizes.
Publication bias is accounted for.
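On the first point, standard diagnostics make the homogeneity requirement concrete: Cochran's Q and I² quantify how much of the between-study variation exceeds what sampling error alone would produce, and a large I² is exactly the warning sign mentioned. A minimal DerSimonian-Laird sketch in Python, using hypothetical effect sizes and variances:

```python
import math

def random_effects_meta(effects, variances):
    """Inverse-variance pooling with DerSimonian-Laird between-study variance.
    Returns (pooled estimate, its SE, Cochran's Q, I^2 in %, tau^2)."""
    w = [1.0 / v for v in variances]                  # fixed-effect weights
    pooled_fe = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (y - pooled_fe) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    # DerSimonian-Laird estimate of between-study variance tau^2
    tau2 = max(0.0, (q - df) / (sum(w) - sum(wi ** 2 for wi in w) / sum(w)))
    w_re = [1.0 / (v + tau2) for v in variances]      # random-effects weights
    pooled_re = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    return pooled_re, math.sqrt(1.0 / sum(w_re)), q, i2, tau2
```

When I² is large, the pooled number summarizes studies that may not be estimating the same quantity, which is the heterogeneity objection in a nutshell.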
So in this sense, a meta-analysis is not “higher” than any particular RCT. And as for retrospective meta-analyses based on literature reviews, with all their problems: JAMA considers them observational studies.
Berlin JA, Golub RM. Meta-analysis as Evidence: Building a Better Pyramid. JAMA. 2014;312(6):603–606. (link)
I glanced at the Packer paper. Isn’t it ironic that his review is a kind of meta-analysis of opinion, and his cynicism could equally be aimed at himself, e.g. “the author does no original work”. I doubt there’s a single sentence in that paper that hasn’t been said before, likely by a statistician over 20 years ago.
As a non-statistician who has closely followed some high-profile meta-analysis debacles in medicine, my impression (like that of pmbrown above) is that pre-planning a meta-analysis (i.e., stating up front that you plan to conduct several trials of similar design with the express goal of eventually combining them in a meta-analysis) is an important way to increase its credibility. Pre-planning a meta-analysis ensures that the included studies won’t be cherry-picked to support the authors’ preconceived ideas and that apples will be compared with apples. In contrast, it seems that post-hoc meta-analyses (i.e., a bunch of potentially heterogeneous studies statistically mashed together as an afterthought to try to convince an audience about either efficacy or safety) should probably be viewed more as observational exercises than as experiments.
It’s an interesting question, then, how the consultant statistician fends off requests for a meta-analysis. When I have declined (e.g. a meta-analysis of 2 studies), they have done it themselves using online point-and-click calculators. The eagerness to publish drives this, plus a belief that meta-analysis is fundamentally a good thing and that these online calculators must be made available to every novice researcher.
Thanks for your very interesting comments and thoughts. I also enjoyed reading the editorial cited by R_cubed.
I would be very grateful for more recommendations of scientific papers. Is someone aware of a reference justifying the “Level of evidence A” in guidelines? Any review paper which argues in favor of meta-analyses? So far, I read only papers criticizing meta-analyses. But I want to be open-minded.
I’ve collected a large number of papers on meta-analysis in this thread. But I found the following articles (2 by Stephen Senn) especially helpful in understanding the issues.
That will place the following paper in a better context:
The main things I’ve gotten from them:
Retrospective meta-analysis is a very limited technique if we desire to come up with an aggregate effect. It almost certainly does not deserve to be at the top of any “evidence pyramid.”
If the outcome does not have interval or ratio units of measure, it is preferable in most cases to use the log odds as the measure of effect. Meta-analyses that aggregate standardized effects are not likely to be reliable.
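On the log-odds point: the log odds ratio is unbounded, approximately normal in moderate samples, and has a simple variance estimate (Woolf's 1/a + 1/b + 1/c + 1/d), which is what makes inverse-variance pooling straightforward on that scale. A sketch with hypothetical 2×2 tables:

```python
import math

def log_odds_ratio(a, b, c, d):
    """Log odds ratio and Woolf's variance estimate from a 2x2 table:
    a/b = events/non-events on treatment, c/d = events/non-events on control."""
    lor = math.log((a * d) / (b * c))
    var = 1 / a + 1 / b + 1 / c + 1 / d   # Woolf's approximate variance
    return lor, var

# Hypothetical trials: (events_t, non_events_t, events_c, non_events_c)
trials = [(15, 85, 25, 75), (8, 92, 14, 86)]
pairs = [log_odds_ratio(*t) for t in trials]
weights = [1 / var for _, var in pairs]
pooled_log_or = sum(w * lor for w, (lor, _) in zip(weights, pairs)) / sum(weights)
pooled_or = math.exp(pooled_log_or)  # back-transform to the odds-ratio scale
```

Standardized effects, by contrast, divide by a sample standard deviation that varies between studies for reasons unrelated to the treatment, which is one source of the unreliability mentioned above.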
There have been a number of enlightening threads here on Data Methods about the real challenges of evidence synthesis and meta-analysis that you won’t find in the journals. This comment (as well as the entire thread) stands out as one of them:
Where does that leave the practical clinician who would very much like to learn from the experience of others? There hasn’t been much written in an accessible fashion on this.
After studying a Sander Greenland paper (below), I started to think about the p value aggregation techniques described in Table 12.2.a of the Cochrane Handbook (below).
The classical interpretation of p value aggregation methods does leave a lot to be desired. But the relationship between p values and Bayes factors, described by Goodman (below), could provide a very useful cognitive technique to help make qualitative discussions about directions of effect a bit more rigorous.
I haven’t seen a Bayesian p value meta-analysis technique explicitly described, but I believe I have derived one based on a number of papers I’ve read (some of them I found on this discussion board!).
I really should write it up in a separate post just to make sure I am not missing something obvious.
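For readers unfamiliar with the two ingredients I mentioned: Fisher's method is one of the classical p value combination techniques in the Cochrane Handbook's Table 12.2.a, and Goodman's papers give the minimum Bayes factor exp(−z²/2) that a p value from a normal-approximation test can correspond to. A sketch of both in Python (to be clear, this is not the Bayesian derivation alluded to above, just the classical building blocks):

```python
import math
from statistics import NormalDist

def fisher_combined_p(pvalues):
    """Fisher's method: X = -2 * sum(ln p_i) is chi-square with 2k df
    under the null. For even df (2k) the chi-square survival function
    has a closed form, so no special functions are needed."""
    k = len(pvalues)
    half = -sum(math.log(p) for p in pvalues)  # X / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))

def goodman_min_bf(p, two_sided=True):
    """Goodman's minimum Bayes factor exp(-z^2 / 2): the strongest
    evidence against the null that a given normal-test p value can carry."""
    z = NormalDist().inv_cdf(1 - (p / 2 if two_sided else p))
    return math.exp(-z * z / 2)
```

For a two-sided p of 0.05 the minimum Bayes factor comes out to roughly 0.15, the value Goodman tabulates, which is why he argues that p = 0.05 represents much weaker evidence than commonly assumed.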
Sander Greenland, Invited Commentary: The Need for Cognitive Science in Methodology, American Journal of Epidemiology, Volume 186, Issue 6, 15 September 2017, Pages 639–645, (link)
Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions, Ch. 12, version 6.0 (updated July 2019). Cochrane, 2019. (link)
Goodman S. Toward evidence-based medical statistics. 1. The P value fallacy and 2. Bayes Factors. Ann Intern Med. 1999 Jun 15;130(12):995-1004. (PDF)