Clinicians blaming evidence syntheses for research waste

I agree, and we need covariate-adjusted estimates from RCTs to enable more robust evidence syntheses - something that is rarely reported.
When it comes to HTE, the main focus in RCTs today seems to be on subgroup effects (rather than risk magnification, which is more important). In an individual RCT such effects can only be suspected, because the vast majority of these observations are nothing more than artifacts of the sample. The only way to be reasonably sure is to see whether the modifier shows up consistently across studies in a synthesis of the evidence; thus only subgroup effects examined within syntheses may provide a better level of evidence for this sort of HTE.
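To put a number on how easily such artifacts arise, here is a minimal simulation sketch (my own illustration, not something from this thread), assuming a continuous outcome, a uniform treatment effect, and ten candidate subgroup variables that have nothing to do with the outcome; the sample size and number of candidate modifiers are arbitrary.

```python
# A minimal simulation (my own illustration): one RCT with a uniform treatment effect and
# NO true effect modification, in which we test ten baseline covariates for a
# treatment-by-subgroup interaction. With ten independent tests at the 5% level we expect
# roughly 1 - 0.95**10, i.e. about 40%, of such trials to show at least one nominally
# "significant" subgroup effect purely by chance.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, n_subgroup_vars, n_sims = 400, 10, 2000       # arbitrary illustrative choices
trials_with_spurious_hte = 0

for _ in range(n_sims):
    treat = rng.integers(0, 2, n)                # 1:1 randomization
    y = 0.3 * treat + rng.normal(size=n)         # same treatment effect for everyone
    found_one = False
    for _ in range(n_subgroup_vars):
        g = rng.integers(0, 2, n)                # baseline covariate unrelated to outcome
        X = np.column_stack([np.ones(n), treat, g, treat * g])
        beta, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
        se = np.sqrt(rss[0] / (n - 4) * np.linalg.inv(X.T @ X)[3, 3])
        p = 2 * stats.t.sf(abs(beta[3] / se), df=n - 4)   # interaction p-value
        found_one = found_one or (p < 0.05)
    trials_with_spurious_hte += found_one

print(f"Share of null-HTE trials with >=1 'significant' interaction: "
      f"{trials_with_spurious_hte / n_sims:.0%}")
```

Under these assumptions, roughly four in ten such trials show at least one nominally significant interaction despite there being no effect modification at all, which is exactly why consistency across studies in a synthesis matters before believing a modifier.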

My complaint was about the dogmatic assertion regarding “hierarchies of evidence” that is clearly false, not about the relative merits of meta-analysis or primary research per se.

Finding work critical of EBM depends on which literature you look at. The BMJ article made it quite clear that “meta-analysis” isn’t an independent method, but a perspective from which to view research reports. Glass himself desired a replacement of “meta-analysis” with something that is now called “fusion learning” or “confidence distributions” in the frequentist stats literature. Nelder’s quote is compatible with the opinions of the BMJ authors and Gene Glass.

Complaints about research “waste” are a direct consequence of a flawed theory of evidence. A flawed theory of evidence will lead to:

  1. ignoring information that should be conditioned upon, leading to studies that should not be done, or
  2. conditioning on false information, causing surprise and controversy in practice, leading to more calls for additional research.

Citing one more thread in this forum alone feels like beating a dead horse, but there are a number of papers here that either discuss decision analysis in a medical context, or study EBM criteria empirically and find them flawed.

From the abstract:

The notion that evidence can be reliably or usefully placed in ‘hierarchies’ is illusory. Rather, decision makers need to exercise judgement about whether (and when) evidence gathered from experimental or observational sources is fit for purpose.

From the abstract:

As EBM became more influential, it was also hijacked to serve agendas different from what it originally aimed for. Influential randomized trials are largely done by and for the benefit of the industry. Meta-analyses and guidelines have become a factory, mostly also serving vested interests.

From the abstract:

The limited predictive validity of the EPC approach to GRADE seems to reflect a mismatch between expected and observed changes in treatment effects as bodies of evidence advance from insufficient to high SOE. In addition, many low or insufficient grades appear to be too strict.

1 Like

“I have always been annoyed when the term “evidence” is dogmatically thrown around by professionals without any particular expertise in statistics, mathematics, or logic, when actual experts are much more nuanced.”

Physicians (presumably the “professionals without any particular expertise…” you refer to above) witness (daily) our patients being bilked out of their life savings by “healthcare providers” who harbour a complete disregard for the standards of evidence you seem to disdain so much. If I had a nickel for every patient I’ve seen who has spent (or is contemplating spending) hundreds or thousands of dollars on BS treatments being promoted in my community (e.g., platelet-rich plasma injections, laser treatment of every imaginable body part, steroid injections into every imaginable body part, cupping, acupuncture, naturopathic treatments, …), I’d be a wealthy woman. Hawking unsupported, non-reimbursed (for good reason) therapies to desperate patients (some of whom can barely afford their groceries) who are in no position to independently assess the validity of efficacy claims is reprehensible. If my publicly-funded healthcare system has to choose between reimbursing the cost of SGLT-2 inhibitors for all my diabetic patients versus acupuncture for acute lumbar strain, guess where I’d prefer the money be spent?

It seems important not to allow frustration with the difficulty of obtaining RCT evidence in one’s field to morph into a general resentment/disregard for RCT evidence (or the field of “EBM” or physicians). As noted in this link from another thread, the importance of randomization to demonstrate therapeutic efficacy was appreciated LONG before the “dawn of EBM” in the early 1990s:

https://www.fda.gov/media/110437/download

Acquiring convincing evidence that something “works” is very hard- there’s no way around this fact. Tearing down a method, simply because it is sometimes out of reach, seems unscientific. And scouring the methods literature to find big names who have questioned the importance of randomization (most of whom probably don’t do work that has actual consequences for patients) probably isn’t the most productive way forward. The better way to boost the credibility of therapies offered by a field would probably be to find a way for it to obtain randomized evidence.

At the end of the day, physicians, even though most are “without any particular expertise in statistics, mathematics,…” (“or logic;” really??), arguably have an awful lot more “skin in the game” when making treatment decisions than do the shysters I’ve described above.

4 Likes

Well said, and perhaps true for those questioning the importance of EBM and research synthesis.

1 Like

We probably have different ideas about what EBM is. In my view, EBM is the use of the research literature by physicians as the basis for decision-making in medicine, and it therefore requires physicians to have a clear understanding of the science behind clinical research (known as clinical epidemiology, which subsumes clinical biostatistics). Evidence-based practice is thus the output of a physician appropriately trained in clinical epidemiology, and this combination constitutes EBM.

Evidence is anything published in the literature, and there definitely are hierarchies. These exist by design, because study designs are, by definition, hierarchical in their resistance to bias (not the statistical bias that contributes to the MSE, but rather the biases that lead to non-causal associations).

The evidence in a research synthesis is the highest level of evidence because the scientist who undertakes it has the capability (based on expertise in both research science and the content area) to provide researchers with sufficient information to assess what contribution any new results can make to the totality of information. This permits reliable interpretation of the significance of new research, and indeed of whether, and what aspect of, new research is needed on a topic. That alone is sufficient justification to place the synthesis at the top level of evidence, irrespective of the additional benefits in terms of epidemiological and statistical mitigation of bias. The method itself is thus secondary when considering where to place evidence syntheses in an EBM hierarchy of evidence sources.

1 Like

I have problems with agents asserting grandiose claims about terms like “evidence” or “reasoning” absent a sound logical foundation, and then becoming belligerent when I do not concede to their pretense of authority.

RE: physician competence regarding statistical analyses – think carefully about these comments by a mathematician and epidemiologist.

It is no shame to admit ignorance of many topics. I’m no expert in auto mechanics, carpentry, or archaeology, to name a few. But I don’t pretend to be so, and people do not risk blood and/or treasure on my non-existent recommendations.

Contrast this admission of ignorance to the behavior of surgeons in response to a clear error in statistical reasoning being pointed out:

I used to think Sander was exaggerating when he accused JAMA of manslaughter for how they report data, but after the past few years, that indictment might need to be expanded.

EBM was touted as a “revolution”, saving people from arbitrary medical authority.

The problem with revolutions is you end up back where you started. Instead of individual doctors making local decisions (that may or may not be rational), front line doctors are now governed by unaccountable medical bureaucrats, who make claims about “best practice” with no skin in the game, creating a failure point that increases the risk, compared to those “bad” old days.

1 Like

No one is asking them to assume such a role, nor are they implicitly in such a role because of their status as clinicians (regardless of what the clinician may believe about his/her research aptitude). However, there will be no sound clinical decision-making if the clinician does not understand the basics of research methodology, and that is why the EBM movement started. There are only two options:
a) Train clinicians to use the medical literature and thus train them to the level (at least) where epidemiological and biostatistical results are clearly understood
b) Hire a fortune-teller to sit in their clinics

Clearly, no one will endorse b)…

Wow- you’ve got a pretty low opinion of physicians. Sounds like you’ve really got our number and don’t like what you see… :neutral_face:

Dangerous overconfidence is not a problem that’s unique to medicine. It’s present in every field. It seems particularly prominent among those whose study has been self-directed. Such people might not have faced important consequences for errors in their thinking and might not have had their misunderstandings corrected through the formal guidance of a mentor with long-term, immersive applied experience.

On the contrary, I think the vast majority of physicians want to do well by their patients, but economic forces have manipulated them into a position to be unwitting tools of agents who do not necessarily have individual well-being in mind.

Good intentions and academic achievement do not necessarily correlate with independent thinking, or statistical skill, however.

2 Likes

Agree with many of your points. Arguably, the main roles filled by systematic review and meta-analysis in medical decision-making are:

  1. To show the extent to which research in an area does or doesn’t tend to “point in the same direction;”
  2. To take stock of questions that have already been addressed so as to prevent costly re-invention of the wheel;
  3. To help identify knowledge gaps that can guide the design of future studies.

As is true for primary studies, high quality meta-analyses will have had input from people with both subject matter and methodologic expertise. Unfortunately, people aren’t always good at assessing their own expertise.

Problems arise when clinicians over-estimate their statistical/epidemiologic ability and try to conduct research without the help of those with methods expertise. Similarly, an RCT designed solely by a group of statisticians would most likely be useless to clinicians.

Arguably, the problem is significantly worse for observational studies. Since patients aren’t being exposed to an intervention, some observational researchers don’t seem to feel it necessary to solicit input/advice from clinicians when designing their studies, even though the questions they are trying to answer have clinical implications. Reading many syntheses/reviews of observational bodies of evidence, it often becomes painfully clear that nobody, over many years, bothered to take stock (or cared) whether any particular study result had any meaningful impact on patients. Lots of CVs were burnished as money was burned.

When it comes to RCTs, well-conducted evidence reviews often CAN influence practice. I’m thinking back to this really useful publication from 2016:

https://www.ahajournals.org/doi/full/10.1161/CIRCOUTCOMES.116.002901

As an endocrinologist (I think?), you’ll recall that it was around this time that several large RCTs called into question the “glucocentric” view of type 2 diabetes management, with regard to effects on hard clinical outcomes. This approach to diabetes care had become so entrenched in clinical practice that it wasn’t until these types of evidence summaries started appearing that it felt like doctors finally stepped back and re-examined their approach to this disease. Of course, you’ll recall that it was also around this time that we finally started to see some of the newer diabetes medications (e.g., SGLT-2 inhibitors, GLP-1 agonists) demonstrating important benefits on hard outcomes for patients.

Ironically, you’ll also remember that our discovery of the cardiovascular benefits of these new diabetes medicines was “accidental” and stemmed from FDA’s need to respond to a highly controversial meta-analysis of the safety of an older diabetes medicine (rosiglitazone):

https://www.nejm.org/doi/full/10.1056/nejmoa072761

This “post-hoc” meta-analysis (i.e., it was designed AFTER conduct of its component trials) caused MUCH consternation in the clinical community. After rosiglitazone had been heavily promoted to doctors for many years, its cardiovascular safety was suddenly called into question by the “Nissen meta-analysis” (as it came to be known). Many advisory panels were convened to discuss next steps, ultimately culminating in FDA guidances on the conduct of trials for new diabetes medicines and meta-analyses for assessing safety endpoints:

https://www.fda.gov/media/117976/download

To prevent recurrence of this type of scenario with future diabetes drugs, FDA began requiring that trials for new diabetes drugs be designed so that drug-induced cardiovascular risk could be “capped” at a certain level; in other words, trials had to capture enough cardiovascular events to “rule out” more than a specified degree of drug-induced CV risk. It was this requirement that ultimately ended up revealing the CV benefits of SGLT-2 inhibitors and GLP-1 agonists (benefits which might not have been identified if not for FDA’s new requirements).

https://www.ahajournals.org/doi/10.1161/CIRCULATIONAHA.119.041022
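As a rough illustration of what “capping” the risk demands of a design (my own back-of-the-envelope sketch; the margin, alpha, and power below are illustrative choices, not quotes from the guidance), the number of events needed to exclude a given hazard-ratio margin when the drug truly adds no risk can be approximated with the usual Schoenfeld-type formula:

```python
# A rough back-of-the-envelope sketch (my own illustration, not the FDA's method):
# approximate number of cardiovascular events needed so that, if a drug truly adds
# no CV risk, the trial can be expected to exclude a prespecified hazard-ratio "cap".
# The margin, alpha, and power below are illustrative, not quoted from the guidance.
from math import log
from scipy.stats import norm

def events_to_rule_out(margin, alpha_one_sided=0.025, power=0.90, alloc=0.5, true_hr=1.0):
    """Schoenfeld-type approximation for the required number of events (1:1 allocation)."""
    z_a, z_b = norm.ppf(1 - alpha_one_sided), norm.ppf(power)
    return (z_a + z_b) ** 2 / (alloc * (1 - alloc) * (log(margin) - log(true_hr)) ** 2)

print(round(events_to_rule_out(margin=1.3)))   # roughly 600 events for a tight illustrative cap
print(round(events_to_rule_out(margin=1.8)))   # far fewer (~120) for a looser cap
```

The point of the sketch is simply that the tighter the cap, the more events must be accrued, which is part of why these cardiovascular outcome trials became large enough to detect benefit as well as harm.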

And so, out of the ashes of one of the biggest debacles in the history of drug safety, emerged, arguably, two of the most important classes of drugs developed in the past several decades. And it all started with a meta-analysis…

2 Likes

So does that make me an “unwitting tool” who can’t think for herself? Friendly suggestion- you might want to tone down the doctor-bashing if you want people to continue to engage with you here…We’re all trying to learn from each other in order to help people. Repeatedly impugning the competence/critical thinking skills of an entire profession isn’t conducive to collegial interaction.

Yes, I am also an endocrine physician, and there are a series of examples in this area, e.g. low-dose versus high-dose radioactive iodine ablation after surgery for DTC. Several meta-analyses preceded the two NEJM trials (2012), and the issue is still not resolved many years later, with more syntheses as well as primary studies appearing. What is badly needed now is a robust and reliable tool to determine the exit status of a meta-analysis, so that once a meta-analysis is tagged “exit”, all future trials, studies, and syntheses on that question can cease, the cumulative evidence being considered conclusive. We have been awarded a grant to work on this, and hopefully we can figure out an answer.
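Whatever that tool ends up looking like, one simple way to frame an “exit” check (this is only my sketch of the general idea, not a description of the funded work) is as a cumulative meta-analysis in which the pooled estimate is updated trial by trial and the question is flagged as closed once the pooled confidence interval clears a prespecified decision threshold. All numbers below are made up.

```python
# A minimal sketch of one way to frame an "exit" rule (my framing, not the grant's):
# update a fixed-effect (inverse-variance) pooled estimate trial by trial and flag the
# question as "exit" once the pooled 95% CI lies entirely on one side of a prespecified
# decision threshold. Effect sizes, standard errors, and the threshold are hypothetical.
import math

trials = [(-0.10, 0.20), (-0.25, 0.15), (-0.18, 0.12), (-0.22, 0.10)]  # (log HR, SE), made up
threshold = 0.0            # decision threshold on the log-HR scale (no effect); illustrative

w_sum = wy_sum = 0.0
for i, (est, se) in enumerate(trials, start=1):
    w = 1.0 / se ** 2                          # inverse-variance weight
    w_sum, wy_sum = w_sum + w, wy_sum + w * est
    pooled, pooled_se = wy_sum / w_sum, math.sqrt(1.0 / w_sum)
    lo, hi = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
    status = "exit" if (hi < threshold or lo > threshold) else "keep studying"
    print(f"after trial {i}: pooled = {pooled:+.2f} (95% CI {lo:+.2f} to {hi:+.2f}) -> {status}")
```

A real rule would of course also need to handle repeated looks (e.g., trial-sequential-analysis style boundaries) and between-trial heterogeneity; the sketch only shows the bookkeeping.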

1 Like

If you think pointing out logical and mathematical flaws in the journals and textbooks that you still accept is “doctor bashing”, that says nothing about me. You are still free to point out an error in my reasoning (as an honest scholar would).

The fact is, most clinicians are simply too busy doing patient care or admin duties to also become competent in data analysis. Nor can you make informed, independent decisions when critical data is not published.

When I did clinical care, I was. What is taught in med school or even CE classes is not enough.

What is enough? Just to become an “Associate” of the Society of Actuaries or Casualty Actuarial Society (the agents who make sure risk is managed properly) requires most people (with quantitative aptitude) close to 4 years. Certainly, that is overkill for clinicians, but a calculus-based math-stat course is the minimum.

This assumes instructors have an adequate understanding of statistics and mathematics. I think the past 100 years indicate they do not.

Bernardo, J. M. (2003). [Reflections on Fourteen Cryptic Issues concerning the Nature of Statistical Inference]: Discussion. International Statistical Review/Revue Internationale de Statistique, 71(2), 307-314.

Established on a solid mathematical basis, Bayesian decision theory provides a privileged platform from which to discuss statistical inference.

When I pointed this out in another thread, this was the reply:

"Bayesian decision making” it’s not very common in med research, as far as I can see. And it is also not very commonly meant in intro statistics books

Contrast this with Senn’s quote from a guest post on Deborah Mayo’s blog:

Before, however, explaining why I disagree with Rocca and Anjum on RCTs, I want to make clear that I agree with much of what they say. I loathe these pyramids of evidence, beloved by some members of the evidence-based movement, which have RCTs at the apex or possibly occupying a second place just underneath meta-analyses of RCTs. In fact, although I am a great fan of RCTs and (usually) of intention to treat analysis, I am convinced that RCTs alone are not enough.

I don’t like arguments from authority, but I’ve cited enough statistical experts that anyone who thinks I’m incorrect is honor-bound to give an explicit logical argument refuting my claim.

Using pre-data design criteria as an ordinal, qualitative measure of validity for an individual study has no basis in mathematics.

Goutis, C., & Casella, G. (1995). Frequentist Post-Data Inference. International Statistical Review / Revue Internationale de Statistique, 63(3), 325–344. https://doi.org/10.2307/1403483

The end result of an experiment is an inference, which is typically made after the data have been seen (a post-data inference). Classical frequency theory has evolved around pre-data inferences, those that can be made in the planning stages of an experiment, before data are collected. Such pre-data inferences are often not reasonable as post-data inferences, leaving a frequentist with no inference conditional on the observed data.

This is why James Berger (who literally wrote the book on statistical decision theory) went through great effort in working out conditional frequentist methods in the realm of testing. The paper is close to 30 years old, but we are still debating the proper interpretation of p-values.
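To make the pre-data versus post-data distinction concrete, here is a minimal simulation sketch of the classic “two measuring instruments” example (my illustration, often attributed to Cox and discussed in the conditional-inference literature; it is not Berger’s testing construction): a procedure can be perfectly calibrated before the data arrive and still be a poor summary of what we know once we see which instrument produced the measurement.

```python
# Minimal simulation of the classic "two instruments" example (my illustration of the
# pre-data vs post-data point in the Goutis & Casella quote, not their example verbatim).
# A fair coin picks a precise (sd = 1) or an imprecise (sd = 10) measuring instrument.
# The half-width c is calibrated so that the interval y +/- c has 95% coverage BEFORE
# the data are seen, yet coverage conditional on the instrument actually used differs.
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

sd_precise, sd_imprecise, theta = 1.0, 10.0, 0.0

def coverage_gap(c):
    # unconditional (pre-data) coverage of y +/- c, minus the 95% target
    cov = 0.5 * (2 * norm.cdf(c / sd_precise) - 1) + 0.5 * (2 * norm.cdf(c / sd_imprecise) - 1)
    return cov - 0.95

c = brentq(coverage_gap, 0.1, 100.0)                       # about 16.4

rng = np.random.default_rng(0)
imprecise = rng.integers(0, 2, 200_000).astype(bool)       # which instrument the coin chose
y = rng.normal(theta, np.where(imprecise, sd_imprecise, sd_precise))
covered = np.abs(y - theta) <= c

print(f"half-width c = {c:.1f}")
print(f"pre-data (unconditional) coverage:       {covered.mean():.3f}")              # ~0.95
print(f"coverage given the precise instrument:   {covered[~imprecise].mean():.3f}")  # ~1.00
print(f"coverage given the imprecise instrument: {covered[imprecise].mean():.3f}")   # ~0.90
```

The 95% label is honest as a pre-data statement, but once you know the precise instrument was used, reporting “95%” understates what the data warrant; that gap is what conditional frequentist methods try to close.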

You cannot improve clinical research without an understanding of decision theory, which links design of experiments (and the value of information) with the broader context. The fact that EBM went off and developed heuristics completely ignorant of well established mathematical results always seemed suspicious.

1 Like

EBM is about clinical decision-making using research evidence, and you are talking about “improving clinical research” - these are two completely different things. Clinicians are being taught how to use the literature (not really how to create it) in medical school and residency programs, and that needs to be done robustly in medicine as in any other field. However, if you insist that car drivers cannot drive well until they know how to build a car, then you are simply mistaken. Yes, a clinician with research skills will be better able to target research where it is needed, but that is not the main goal in medical schools and residency programs, though the emphasis has increased over the years. Finally, as @ESMD aptly put it, “repeatedly impugning the competence/critical thinking skills of an entire profession” is not helpful at all.

1 Like

This sounds like really important work. There are too many unanswered questions in medicine to keep studying the same ones ad infinitum, without clearly defining how we will recognize our answers when we see them. Hopefully your work will extend to the observational research space as well (where redundancy seems particularly pernicious) (?)

My point is that neither are done to any standard of competence because medical “thought leaders” decided it was unnecessary to learn the mathematical tools that other scientific fields take for granted.

I wouldn’t agree with that - EBM was introduced by medical thought leaders who decided it was necessary for physicians to learn to use the quantitative tools of epidemiology and biostatistics for optimal decision-making. For example, Clinimetrics was written in 1987 by the founder of clinical epidemiology, Alvan R. Feinstein, around the time that EBM was born.

Explain this result then (from June 2022):

Findings: In this survey study of 215 physicians, most respondents (78.1%) estimated the probability of a medical outcome resulting from a 2-step sequence to be greater than the probability of at least 1 of the 2 component events, a result that was mathematically incoherent (ie, formally illogical and mathematically incorrect).
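For readers who have not seen the paper, the incoherence does not depend on any clinical detail; it is a one-line consequence of the probability axioms, illustrated below with made-up numbers (the survey’s actual clinical vignette is not reproduced here).

```python
# Tiny illustration with hypothetical numbers: for any two events A and B,
# P(A and B) <= P(A) <= P(A or B), so judging a two-step sequence to be MORE probable
# than "at least one of the two steps" is incoherent whatever the clinical details.
p_a, p_b, p_both = 0.60, 0.55, 0.30      # made-up probabilities; p_both <= min(p_a, p_b)
p_at_least_one = p_a + p_b - p_both      # inclusion-exclusion: 0.85
assert p_both <= min(p_a, p_b) <= p_at_least_one
print(p_both, p_at_least_one)            # 0.30 vs 0.85
```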

1 Like

Statistics is quite hard and even trained statisticians often misinterpret p-values. This is not new. Physicians spend a lot of time learning biology, physiology, drug interactions, pathophysiology etc. This context is essential in every aspect of medicine, including clinical decision-making. Without it, words like p-values or Bayesian decision-making are empty.

Knowledge generation and interpretation is largely based on teamwork. Or at least it is certainly fun when that is so. This includes working with physicians, pharmacists, statisticians, epidemiologists, bioinformaticians, biologists, computer scientists, etc. We learn from each one.

And with enough such experience one begins to see that there are different types of statisticians in the same way there are different types of physicians. I would definitely not want a theoretical statistician with little applied experience to design, conduct, or interpret my clinical trials. For that I go to applied biostatisticians with specific skills. In the same way, while we should certainly listen carefully when an ID physician recommends an antibiotic for a hospitalized patient with cancer, it would not be as wise to let them choose the chemotherapy alone.

4 Likes

I will be the first to admit that statistics is hard, but the JAMA problems are not that hard.

What kind of statistics is going to be taught when the students can’t do algebra?

The idea behind the proposal is twofold. First, algebra generates more student failure and attrition than almost anything else. (One of the guest speakers at Aspen said that his one piece of advice to any college president looking to improve graduation rates would be to fire the math department. We laughed, but he didn’t seem to be kidding.) Second, in many fields, algebra is less useful than statistics.