Statin-related side effects: the recent Lancet publication is biased toward false negatives

Dear DataMethods Users,

A recent Lancet meta-analysis assessed whether adverse events listed in statin product labels are causally attributable to statin therapy, using double-blind randomized trials and false discovery rate control across multiple outcomes. The authors concluded that the evidence does not support causal associations for most labeled adverse effects, a conclusion that was rapidly echoed in headlines such as “Statins do not cause the majority of side effects listed in package leaflets”.

However, in our new letter we argue that this conclusion arises from multiple compounding sources of bias that systematically favor false-negative findings in safety assessment. These include stringent multiplicity control, threshold-based overinterpretation of statistically “non-significant” results, dichotomous readings of interval estimates, and reliance on randomized trial settings with limited real-world transportability, together with intention-to-treat dilution under non-adherence.
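To make the multiplicity point concrete, here is a minimal sketch (with invented p-values, not those from the Lancet analysis) of how false discovery rate adjustment across many outcomes can render every modest safety signal “non-significant”:

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Illustrative per-outcome p-values for 20 hypothetical adverse events
# (invented for this sketch; not from the Lancet analysis).
pvals = np.array([0.004, 0.011, 0.02, 0.03, 0.04,   # modest safety signals
                  0.08, 0.12, 0.18, 0.25, 0.33,
                  0.41, 0.47, 0.55, 0.62, 0.70,
                  0.78, 0.84, 0.90, 0.95, 0.99])

unadjusted = pvals < 0.05
reject_bh, p_bh, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")

print("flagged without adjustment:", unadjusted.sum())               # 5
print("flagged after Benjamini-Hochberg control:", reject_bh.sum())  # 0
# All five modest signals lose 'significance' after adjustment: exactly
# the false-negative trade-off our letter highlights.
```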

Because “non-significant” does not mean “no causal effect”, we call for an interpretation framework centered on effect sizes and overall uncertainty, along with sensitivity analyses across alternative error trade-offs, risk strata, and dosages. Importantly, the consequences of inferential errors are not symmetric: false-positive signals primarily impose costs on researchers and regulators through additional scrutiny, whereas false-negative conclusions disproportionately affect the most vulnerable stakeholders, namely patients.

We look forward to your thoughts and welcome further discussion on this issue.

Rovetta et al., 2026, letter on ‘Assessment of adverse effects attributed to statin therapy in product labels’, v1.0.pdf (144.5 KB)

7 Likes

These are great points. I would also encourage us to try to answer a perhaps more clinically useful question: by how much does a statin increase the overall severity/frequency of side effects? A “side-effect ordinal scale”, such as one that captures the worst side effect that happened to each patient, would allow this to be done. We need to get away, in my view, from multiple univariate assessments that are difficult to trade off.
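To illustrate (a toy sketch with invented data and an assumed 0–3 severity coding; a proportional-odds model is one way to get a single treatment contrast from such a scale):

```python
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

# Invented "worst side effect per patient" data on an assumed ordinal
# scale: 0 = none, 1 = mild, 2 = moderate, 3 = severe.
rng = np.random.default_rng(1)
n = 500
treated = rng.integers(0, 2, size=n)
placebo_probs = np.array([0.70, 0.20, 0.07, 0.03])
statin_probs = np.array([0.62, 0.24, 0.10, 0.04])   # assumed upward shift
worst = np.where(treated == 1,
                 rng.choice(4, size=n, p=statin_probs),
                 rng.choice(4, size=n, p=placebo_probs))

# Proportional-odds (cumulative logit) model: a single odds ratio
# summarizing the treatment's shift in overall side-effect severity.
endog = pd.Series(pd.Categorical(worst, categories=[0, 1, 2, 3], ordered=True))
exog = pd.DataFrame({"treated": treated})
res = OrderedModel(endog, exog, distr="logit").fit(method="bfgs", disp=False)
print("severity odds ratio:", float(np.exp(res.params["treated"])))
```

One number then summarizes the whole side-effect burden, instead of dozens of univariate AE comparisons that are impossible to trade off against each other.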

My limited knowledge of the statin side effect literature comes from a paper comparing the occurrence of muscle pain on placebo versus statin, in which the incidence of muscle pain was higher in the placebo patients. That almost closes the case, doesn’t it?

6 Likes

As a clinician who worked in pharmacovigilance for many years, I’m very familiar with the methods used to assess drug-related safety signals. My work entailed causality assessment for potential drug-related adverse events and assessment of the observational and experimental safety evidence used to develop and revise drug product monographs (PMs). After reading the Lancet article and your rebuttal to it, I’m - how shall I say this - “concerned by your concern” :slight_smile:

I doubt that you’d feel a need to publish your letter if you had witnessed, first-hand, the pharmacovigilance work that’s done every day by international drug regulatory agencies. Enormous effort and manpower are dedicated to detecting and investigating potential adverse drug reactions. These assessments are performed both at the time of initial marketing and regularly in the postmarket setting. Drug sponsors are legally obliged to monitor and assess AE reports submitted to regulatory agencies and to their company, and also to monitor the medical literature for any potentially new safety signals related to their products. Regulators duplicate much of this effort by independently surveilling the medical literature and searching for, then assessing, relevant AE case reports. They open formal assessments of safety issues flagged in-house, by other international regulators, and by drug sponsors.

The amount of attention devoted to even the most obscure adverse events by regulatory agencies was, in my view as a physician, somewhat obscene, given:

  1. the very poor quality of most AE case reports;

  2. the lack of documented dechallenge/rechallenge for most reports (historical features that are usually needed to permit definitive causality assessment for individual cases); and

  3. the differential impact of PM AE lists on prescribers (virtually no impact) and patients (a huge psychological impact for certain subsets).

Many times, during my tenure in this field, I felt that we ran a real risk of harming, rather than protecting, patients if we weren’t very careful.

Given my own experience working in pharmacovigilance, I agree completely with the authors of the Lancet article. The precautionary principle has been such a powerful guiding force in the field of pharmacovigilance for so many years and statins have been on the market for so long, that current statin PMs have likely become grossly polluted with AEs that stemmed from “half-baked” drug safety investigations conducted many years ago using suboptimal lines of evidence. The RCT experience with statins is now SO huge, after all these years, that it seems very reasonable, to me, to revisit very long lists of uncommon AEs, now that more and more of these rare AEs have been accrued in the collective RCT record.

As a physician, the reason I feel so strongly about this topic is that I see the harmful psychological effects that patient information leaflets can have on patients’ health-related decisions. Many pharmacists provide these leaflets when they dispense a new drug to a patient. The leaflets often list a dizzying array of uncommon AEs. Some patients become overwhelmed by these lists and, unable to contextualize them appropriately, decide not to take the medication that they’ve just been prescribed. As a result, their symptoms remain unaddressed, or they might forgo a medication that could prevent important disease. An even more common phenomenon is the patient who develops a physical symptom, looks up the PMs of all his prescribed medications, sees the AE listed in one of the PMs, assumes that his drug must be responsible for his symptoms, and decides, unilaterally, to stop his treatment without first consulting his physician. If he suddenly develops embarrassing flatulence, he might be more apt to blame a medication he’s been taking for years, without incident, than his sudden new fondness for lentils - simply because he saw “flatulence” on the PM AE list. People of all educational levels and backgrounds can be very susceptible to the post hoc ergo propter hoc fallacy. Indeed, the powerful human tendency to assume consequence from precedence is often impossible to dispel, even through patient explanation.

For these reasons, it seems not only plausible, but very likely, that lives could be saved by removing, from statin PMs, mention of AEs that lack solid current-day evidence for causality.

I respect your effort to encourage appropriate use of statistics when assessing drug safety. And I believe your motivations are sincere, since your letter acknowledges the clear cardiovascular benefits of statins. But I fear that your motivation will still be questioned by some since your letter implicitly suggests that the authors of the Lancet study are using opaque statistics to obscure the “true” risks of statins. To this end, I fear that your letter could further fuel the dangerous conspiracy theories that have swirled for decades around this class of drugs.

8 Likes

Dear Erin,

First of all, I want to make it clear that I, too, am convinced of your genuine intentions in writing your comment.

That said, I disagree with almost everything you wrote. To begin with, if the prior assumption is that plausible AEs are virtually absent based on the existing literature, then there is no need to resort to a series of statistical rituals grounded in nullism, dichotomization, reification, and the significance fallacy to claim as much. If there is an initiative aimed at evaluating adverse effects, then this should be conducted with the highest level of methodological rigor. I find it incorrect - and dangerous - to criticize papers that point out inconsistencies based on the very strong assumption “any method is acceptable since the results will ultimately be interpreted in light of the belief that AEs are rare.”

Second, I cannot endorse your general narrative for three main reasons: i) the accounts of clinicians and methodologists I know - some of whom have worked directly for regulatory agencies; ii) what has already occurred in the history of medicine (including striking examples, like the Vioxx case); and iii) the evidence that has emerged in recent years regarding phenomena such as regulatory capture and the outsized influence of large pharmaceutical companies on decision-making bodies (e.g., [1], [2], [3], [4]). The current system of evidence generation is not neutral and tends to overstate the strength of the evidence produced, following very clear incentives - ranging from securing funding to the need to take political positions.

To give just one good example among many, consider the case of the WHO, which at the beginning of the pandemic stated, “it is very clear right now that we have no sustained human-to-human transmission” with reference to SARS-CoV-2, and published outreach posts containing statements such as “FACT: #COVID19 is NOT airborne.”

There would be further aspects to address (e.g., the establishment of cost-benefit assessments based not on surrogate endpoints but on patient-centered and multivariate endpoints). Nonetheless, I think what I have outlined above is more than sufficient to argue that, for the good-faith position I believe underlies your comment to be fully justified methodologically, a far greater effort is required than simply adhering to priors that are convenient for the most powerful and well-established stakeholders.

Openly calling out what one considers to be methodological flaws in research - regardless of the topic under examination - is, and must remain, one of the indispensable ethical and epistemic principles of scientific conduct, in order to safeguard both public health and the production of knowledge.

1 Like

Dear Frank,

I am not familiar with the history of the statin literature, so I will refrain from commenting on that specifically. That said, I consider it highly implausible that a single paper could be regarded as conclusive, starting with the well-known problems of transportability. The effort I am asking for is simply this: do not canonize the literature, because 1) even under the best of circumstances, it is extraordinarily difficult to guarantee its robustness - the history of medicine makes this clear with cases such as the Women’s Health Initiative - and 2) the current landscape is driven by high-level interests that range - as I argued in my response to Erin - from the pursuit of funding to the endorsement of political and ideological positions.

We are developing increasingly sophisticated methods that rest on extremely strong and difficult-to-handle assumptions (e.g., g-methods), in a context where the scientific community has proven capable of adopting and institutionalizing rituals - such as p < 0.05 - that had far less justification (virtually none, I would argue). More than 100 years after the earliest warnings about the distinction between “statistical significance” and “practical significance,” we still find prestigious journals that fail to differentiate between the two concepts.

For these reasons, I continue to regard my strong concerns about the entire process that generates data and results as well founded, and I believe it is a deontological duty of methodologists - especially those far more skilled than I am - to maintain the highest level of vigilance.

2 Likes

As a patient advocate on the CIRB, I reviewed consent documents and came away with a favorable view of how possible adverse effects are described to the patient, as directed by the NCI template … including those that are rare and severe or life-threatening. In these documents we used simple fractions to describe the risk, as in “In 100 persons taking this drug, up to 20 have had …” You get the idea.

To attribute rare effects to a drug is challenging, and sometimes impossible with current knowledge, if there’s no plausible mechanism that points to a cause-and-effect relationship. So for these effects, rare and without a known mechanism, the patient needs to understand that chance is a common and likely explanation. Perhaps with language along these lines: that something happened after X doesn’t mean that X caused it. “The rooster’s crowing doesn’t cause the sun to rise.”

Anyhow, my point in commenting is that effort is, I feel, best directed at the language describing possible side effects; when attribution is uncertain and the mechanism is unknown for these rare effects, that has to be explained in plain language.

3 Likes

(To this, I’d add the story Irving Kirsch tells in The Emperor’s New Drugs, which I’ve been meaning to post here - for a variety of reasons - since finishing the book last summer. That story indeed illustrates your point iii as well.)

1 Like

Even though I hate null hypothesis significance testing, I don’t believe that the statistical methods used for analysis of AEs are the explanation for not finding safety concerns, and I find @ESMD most convincing. A better Bayesian analysis would be even more convincing, e.g., for each of many AEs compute P(absolute risk increase > 0.01 or odds ratio > 1.15), scrapping p-values, which can detect trivial risk increases.

@ESMD I wonder if you agree with me that reporting counts of AEs in labels shares some of the fault. I’ve always thought that absolute risks should be emphasized much more.
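For concreteness, here is a minimal sketch of that computation, assuming independent Beta(1, 1) priors on each arm’s AE probability and using invented counts (not real statin data):

```python
import numpy as np

rng = np.random.default_rng(42)

# Invented counts for one adverse event: events / patients per arm.
e_trt, n_trt = 55, 2000   # statin arm
e_ctl, n_ctl = 40, 2000   # placebo arm

# Beta(1, 1) priors give conjugate Beta posteriors for each arm's risk.
p_trt = rng.beta(1 + e_trt, 1 + n_trt - e_trt, size=100_000)
p_ctl = rng.beta(1 + e_ctl, 1 + n_ctl - e_ctl, size=100_000)

ari = p_trt - p_ctl                                 # absolute risk increase
odds_ratio = (p_trt / (1 - p_trt)) / (p_ctl / (1 - p_ctl))

# Posterior probability of a non-trivial harm at the thresholds above.
print("P(ARI > 0.01 or OR > 1.15 | data) =",
      round(float(np.mean((ari > 0.01) | (odds_ratio > 1.15))), 2))
```

Unlike a p-value, this directly targets a clinically non-trivial effect size rather than any departure from the null.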

5 Likes

Providing absolute risk differences would help contextualize AE info. Another form of contextualization (not sure a Fundamentalist Bayesian would agree tho’) would be substantive knowledge about mechanism: is this rare AE well understood, or is it (so far) just a data artifact lacking a plausible mechanistic explanation?

4 Likes

Product monographs do present the absolute rates of AEs recorded in clinical trials, for both trial arms. This is helpful. The less helpful portion of the PM is the “postmarket AEs” section- a long list of AEs based on spontaneous postmarket reports, for which little evidence of causality might exist. The Lancet authors seem to be suggesting that causality for listed postmarket AEs needs to be reassessed now that clinical trials have accrued sufficient numbers of these AEs.

An important caveat: between-arm AE rate comparisons using clinical trial databases will not be sufficient for detecting ALL possible drug-related harms. Specifically, idiosyncratic harms can still occur for many drugs on the market (e.g., idiosyncratic drug-induced liver injury). These AEs will usually be so rare as to be undetectable in typically-sized clinical trial databases. If an AE case report is very well documented and contains important historical details like positive dechallenge and/or positive rechallenge, we can still make inferences of causality at the level of individual patients. AEs like this should be flagged in PMs, but their mention should include appropriate caveats about their extreme rarity and (in most cases) unpredictability.
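To put a number on the rarity point: with zero events observed among n patients, the one-sided 95% upper confidence bound on the event rate is 1 − 0.05^(1/n), roughly 3/n (the “rule of three”). A quick sketch with an assumed database size:

```python
# "Rule of three": zero observed events in n patients still leaves room
# for a real but rare harm. The database size is assumed for illustration.
n = 10_000
exact_upper = 1 - 0.05 ** (1 / n)   # one-sided 95% upper bound on the rate
rule_of_three = 3 / n

print(f"exact 95% upper bound: {exact_upper:.6f}")          # ~0.000300
print(f"rule-of-three approximation: {rule_of_three:.6f}")  # 0.000300
# Even a pristine 10,000-patient safety record cannot rule out an
# idiosyncratic reaction affecting roughly 1 in 3,300 patients.
```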

6 Likes

Great points. For common side effects such as muscle pain or weakness, I will take the RCT data any time over observational, casually reported data. Comparative (against placebo) AE differences are so much more trustworthy.

4 Likes

Shouldn’t these randomized trials be designed to be powered to detect adverse outcomes? Are they not primarily powered to detect favorable effects?

Sorry if my question is inane, but my knowledge of and experience with double-blind randomized controlled trials is limited.

1 Like

I really like this N-of-1 trial in NEJM:

N-of-1 Trial of a Statin, Placebo, or No Treatment to Assess Side Effects

5 Likes

Wow, this is cool! It’s from my colleagues at Imperial College London, before I started working with them. They are real innovators. Randomizing to an empty bottle for nocebo - what a great and simple idea, and very symbolic. And once again, worries about self-reported muscle pain/weakness from statins have had serious shade put on them.

2 Likes

Adverse events (AEs) are untoward clinical events that occur at some point after a patient has taken a drug. Unlike the term “Adverse Drug Reaction,” the term “AE” does not imply that the drug necessarily caused the event. For example, traumatic injuries sustained by a clinical trial subject who happened to be a passenger in a motor vehicle accident would be classified as “adverse events.”

A wide variety of AEs is recorded during clinical trials. At the end of the trial, an observed between-arm imbalance in the rate of a given AE might constitute a “safety signal” for that AE. The imbalance provides a hint that the drug might be capable of causing that AE. However, since AE rates are generally much lower than the rates of the primary outcome for which the trial was powered, definitive group-level inferences regarding causality for an AE are not usually possible at the end of a trial. Caveat: some trial AE imbalances are potentially so concerning (e.g., mortality imbalances favouring placebo) that they might jeopardize approval, even if causality has not been definitively established.
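To put rough numbers on that power gap (invented rates and a standard two-proportion sample-size calculation; not any particular trial’s design):

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

solver = NormalIndPower()

# Hypothetical primary efficacy outcome: reduce event rate from 4% to 3%.
es_efficacy = proportion_effectsize(0.04, 0.03)
n_efficacy = solver.solve_power(es_efficacy, alpha=0.05, power=0.8)

# Hypothetical rare AE: a doubling from 0.1% to 0.2%.
es_ae = proportion_effectsize(0.002, 0.001)
n_ae = solver.solve_power(es_ae, alpha=0.05, power=0.8)

print(f"per-arm n for the efficacy outcome: {n_efficacy:,.0f}")
print(f"per-arm n to detect the rare-AE doubling: {n_ae:,.0f}")
# The rare AE requires a far larger trial, which is why single trials
# rarely support definitive causal claims about uncommon AEs.
```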

The number of patients exposed to statins in the context of clinical trials is by now so large that we have accrued a large number of reports for a very wide spectrum of AEs. And since the vast majority of these AEs have occurred at roughly equal rates among statin-treated patients and placebo-treated patients, we can infer that statins likely do not “cause” these AEs - at least not through a mechanism that is common enough to manifest as a measurable between-group difference.

2 Likes

I have read many methodologically excellent responses that nonetheless rest on assumptions that ignore the scenario I raised.

Even in light of what I documented in my previous comment, the impact of aspects such as vast systems of conflicts of interest and ideological commitments is not a conspiracy theory, but a hypothesis to be considered - at the methodological level - as a component of the mechanism that generates the data and conclusions we produce. I argue that any response that ignores that underlying political-economic-ideological (PEI) scenario is strongly biased (just as any response that adopts the PEI scenario as the sole explanation would be).

I also argue that the study in question adopts methods heavily skewed toward false negatives and incorporates rituals with no foundation in the literature - not even in the work of the authors who originally developed those methods and later saw them ritualized, such as Neyman, who in 1977 clarified the need to choose the target hypothesis carefully and referred to the ‘error of the first kind’ as the contextually most relevant one, not necessarily the false positive. This aspect is highly problematic and cannot be ignored in light of hypotheses that include - I emphasize ‘include’ - the PEI scenario among the possible mechanisms composing the data generator.

Expressions such as “solid studies” or “solid data” can only sound like political slogans when some of the fundamental methodological aspects on which that assessment of solidity rests remain unaddressed. I reiterate that this is not a conspiratorial outcry, but a call for methodological rigor that must characterize the conduct of researchers - rigor that entails considering all contextually relevant hypotheses, including uncomfortable ones. I ask for a significant effort from everyone, especially those far more competent than myself, to maintain great caution given the troubling scenarios that unfortunately affect the scientific world. Because neutrality, transparency, honesty, and integrity are essential qualities, just as much as competence (perhaps even more so nowadays, given the increasingly vast and complex web of trust on which we all depend).

1 Like

Sheesh, Alessandro, next thing you’ll be telling us is obstetricians should wash their hands! :wink:

Seriously, though, I’m looking at this conclusion [spoiler alert] from the NEJM piece: 90% of the symptom burden elicited by a statin challenge was also elicited by placebo.

This 90% figure is remarkably close to Kirsch’s estimate that 82% of SSRI efficacy is attributable to the placebo effect. This might be one avenue along which you could pursue your PEI hypothesis.

1 Like

In my opinion you are significantly overstating things.

1 Like

I think this is the crucial question. That trials are not powered for adverse outcomes is a political and economic choice. It is unethical: by the “first, do no harm” principle, a medication’s safety profile should be well characterized before clinical use.

Thalidomide, opiates, DES…

It’s complicated, because we don’t always know what the adverse outcomes will be. There should be some hints in the earlier-phase studies, but, beyond this, trials should be ongoing, with moving estimates of the probabilities of side effects as they emerge. The trial should not be stopped until the distributions of the major (or most harmful) side effects are well estimated. Afterwards, if a new, really harmful adverse outcome jumps out in post-trial surveillance (with some reason to suspect causality, as discussed by @ESMD), it would actually cause the original trial to regain equipoise, and that trial would begin enrolling new participants again.
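The “moving estimates” idea could be prototyped with something as simple as conjugate Beta-Binomial updating as AE data accrue (a toy sketch with invented counts; a real monitoring scheme would also need prespecified decision rules):

```python
from scipy.stats import beta

# Toy sequential monitoring of a single AE rate with conjugate
# Beta-Binomial updating. All counts are invented.
a, b = 1.0, 1.0                                       # weak Beta(1, 1) prior
batches = [(500, 2), (500, 1), (1000, 6), (1000, 4)]  # (patients, AE events)

for i, (n, events) in enumerate(batches, start=1):
    a += events
    b += n - events
    mean = a / (a + b)
    lo, hi = beta.ppf([0.025, 0.975], a, b)           # 95% credible interval
    print(f"after batch {i}: posterior mean {mean:.4f}, "
          f"95% CrI ({lo:.4f}, {hi:.4f})")
```

The interval narrows as patients accrue, giving exactly the kind of running picture of side-effect probabilities described above.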

3 Likes