Another null misinterpretation in a study about the association of epidural analgesia and offspring risk of autism

A recent cohort study published in JAMA Pediatrics investigated the possible association between epidural labor analgesia (ELA) and offspring risk of autism spectrum disorder (ASD). The study included 123,175 offspring born between 2005 and 2016.

The authors summarize their main finding as follows:

[…] after accounting for maternal sociodemographic, preexisting, pregnancy-related, and birth-specific factors, no association was found between ELA exposure and offspring risk of ASD.


Results of this study suggest that ELA is not associated with an increased risk of ASD in offspring.

This is based on the following result:

After adjusting for maternal sociodemographic, prepregnancy, pregnancy, and perinatal covariates, ELA was not associated with an offspring risk of ASD (inverse probability of treatment–weighted HR, 1.08; 95% CI, 0.97-1.20).

This seems like a blatant contradiction to me and another case of null misinterpretation: a non-significant finding is interpreted to mean “no association” or “no effect”. Let’s have a look at the p-value function:
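Since I can only describe the figure here, a minimal sketch of how such a p-value function can be computed from the reported result alone (an assumption on my part: approximate normality on the log-HR scale, with the standard error recovered from the published 95% CI):

```python
from statistics import NormalDist
import math

def pvalue_function(hr_obs, ci_lo, ci_hi, hr_grid):
    """Two-sided p-value for each candidate null HR in hr_grid, assuming
    approximate normality on the log-HR scale; the SE is recovered from
    the reported 95% CI."""
    se = (math.log(ci_hi) - math.log(ci_lo)) / (2 * 1.96)
    phi = NormalDist().cdf
    return [2 * (1 - phi(abs(math.log(hr_obs) - math.log(h0)) / se))
            for h0 in hr_grid]

# Reported result: HR 1.08, 95% CI 0.97-1.20
ps = pvalue_function(1.08, 0.97, 1.20, [0.97, 1.00, 1.08, 1.20])
# The function peaks at p = 1 at the point estimate (HR = 1.08) and, by
# construction, gives p ≈ 0.05 at the CI limits (up to rounding in the
# published figures).
```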


The counternull, the HR that has the same p-value as the null (i.e., HR = 1), is roughly 1.17. Thus, the study supports the conclusion of “no association” with the same amount of evidence as a 17% increase in ASD risk among the exposed (it offers roughly 2.73 bits against both hypotheses). The p-value function also shows higher compatibility with HR > 1 than with HR < 1.
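Those figures can be checked directly from the reported interval (a minimal sketch, again assuming approximate normality on the log-HR scale; the small gap to the quoted 2.73 bits presumably comes from rounding in the published CI):

```python
from statistics import NormalDist
import math

hr, ci_lo, ci_hi = 1.08, 0.97, 1.20                    # reported HR and 95% CI
se = (math.log(ci_hi) - math.log(ci_lo)) / (2 * 1.96)  # SE on the log-HR scale

# Counternull (Rosenthal & Rubin): the estimate reflected about the null
# (HR = 1) on the log scale, i.e. exp(2*ln(HR)) = HR**2.
counternull = math.exp(2 * math.log(hr))               # ~1.17

# Two-sided p-value against HR = 1 and its S-value (bits of information
# against that hypothesis); by symmetry the counternull gets the same p.
z = math.log(hr) / se
p = 2 * (1 - NormalDist().cdf(abs(z)))                 # ~0.16
s_bits = -math.log2(p)                                 # ~2.7 bits
```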

Disclaimer: My discussion is solely based on the information in the abstract, as I don’t have access to the full paper. Maybe the authors are more careful and nuanced in the main text.


How do journals keep making the same mistake over and over?

The one positive aspect of the interpretation is the narrowness of the confidence interval.


I was hoping you had an answer to this question. My partial explanation: NP (Neyman–Pearson) and Fisher’s procedures are taught without understanding them in a Bayesian context.

The counternull, the HR that has the same p-value as the null (i.e. HR = 1), is roughly 1.17.

For those interested, the ‘counternull’ paper is also worth reading, and is open access via the link as I write this. I think this is a complement to Matthews’ Bayesian Analysis of Credibility, discussed in this thread:


Interpretation of the “confidence interval” aside, the MUCH larger issue here is the question of why studies like this are being done in the first place.

From a clinical perspective, these types of studies drive me absolutely bonkers. Many of them originate from my country and many of them appear in this particular journal.

I couldn’t access the rest of the paper but the abstract suggests that previous observational studies might have found “statistically significant” associations between epidural analgesia and autism. Based on past experience reviewing many similar “risk factor” perinatal epi studies, here’s how the story tends to play out:

  • Researchers have access to a database that includes pregnant women;
  • Researchers scour the database and find “associations” between some type of drug/exposure among pregnant women and autism (or take your pick of any other poorly-understood outcome that expectant mothers are terrified of…); ORs for initial studies are often around 2-4;
  • Researchers do a press release and run to media to hype their findings;
  • Women see these reports in the news and become terrified, flooding doctors’ offices with frantic phone calls (ask any family physician/OB/Gyn/psychiatrist) and causing many to abandon therapies that they might really need;
  • Huge sums of money are directed, over the next decade, toward researchers who run multiple additional database studies of the same question in an attempt to address the “residual confounding” that might have accounted for the earlier finding. Usually they tweak some aspect of the study design that they feel is essential to getting the “right” answer. The recurring line in every conclusion is “more research is needed…”. These research funds divert money away from other topics (e.g., how best to financially support poor pregnant women);
  • As time goes on, point estimates in subsequent studies get closer and closer to 1. Eventually, confidence intervals cross the null, at which point everyone feels relieved (but many women have been harmed in the interim).

Let’s get real. What are doctors and patients supposed to do with the results of such studies? Are we really likely to abandon epidural analgesia based on results of any observational study that suggests an observational “link” with autism? Are we really so confident in these research methods that we would deny women pain control during labour based on this type of result? Are we ever going to be able to run an RCT to answer this question? The answer to all these questions is “no.”

Let’s pretend we’re the doctor fielding those frantic phone calls from pregnant women that always come in after JAMA publishes their initial studies with OR 2-4 (and the author does the media rounds). What do we tell these terrified women? Do we say: “epidurals during labour might be associated with an increased risk of autism in your child, so you really might want to think twice before getting that epidural” (?) Really?? Knowing the above history of “risk factor” epi, we’re really going to say this?

Apologies for being so blunt, but this type of study is emblematic of exactly what’s wrong with epidemiology these days. Before running these types of studies, researchers really need to get a handle on how their results are going to be interpreted by the people they affect most, and whether any particular finding is likely to be clinically actionable. Otherwise, the net result will be harm to patients.


There was a big uproar within the pediatrics community when this paper was published last year, with a number of letters to the editor and even a recent editorial note: More on Epidurals and Autism | Autism Spectrum Disorders | JAMA Pediatrics | JAMA Network


After adjusting for maternal sociodemographic, prepregnancy, pregnancy, and perinatal covariates, ELA was not associated with an offspring risk of ASD (inverse probability of treatment–weighted HR, 1.08; 95% CI, 0.97-1.20).

Using these values to derive the “honest skeptic” prior, I calculate a skeptical bound of a prior odds no higher than 1.16. Since the observed OR falls within the skeptical bound, the language, while not entirely correct, isn’t as misleading as it was in the COVID and smoking meta-analysis discussed in this thread.


“How do journals keep making the same mistake over and over?”