As a PhD student in epidemiology, I was indoctrinated to value the importance of small effects in public health research, since small effects applied to entire populations can lead to substantial net impacts.
However, years on, I almost never believe the small effects that I see published - and even when I do believe them, I certainly wouldn’t fault anyone else for their skepticism. Given how we currently analyze and publish most data (or at least how it seems to me), small effects are just so likely to be the result of smoke and mirrors that a lot of other things have to be in place for me to take them seriously (strong theory, expert consensus, triangulation, trusted researchers, etc.).
With that in mind, a critical facet of epidemiology is its clear link to intervention, which in turn is linked to policy-making bodies (I am mostly thinking about public health policy requiring politicians to sign off), who seem to have a hard enough time incorporating evidence into their decision making. And for anything politicized (as public health will always be, since it attempts to weigh collective and individual responsibility), small effects that aren’t universally appreciated will always be at risk of being explained away by opponents. So while in theory I still think small effects can be important at the population level, I doubt their practical importance from an intervention/policy perspective.
So…should we abandon pursuit of small effects (and perhaps turn more attention to hunting for so-called big wins and improving implementation of what we already know works)?
While I generally agree, in some scenarios, a small effect is more plausible than a large effect. Not everything can have big effects on outcomes we care about, and large effect estimates can suggest that the effect observed is just noise. Obviously this depends heavily on how the study was conducted, but I’m just saying that “hunting for big wins” is not necessarily a good goal.
Also, what would it look like in practice to abandon the pursuit of small effects? Presumably we conduct studies to estimate effects that we don’t know the precise magnitude of, and we would want people to publish the results of that estimation regardless of whether it turned out to be small or big…
I certainly understand your skepticism, but I also think small effects really can be worth pursuing. I often think of some examples that Rothman, Greenland, & Lash provided in Modern Epidemiology, such as the relationship between smoking and cardiovascular disease, or between environmental smoke and lung cancer. The effects are not especially large, but they seem to be there and are thought to be causal.
I think it depends on whether i) there’s a public health intervention that could be applied, ii) the evidence of your small effect is compelling enough that it can’t be dismissed (i.e. is the study rigorous enough to believe that the small effect is real), and iii) the small effect scaled up to the population really results in millions of dollars saved or thousands of disease cases prevented.
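To put a rough number on point iii, here is a minimal back-of-the-envelope sketch using Levin's formula for the population attributable fraction. Every input below (population size, incidence, exposure prevalence, relative risk) is a made-up illustrative value, not an estimate from any study, and the calculation only means something if the relative risk is causal and unconfounded, which is exactly what points i and ii are asking about.

```python
# All inputs are hypothetical values chosen only to illustrate the arithmetic.
population = 10_000_000        # assumed population size
annual_incidence = 0.002       # assumed overall yearly risk of the disease
exposure_prevalence = 0.40     # assumed share of the population exposed
relative_risk = 1.10           # assumed "small" causal relative risk

# Levin's formula: PAF = p(RR - 1) / (1 + p(RR - 1))
excess = exposure_prevalence * (relative_risk - 1)
paf = excess / (1 + excess)

total_cases = population * annual_incidence
attributable_cases = total_cases * paf

print(f"Population attributable fraction: {paf:.1%}")
print(f"Cases per year: {total_cases:,.0f}")
print(f"Cases per year attributable to exposure: {attributable_cases:,.0f}")
```

Under these assumed inputs a relative risk of 1.10 accounts for a few percent of cases, which is several hundred cases a year in a population this size. Whether that justifies action still hinges on whether the relative risk can be believed in the first place.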
I think Darren has a background in nutritional epidemiology (poor soul) so let’s take a topical example from that world that keeps floating up periodically: does bacon cause cancer?
(disclaimer: I love bacon and probably would not stop eating it no matter what it supposedly causes, but may reduce consumption to “occasional weekend brunch treat” if I was sufficiently convinced that it significantly increased my absolute risk of cancer)
There seem to be dozens upon dozens of studies of varying quality (mostly poor) that in some way look at this question. The most recent one I can remember off the top of my head that drew some publicity was a study that said processed meat increased the risk of a certain type of cancer (bowel?) by 18% (on the relative scale)…but that increases the absolute risk over the lifetime from something like 5 percent to 6 percent.
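For anyone who wants to see the relative-versus-absolute arithmetic spelled out, here is a quick sketch. The 5% baseline and 18% relative increase are just the rough figures recalled above, not values checked against the original study, and the "number needed to harm" line is my own addition for illustration.

```python
# Rough figures recalled in the post above; treat them as approximate.
baseline_lifetime_risk = 0.05   # about 5% lifetime risk without the exposure
relative_increase = 0.18        # 18% increase on the relative scale

exposed_lifetime_risk = baseline_lifetime_risk * (1 + relative_increase)
risk_difference = exposed_lifetime_risk - baseline_lifetime_risk

# Number needed to harm: how many exposed people per one extra lifetime case,
# assuming the association is causal (a big assumption here).
number_needed_to_harm = 1 / risk_difference

print(f"Exposed lifetime risk: {exposed_lifetime_risk:.1%}")
print(f"Absolute risk difference: {risk_difference:.1%}")
print(f"Number needed to harm: {number_needed_to_harm:.0f}")
```

So the headline 18% corresponds to a bit under one percentage point of absolute lifetime risk, or roughly one extra case per hundred or so people exposed over a lifetime, if the effect is real.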
Now, pretend that I’m a politician hunting votes from the vegan crowd, and I decide that I’m going to latch onto this study and pick a fight against BIG PORK by citing that processed meat (including bacon) causes cancer; in my next action as a public official, I’ll introduce a bill that levies a tax on all processed meat products since it is a known carcinogen.
As Darren alludes, though, the effect is modest on the absolute scale and the study questionable enough that opponents can poke lots of holes in it. Nobody fills out food frequency questionnaires well, the study was not well controlled, we can’t account for all the other confounding factors, etc. Is this study really strong enough evidence to warrant a public health action in response?
Anyway, I suspect that’s what Darren is getting at here:
Serious question: what is the point of anyone continuing to study whether bacon causes cancer? Does anyone believe at this point that this has “practical importance from an intervention/policy perspective” as Darren puts it?
Excellent discussion. To deal with big picture issues, we are stuck because of at least two things:
High-dimensional epidemiology, especially genetic epidemiology, has resulted in over-estimation of effects such as odds ratios to such a degree that scholars such as Ioannidis and Ransohoff have advocated that large odds ratios be instantly disbelieved.
Small effects are easily biased unless the estimand was pre-specified and the exposure was randomized. In general, one could say that the smaller the effect the better the quality of research required to publish it.
Apart from some of the well-known issues with memory-based questionnaires, multiplicity and model selection, and residual confounding, I think this is also a field that’s filled with heavy bias. And I don’t mean systematic errors, but rather serious cognitive biases on the part of researchers who study and chase after these small effects.
Unlike other fields of epidemiology, such as pharmacoepidemiology or environmental epidemiology, where it seems very unlikely to me that researchers have a very intimate and everyday connection with the things they’re studying (for example second-hand tobacco smoke), food consumption is universal, and everyone has a belief of what constitutes a healthy diet. Especially nutrition researchers. And I think because the effects are so small, mixed in with noise, and because there’s so much flexibility with estimation methods, I think there’s a lot of room for producing results that are simply nonexistent.
So I think nutritional epidemiology is a very special case, and it’s also a field that faces several challenges given how diets are linked to so many things.
Part of our view of epidemiology is also skewed by well-to-do OECD populations.
There is a need to address a lot of problems in marginalized populations (refugees, developing nations, etc.), where the desire for interventions to avoid harmful outcomes is tempered by uncertainty over either identifying the problems or determining which options work.
I would argue that in these settings, even small effects are helpful as they provide more informed decision making.
I think the concern is that much of epidemiology is an answer in search of a problem, rather than a problem in search of an answer.
Informed decision making is just making decisions with the information at hand. So for me, if a patient wants to try an intervention, I can provide existing data and they can decide for themselves if they think it is of value (i.e. the decision maker decides if the small effect is meaningful to them). In settings of rationing, it helps prioritize resources.
In the context of development economics (which is way, way outside my area of expertise), I think there are examples where it makes sense to study small effects which may lead to positive/negative population-level changes, e.g. is it feasible and helpful to ensure food fortification for refugee populations while giving humanitarian aid?
Thanks for responding. I was reading a review of a new book by a physician Seamus O’Mahony.
I would be interested in what you, or anyone else, thinks about the themes in this review. It reiterates themes that John Ioannidis has raised in several of his presentations to medical colleges. Some scientists I knew in Boston had voiced some of the same concerns just when the Evidence Based Medicine movement began to take hold. To me there seem to be degrees of cognitive dissonance exerted by different subsets within different communities, which will continue unless we concede that our measurement tools only exacerbate the state of medical diagnoses and prognoses.
Raj, good call. I plan to read the entire book. My strong hunch has been that evaluating the claims in that book and similar books will open up new research vistas, as they are written by physicians and journalists. Rigor Mortis by Richard Harris was another that received acclaim. It mentions the work of John Ioannidis in particular. I read it.
The challenge with the hunt for small effect sizes is that it incorporates so many of the known biases within the production of research.
Well crafted and thoughtful observational studies can potentially identify “trustworthy” small effect sizes. Obviously, this requires building on previous knowledge. I’m not familiar with the environmental tobacco exposure, but your pretest probability for this would obviously be reassuring given the mountain of literature on similar topics.
Within clinical medicine, much of our research comes from flawed data sources (claims/EHRs) that pose measurement and missing-data challenges. When combined with healthy (and unhealthy) user biases, the challenge of hunting down small effect sizes necessitates smart design and exploitation of natural experiments.
Apologies for posting as a family physician (non-statistician, non-epidemiologist). I appreciate the discussion on this site and find lots of information that helps me with critical appraisal of medical research.
It won’t come as any surprise that many physicians are inherently skeptical of studies showing small effects. The concern is that studies showing small effects just don’t seem to have a sufficiently strong track record of reliability that we will trust them to inform most clinical decisions. While some of these small effects may be “real,” we need to be able to trust research findings more than this before we can allow them to influence patient treatment. I appreciate that there are newer epi techniques that are supposed to minimize bias and estimate “worst case scenarios” with regard to residual confounding, but I’m not hopeful that the inherent distrust of observational studies that has built up among clinicians over the years will ever change. Which is why it’s frustrating to see multiple observational studies on the same clinical questions, when study after study…after study… shows a small effect. Surely there are more pressing health care needs to address…
But I do think that studies showing small effects can and maybe should influence practice in situations where there is a weak potential signal of harm from a treatment that is widely used in spite of a lack of high quality RCT evidence of efficacy. A great example from the past few years is the controversy that arose over testosterone treatment of “age-related hypogonadism” in men. A few observational studies suggested a weak harm signal and these studies were taken quite seriously by regulatory agencies and clinicians because of the lack of prior RCT evidence of efficacy for testosterone supplementation.
Good observations. I think that in pharmacoepidemiology where one examines possible harm of drugs on the basis of an outcome that is a completely unintended consequence there is a bit less of a problem of confounding by indication, hence we can place a bit more trust in the results compared to observational efficacy analyses.
A catch-22 in this whole discussion is that one cannot trust “overly big” effects in epidemiology, because we’ve been misled too many times when confounding bias turned out to be extreme.
Thanks for the feedback. The point about potential suboptimal credibility of big effects is well taken (though “winner’s curse” scenarios often seem to be detected through failure to replicate). But these days, the vast majority of published observational studies seem to show relatively small effects.
There seems to be consensus that most of the “true” effects left to be found will be small and replication seems to have taken hold as a preferred method to convince readers of the credibility of small effects. But is replicability/consistency necessarily a strong indicator of credibility for a small effect (even if different methods/populations are studied)? Could replication of a small effect confer undue credibility on an association that is actually just consistently spurious/confounded?
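A toy simulation may help make the last question concrete. This is only a sketch under assumptions I made up: a single binary confounder raises both the chance of exposure and the risk of the outcome, the exposure itself has no effect at all, and each "study" is a fresh cohort drawn from that same structure. None of the probabilities come from real data.

```python
import random

def simulate_study(n, rng):
    """Simulate one cohort; return counts[u][x] = [cases, people]."""
    counts = {u: {x: [0, 0] for x in (0, 1)} for u in (0, 1)}
    for _ in range(n):
        u = 1 if rng.random() < 0.5 else 0               # hidden confounder
        x = 1 if rng.random() < 0.3 + 0.2 * u else 0     # exposure depends on u
        y = 1 if rng.random() < 0.05 + 0.03 * u else 0   # outcome depends on u only
        counts[u][x][0] += y
        counts[u][x][1] += 1
    return counts

def crude_risk_ratio(counts):
    """Risk ratio ignoring the confounder."""
    cases1 = sum(counts[u][1][0] for u in (0, 1))
    n1 = sum(counts[u][1][1] for u in (0, 1))
    cases0 = sum(counts[u][0][0] for u in (0, 1))
    n0 = sum(counts[u][0][1] for u in (0, 1))
    return (cases1 / n1) / (cases0 / n0)

def mantel_haenszel_risk_ratio(counts):
    """Risk ratio stratified on the confounder (Mantel-Haenszel)."""
    num = den = 0.0
    for u in (0, 1):
        a, n1 = counts[u][1]   # exposed cases, exposed total
        c, n0 = counts[u][0]   # unexposed cases, unexposed total
        total = n1 + n0
        num += a * n0 / total
        den += c * n1 / total
    return num / den

rng = random.Random(2024)  # fixed seed so the run is repeatable
results = []
for study in range(5):
    counts = simulate_study(200_000, rng)
    results.append((crude_risk_ratio(counts), mantel_haenszel_risk_ratio(counts)))

for i, (crude, adjusted) in enumerate(results, start=1):
    print(f"Study {i}: crude RR = {crude:.2f}, adjusted RR = {adjusted:.2f}")
```

In this setup the crude risk ratio lands near 1.10 in every cohort even though the true effect is null, and it only moves toward 1 when the confounder is measured and stratified on. Studies that share the same unmeasured confounding will "replicate" each other just as faithfully as studies of a real effect, so consistency alone cannot tell the two apart.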
Arriving at some type of consensus on this question seems critical. If researchers feel that replication of small effects will improve the credibility and actionability of their findings, they will continue to study the same questions over and over again. But if the people in positions to act on those results don’t agree that replicability confers credibility on small effects, then we need to take a long hard look at how we invest our research dollars.