Let's talk: The Statistician - non-statistician (mis)communication

Dear Data Method Community,

Communication has never been easy, but it is key to any collaboration, be it long or short term.

In medicine we are taught ways to communicate with patients. But communication is a two-way process, and unfortunately patients are not taught how to communicate with doctors, so over the course of a doctor’s practice, they learn the art of decoding what the patient might be trying to say. At times one can verify a diagnostic hypothesis with medical tests, or look for signs and symptoms that are independent of the patient’s ability to articulate what they think is wrong with them. You get the point. All this is largely because clinicians and patients don’t share a common vocabulary and/or, thanks to Google, a common understanding of medical vocabulary. You have probably all heard of the phenomenon where patients google their symptoms until they self-diagnose, and at times, hilariously, they reduce a clinician’s role to writing their prescription, for they will also have googled which drug to take. The clinician’s role then expands to de-diagnosing the patient (if the patient got it wrong) before properly diagnosing them.

Similarly, statisticians and (most, myself included) non-statisticians don’t share a common vocabulary, or at times the same understanding of the same vocabulary, so I can imagine that the “statistician - non-statistician communication” within a collaboration is not easy either. A non-statistician might approach a statistician with a “simple” question that turns out after discussion to be not so simple, or a statistician gives suggestions which are later implemented in ways that make it obvious (to the statistician at least) that he/she was completely misunderstood.

Thus I am interested to learn about:

  1. As a statistician, what has been your experience?

  2. As a non-statistician, what has been your experience?

  3. From your experience, what have you done (or what are some ways) to improve this communication?



Great points Nelly. I expect there are lots of similarities between the doctor-patient relationship and the physician-statistician collaborative relationship. Just as patients are often quite willing to be “de-diagnosed” (great term) by having misunderstandings of medical information directly addressed, so too must physicians be willing to admit they don’t have all the answers when it comes to stats. Although it can be hard for physicians to relinquish control, sometimes we just need to let other people drive.


I have a pretty long list of issues that (as a statistician) have at one point or another led to confusion, whether because it was unclear what some terminology means or how it should be interpreted, or because of the previous education (or miseducation) of stakeholders:

  • (Statistically) significant with p <= 0.05 is not the same thing as something being true beyond all doubt, never to be questioned again
  • (Statistically) significant with p <= 0.05 does not mean that a new identical study has a 95% probability (or is certain) to replicate an effect estimate of this size or larger (or to have p <= 0.05 again; or various other variations on this)
  • (Statistically) significant is not the same thing as clinically meaningful
  • a patient being a “responder” defined by some continuous outcome being dichotomized to be above/below some level does not mean that a patient benefitted from the treatment (= that they had a better outcome than they would have had without treatment)
  • X% of patients being “responders” (in the sense defined above) does not mean that 100-X% of patients did not benefit from treatment
  • 95% confidence intervals for means do not cover 95% of the data distribution
  • 95% confidence intervals for means do not contain the true mean with 95% probability, nor does a p-value <= 0.05 mean that we are 95% sure that the alternative hypothesis is true
  • if you have 90% power in a study given a true difference of X between groups A and B, observed differences smaller than X may still be statistically significant, and there is not a 90% probability that your study will result in a difference of exactly X (this only gets worse if statisticians use sloppy language like “90% power to detect a difference of X”)
  • a p-value >= 0.05 does not mean that we are (95%) sure that there is no difference between groups
  • a non-zero within-group change from baseline (even if you do a hypothesis test for that and p<<0.05 - Why, oh why, do statisticians ever calculate these p-values???) does not mean that a treatment works
  • the above point does not change if you decide to not have a control group
  • the (observed) treatment difference in one clinical trial is not the one true underlying treatment difference in this type of patients or every single type of patient
  • stratification of randomization does not automatically mean that you recruit patients in each stratum in equal numbers
  • if you look at 30 subgroups in a 100-patient study for 10 different outcomes, you may not want to take the point estimates at face value, and the p-values & confidence intervals do not have the same interpretation as if you had pre-specified the best subgroup as your single analysis (always a great discussion to have with clinicians that have a copy of GraphPad Prism…)
  • hardly anyone understands the Hodges-Lehmann estimate (and geometric means, ratios of geometric means, rate ratios, hazard ratios, risk ratios and odds ratios are commonly misunderstood in many ways)

I can probably continue this list endlessly.
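A quick simulation often helps in these conversations, especially for the confidence-interval points above. The sketch below (Python, with hypothetical numbers chosen purely for illustration: true mean 100, SD 15, n = 50, normal-approximation intervals) repeatedly draws samples and checks (a) how often the 95% CI for the mean captures the true mean across repeated studies — the long-run coverage that the “95%” actually refers to — and (b) what share of the individual data points falls inside that interval:

```python
import random
import statistics

random.seed(42)

# Hypothetical numbers for illustration only:
TRUE_MEAN, SD, N, TRIALS = 100.0, 15.0, 50, 2000
Z = 1.96  # normal-approximation 95% interval

ci_contains_true_mean = 0
data_coverage = []

for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, SD) for _ in range(N)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / N ** 0.5
    lo, hi = m - Z * se, m + Z * se
    # (a) long-run coverage: does this interval capture the true mean?
    ci_contains_true_mean += lo <= TRUE_MEAN <= hi
    # (b) data coverage: what share of this sample lies inside the interval?
    data_coverage.append(sum(lo <= x <= hi for x in sample) / N)

print(f"Intervals capturing the true mean: {ci_contains_true_mean / TRIALS:.0%}")
print(f"Average share of data inside the interval: {statistics.mean(data_coverage):.0%}")
```

With these numbers the first figure comes out near 95% (slightly below, since a t-interval would be more exact at n = 50), while the second is far smaller, because the CI for a mean narrows as 1/sqrt(n) while the spread of individual values does not.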

I came up with fewer examples of physician language being completely misunderstood by statisticians (one example: MedDRA system organ class does not mean primary system organ class). I guess I’m not sufficiently aware of my own blind spots…


I am now retired but used to work as a statistician in a medical school. I often said I was the statistician counterpart of a GP.
My original training was in maths - more ‘pure’ than conventional ‘applied’, which used to mean essentially maths for physics and engineering. In my Cambridge degree, I was taught by several men who, I later learned, had been at Bletchley Park. Some of my colleagues were involved in cracking Fermat’s Last Theorem, including John Horton Conway, who sadly died recently of Covid-19.
The pure mathematician Caucher Birkar wrote, ‘Mathematics is not designed to be described in words. It is designed to be described in mathematics.’ Be that as it may for pure maths, it is the EXACT OPPOSITE of the situation that applied throughout my work. Communication was essential. Whenever I analyse a set of data, I reckon I spend at least as long on the write-up as on the programming.


In the past few weeks, I’ve found it helpful to come to Data Methods and search the archives for threads that often provide useful context to a question I have regarding statistical reasoning.

I hope the following thread helps clinical researchers and statisticians understand one another better. I’ve found Sander Greenland’s posts and peer reviewed papers enormously helpful.

Physicist Richard Feynman had a wonderful discussion on the relationship between mathematics and physics. I think substituting “statistician” for “mathematician” and “clinician” for “physicist” is a perfect metaphor for the communication problems described here.