Language for communicating frequentist results about treatment effects

I believe @Sander raises some important points, especially this,

The common way to mention p is to discuss the assumption of the null hypothesis being true, but few definitions mention that every model assumption used to calculate p must be correct, including assumptions about randomization (assignment, sampling), chance alone, database errors, etc.

and also this point,

I don’t believe abandoning p values is really helpful. Sure, they are easy to misinterpret, but that doesn’t mean we should abandon them. Perhaps instead, we can encourage the following guidelines:

  • thinking of them as continuous measures of compatibility between the data and the model used to compute them. Larger p = higher compatibility with the model, smaller p = less compatibility with the model

  • converting them into S values, S = -log2(p), to find how many bits of information against the test hypothesis are embedded in the test statistic computed from the model

  • calculating the p value for the alternative hypothesis, and the S value for that too
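
To make the last two guidelines concrete, here is a minimal sketch in Python. It assumes a normal (Wald) approximation for the test statistic, which is my assumption and not something the guidelines require. The function names `two_sided_p` and `s_value`, and the estimate, standard error, and hypothesized effects, are all hypothetical and chosen only for illustration.

```python
import math


def two_sided_p(estimate, se, hypothesized):
    """Two-sided p value for a hypothesized effect, assuming the
    estimate is approximately normal with standard error `se`."""
    z = (estimate - hypothesized) / se
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def s_value(p):
    """S value (surprisal) in bits: S = -log2(p)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


# Hypothetical numbers, for illustration only.
estimate, se = 1.2, 0.5
null_effect = 0.0         # test (null) hypothesis
alternative_effect = 2.0  # a prespecified alternative hypothesis

for label, effect in [("null", null_effect), ("alternative", alternative_effect)]:
    p = two_sided_p(estimate, se, effect)
    print(f"{label}: p = {p:.3f}, S = {s_value(p):.2f} bits")
```

Under this reading, p = 0.05 corresponds to about 4.3 bits of information against the tested hypothesis, and p = 0.5 to exactly 1 bit. As in the quote above, both numbers are conditional on every assumption in the model used to compute them, not just on the tested hypothesis.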
