Language for communicating frequentist results about treatment effects

What do you think about limiting the use of the term “evidence” strictly to discussion of the relative fit of two models to data (as is done via the likelihood ratio), and referring to the information contained in the p-value as “surprisal,” as @sander recommends? I’m beginning to think that would clear up much confusion. If I had to describe results with high surprisals, I’d use the language “sufficiently surprising if the assumed model is true.”

AFAICT, the surprisal is inversely associated with the likelihood ratio of an asserted test hypothesis H_a to its complement H_{\neg a} in the simple case, in the sense that a high surprisal (low p) indicates that there (probably) exists a better model (a model with higher likelihood) for the current data.
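To make that relationship concrete, here is a minimal sketch under assumptions that are mine, not part of the post: a two-sided z-test of a normal mean with known variance, where the "better model" is taken to be the single best-fitting mean (the MLE) rather than the full composite complement. The function names are hypothetical. Note that the ratio below puts the best-fitting alternative in the numerator, so it grows as p shrinks; the ratio of H_a to that alternative is its reciprocal.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a z statistic under the test hypothesis."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal.
    return math.erfc(abs(z) / math.sqrt(2))

def max_likelihood_ratio(z):
    """Likelihood at the best-fitting mean divided by likelihood at the tested mean.

    Assumes a normal model with known variance, where this ratio is exp(z^2 / 2).
    """
    return math.exp(z * z / 2)

# As |z| grows, p falls and the best alternative fits increasingly better.
for z in (0.5, 1.0, 1.96, 3.0):
    print(f"z={z:4.2f}  p={two_sided_p(z):.4f}  max LR={max_likelihood_ratio(z):.2f}")
```

Because the numerator is chosen after seeing the data, this ratio is an upper bound on how much better any single mean can fit, not a measure of support for a pre-specified alternative.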

The term “surprisal” works when p-values are scaled to units of information relative to the assumed model, but I struggle to come up with a term when converting them to units that assume the alternative (i.e., \Phi(p)).
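For reference, the scaling to units of information mentioned above is usually taken to be the base-2 Shannon transform, s = -log2(p), measured in bits. A minimal sketch, with a hypothetical function name:

```python
import math

def surprisal_bits(p):
    """Convert a p-value to bits of information against the assumed model."""
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)

# Smaller p-values carry more bits of surprisal; p = 1 carries none.
for p in (1.0, 0.5, 0.05, 0.005):
    print(f"p={p:<6}  s={surprisal_bits(p):.2f} bits")
```

On this scale p = 0.05 is a little over 4 bits, about as surprising as seeing four heads in a row from a coin assumed to be fair.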
