I believe @Sander raises some important points, especially this,
The common way to mention p is to discuss the assumption of the null hypothesis being true, but few definitions mention that every model assumption used to calculate p must be correct, including assumptions about randomization (assignment, sampling), chance alone, database errors, etc.
and also this point,
I don’t believe abandoning p values is really helpful. Sure, they are easy to misinterpret, but that doesn’t mean we should abandon them. Perhaps instead, we can encourage the following guidelines:
- thinking of them as continuous measures of compatibility between the data and the model used to compute them: larger p = higher compatibility with the model, smaller p = less compatibility with the model
- converting them into S values to see how much information against the test hypothesis is embedded in the test statistic computed from the model
- calculating the p value for the alternative hypothesis, and the S value for that too
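To make these guidelines concrete, here is a minimal sketch in Python, assuming a two-sided z-test as the example setting. The S value is the Shannon surprisal, S = -log2(p), which measures bits of information against the tested hypothesis; the "alternative" computation simply recenters the statistic at a hypothetical alternative effect (the value 3.0 below is an illustrative assumption, not anything from the discussion above):

```python
from math import log2
from statistics import NormalDist

def p_value(z: float) -> float:
    """Two-sided p-value for a z statistic under a standard normal model."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def s_value(p: float) -> float:
    """Shannon surprisal (S value): bits of information against the tested hypothesis."""
    return -log2(p)

# p and S for the test (null) hypothesis, with an observed z of 2.0
p_null = p_value(2.0)     # ≈ 0.0455
s_null = s_value(p_null)  # ≈ 4.5 bits

# Same data judged against an alternative hypothesis whose expected z is 3.0
# (hypothetical value): recenter the statistic before computing p.
p_alt = p_value(2.0 - 3.0)  # ≈ 0.317
s_alt = s_value(p_alt)      # ≈ 1.7 bits

print(p_null, s_null, p_alt, s_alt)
```

Read this way, p ≈ 0.05 carries only about 4.3 bits of information against a hypothesis, roughly as surprising as four heads in a row from a fair coin, which helps temper overinterpretation of any single threshold.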