Hope I’m answering your question. If you do a standard (frequentist) statistical test, you will get a p-value. This indicates how extreme your data are. A p-value of 0.04, for instance, means: “If we assume that the null hypothesis of no effect is true, then your data are quite extreme. If we repeated your sampling many times, only 4% of the samples would give results at least as extreme as yours.”
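To make that concrete, here is a minimal sketch in Python. The data are made up for illustration (15 heads in 20 coin tosses, with "the coin is fair" as the null hypothesis), and the function names are my own. Instead of literally repeating the sampling, it adds up the exact probabilities of every outcome that is at least as unlikely as the observed one, which is what the repeated sampling would converge to.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(observed: int, n: int, p_null: float = 0.5) -> float:
    """Sum the null probabilities of all outcomes no more likely than the observed one."""
    p_obs = binomial_pmf(observed, n, p_null)
    # Small relative tolerance so outcomes tied with the observed one are included.
    return sum(
        binomial_pmf(k, n, p_null)
        for k in range(n + 1)
        if binomial_pmf(k, n, p_null) <= p_obs * (1 + 1e-9)
    )


# Hypothetical data: 15 heads in 20 tosses; null hypothesis: the coin is fair.
p_value = two_sided_p_value(observed=15, n=20)
print(f"p-value = {p_value:.3f}")  # about 0.041
```

Note that everything here is computed under the assumption that the null hypothesis is true. The number describes the data, not the hypothesis.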
Extreme data like this may be the result of wrong measurements with faulty instruments. They may also be caused by something else. But if we exclude those, we are left with two explanations:
- either this is a coincidence. Extreme events do happen. People throw a die and roll a series of ten 6’s. It happens.
- or your null hypothesis of no effect is not true.
Usually we have some threshold, such as 0.05 (or 5%). We say: “Hey, these data are so extreme that I don’t believe this is coincidence, I reject the null hypothesis of no effect. I know that I may be wrong, that the null hypothesis is true and that I reject it wrongly, but I am willing to accept that Type I error”.
We can rephrase the two explanations as follows:
- I have extreme data and the null hypothesis is still true, or:
- I have extreme data and the null hypothesis is not true
Or even shorter: “given these results, the null hypothesis may be true or not”. That’s all that frequentist statistics tells us. The reason is that frequentist statistics makes statements about the data (“these data are extreme, assuming H0 is true”) and not about the hypotheses. If you want to know: “given these data, how likely is it that H0 is true?” then you need Bayesian statistics. Frequentist statistics can’t answer that one.
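For comparison, here is a rough sketch of what the Bayesian question needs in addition to the data. All numbers are assumptions for illustration: the data are 15 heads in 20 coin tosses, H0 says the coin is fair, the alternative is simplified to a single value (heads probability 0.75), and you have to supply a prior probability for H0 yourself.

```python
from math import comb


def likelihood(heads: int, tosses: int, p_heads: float) -> float:
    """Probability of the observed number of heads for a given heads probability."""
    return comb(tosses, heads) * p_heads**heads * (1 - p_heads) ** (tosses - heads)


def posterior_h0(heads: int, tosses: int, prior_h0: float, p_alt: float) -> float:
    """Bayes' theorem for two competing hypotheses: H0 (fair coin) vs. a point alternative."""
    weight_h0 = prior_h0 * likelihood(heads, tosses, 0.5)
    weight_h1 = (1 - prior_h0) * likelihood(heads, tosses, p_alt)
    return weight_h0 / (weight_h0 + weight_h1)


# Hypothetical data and assumptions: 15 heads in 20 tosses, alternative p = 0.75.
even_prior = posterior_h0(15, 20, prior_h0=0.5, p_alt=0.75)
sceptical_prior = posterior_h0(15, 20, prior_h0=0.9, p_alt=0.75)
print(f"P(H0 | data), prior 0.5: {even_prior:.3f}")       # roughly 0.07
print(f"P(H0 | data), prior 0.9: {sceptical_prior:.3f}")  # roughly 0.40
```

The same data give very different answers depending on the prior and on the alternative you pick, which is exactly the extra input the frequentist p-value does not use.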
Example: a group of patients responds well to a certain medicine. Does this mean this medicine will be effective in the population? Statistical analysis shows that IF the medicine would be ineffective, these results would be in the extreme 2%. Now your reasoning is:
- either this is coincidence, my patients improved spontaneously. Things like that happen. It is extreme, but it does happen
- or: my null hypothesis of no effect is not true, and therefore I must conclude that this medicine IS effective
If your colleague asks: so, does the medicine work? You can’t answer that. All you can say is: “given the results, the medicine may work or not”.
If she asks: “How sure are you? What is your false positive rate?” then you can’t answer that one. “Either it works, or it doesn’t”.
A false positive rate (as used in daily life) would be: “of every 100 medicines claimed to be effective, only 75 really work. We have 25 cases of false positives.”
This is different from the Type I error rate, which says: “Assuming the medicine does not work, out of every 100 times we test it, we wrongly conclude about 5 times that it DOES work.”
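A small arithmetic sketch shows how the two numbers can differ. All inputs are hypothetical: 700 candidate medicines of which 100 truly work, a 5% threshold, and an assumed 90% chance of detecting a medicine that really works (the power of the test).

```python
def discovery_breakdown(n_effective: int, n_ineffective: int,
                        alpha: float, power: float) -> dict:
    """Expected counts of true and false 'it works' claims across many tested medicines."""
    true_positives = n_effective * power        # effective and declared effective
    false_positives = n_ineffective * alpha     # ineffective but declared effective
    claimed = true_positives + false_positives  # everything declared effective
    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "false_share_of_claims": false_positives / claimed,
    }


# Hypothetical scenario: 100 effective and 600 ineffective medicines are tested.
result = discovery_breakdown(n_effective=100, n_ineffective=600, alpha=0.05, power=0.9)
print(f"Type I error rate: {0.05:.0%}")
print(f"False share among claimed discoveries: {result['false_share_of_claims']:.0%}")
```

With these made-up inputs, a 5% Type I error rate goes together with about 25% false positives among the claimed discoveries. Change the share of medicines that truly work and the second number changes, while the first stays at 5%.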
Hope this helps…if I’m stating the obvious, please say so and I will gladly remove this post. 