At Sander’s suggestion, I’m posting my list of guidelines for young applied statisticians.
Cheers,
Philip
- Consider the underlying science. The interesting scientific questions are not always questions statistics can answer.
- Think about where the data come from and how they happened to become your sample.
- Think before you calculate. Will the answer mean anything? What?
- The data, the formula, and the algorithm all can be right, and the answer still can be wrong: Assumptions matter.
- Enumerate the assumptions. Check those you can; flag those you can’t. Which are plausible? Which are plainly false? How much might it matter?
- Why is what you did the right thing to have done?
- A statistician’s most powerful tool is randomness—real, not supposed.
- Beware hypothesis tests and confidence intervals in situations with no real randomness.
- Errors never have a normal distribution. The consequence of pretending that they do depends on the situation, the science, and the goal.
- Worry about systematic error. Systematically.
- There’s always a bug, even after you find the last bug.
- Association is not necessarily causation, even if it’s Really Strong association (see the first sketch after this list).
- Significance is not importance. Insignificance is not unimportance. (Here’s a lay explanation of p-values.) The second sketch after this list shows how the two can come apart.
- Life is full of Type III errors: right answers to the wrong questions.
- Order of operations: Get it right. Then get it published.
- The most important work is often neither the hardest nor the most technically interesting, but it requires the most patience: a technical tour de force is typically less useful than curiosity, skeptical persistence, and shoe leather.
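
First, the sketch on association and causation: a minimal Python/NumPy simulation, with made-up variables and effect sizes chosen purely for illustration, in which a hypothetical confounder z drives both x and y, so the two are strongly correlated even though neither causes the other.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical confounder: z drives both x and y.
z = rng.normal(size=n)
x = z + 0.3 * rng.normal(size=n)   # x has no effect on y
y = z + 0.3 * rng.normal(size=n)   # y has no effect on x

# Really Strong association (correlation roughly 0.9), no causation either way.
print(np.corrcoef(x, y)[0, 1])
```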
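Second, the sketch on significance and importance: another minimal simulation with arbitrary effect sizes and sample sizes (it assumes NumPy and SciPy are available). A practically negligible effect in a huge sample produces a tiny p-value, while a practically large effect in a tiny sample often fails to reach the conventional 0.05 threshold.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Negligible true effect (mean 0.01), huge sample: "significant" but unimportant.
negligible = rng.normal(loc=0.01, scale=1.0, size=200_000)

# Large true effect (mean 1.0), tiny sample: important but often "insignificant".
large = rng.normal(loc=1.0, scale=1.0, size=5)

print(stats.ttest_1samp(negligible, popmean=0.0).pvalue)  # typically well below 0.05
print(stats.ttest_1samp(large, popmean=0.0).pvalue)       # may well exceed 0.05
```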