Principles and guidelines for applied statistics

At Sander’s suggestion, I’m posting my list of guidelines for young applied statisticians.

Cheers,
Philip

  • Consider the underlying science. The interesting scientific questions are not always questions statistics can answer.
  • Think about where the data come from and how they happened to become your sample.
  • Think before you calculate. Will the answer mean anything? What?
  • The data, the formula, and the algorithm all can be right, and the answer still can be wrong: Assumptions matter.
  • Enumerate the assumptions. Check those you can; flag those you can’t. Which are plausible? Which are plainly false? How much might it matter?
  • Why is what you did the right thing to have done?
  • A statistician’s most powerful tool is randomness—real, not supposed.
  • Beware hypothesis tests and confidence intervals in situations with no real randomness.
  • Errors never have a normal distribution. The consequence of pretending that they do depends on the situation, the science, and the goal.
  • Worry about systematic error. Systematically.
  • There’s always a bug, even after you find the last bug.
  • Association is not necessarily causation, even if it’s Really Strong association.
  • Significance is not importance. Insignificance is not unimportance. (Here’s a lay explanation of p-values.)
  • Life is full of Type III errors: right answers to the wrong questions.
  • Order of operations: Get it right. Then get it published.
  • The most important work is often neither the hardest nor the most interesting technically, but it requires the most patience: a technical tour de force is typically less useful than curiosity, skeptical persistence, and shoe leather.
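The point that association is not causation can be made concrete with a quick simulation. This is a minimal sketch (my own illustration, not from the original post): a confounder Z drives both X and Y, X has no causal effect on Y, yet the two are strongly correlated.

```python
import random
from statistics import mean

random.seed(0)

# Confounder Z drives both X and Y; X has no causal effect on Y.
z = [random.gauss(0, 1) for _ in range(10_000)]
x = [zi + random.gauss(0, 0.3) for zi in z]
y = [zi + random.gauss(0, 0.3) for zi in z]

def corr(a, b):
    """Pearson correlation coefficient, computed from scratch."""
    ma, mb = mean(a), mean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / (va * vb) ** 0.5

r = corr(x, y)  # really strong association, zero causation
```

Here r comes out around 0.9; intervening on X would still do nothing to Y, because the association flows entirely through Z.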
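Likewise, "significance is not importance" is easy to demonstrate numerically. A minimal sketch (again my own, assuming a simple one-sample z-test of a zero mean): a practically meaningless effect with a huge sample is wildly "significant," while a practically large effect with a small sample is not.

```python
from math import erfc, sqrt

def z_test_p(effect_sd, n):
    """Two-sided p-value for a one-sample z-test of mean zero,
    given an observed mean of `effect_sd` standard deviations
    and sample size n."""
    z = effect_sd * sqrt(n)
    return erfc(abs(z) / sqrt(2))

# Tiny effect (1% of an SD), enormous sample: p is astronomically small.
p_tiny_effect = z_test_p(0.01, 1_000_000)

# Large effect (half an SD), small sample: p > 0.05, "insignificant."
p_big_effect = z_test_p(0.5, 10)
```

The first p-value would clear any conventional threshold even though the effect may be too small to matter; the second fails the threshold even though the effect, if real, is large.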