Step-by-step topics to "start" in statistics

I’m a medical doctor and researcher working in clinical settings/universities.

Medical/biomedical students often ask me for recommendations to “really” start studying statistics. Some of them are interested in pursuing a post-graduation career in statistics and applied data analysis.
These students have little experience in linear algebra and almost no knowledge of calculus and programming.

A few years ago I would have recommended an “introductory” statistics book. I now feel that most non-mathematical books focus heavily on hypothesis testing and are not a great place to start. For instance, most of these books “end” with linear regression models after chapters and chapters of parametric and non-parametric tests.

Assuming that these students have access to any resource (e.g. they could take mathematics courses/semesters at the university), what would be the recommended courses/topics/resources to begin?

Thank you in advance.


To emphasize the essentials of applied stats and not get misled by the literature (what is in the literature is often not optimal), there are 4 books I recommend, because I’ve learned a lot from them.

  1. Biostatistics for Biomedical Research (link) by Frank Harrell – A good complement to any intro level textbook.
  2. Resampling Methods by Philip I. Good. A great applied text on modern frequentist statistics that substitutes computation for mathematical theory. There is a trend in intro statistics education to emphasize simulation methods, and link them to the classical parametric models to improve intuition.
  3. Permutation, Parametric, and Bootstrap Tests of Hypotheses by Philip I. Good. This is a more theory intensive text that complements Resampling Methods.
  4. Regression Modeling Strategies by Frank Harrell – hard to go wrong here, as he emphasizes semi-parametric models, which are very flexible as a general rule. The big advantage is that he has filtered out the large number of techniques (which have value in particular contexts) in favor of an approach that is reasonable in almost any context. No one who understands statistics could fault you for properly using his approach. This might be tough to tackle as a first text, but having studied 1 or 2 first will leave the student well prepared.
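
To make the "computation instead of mathematical theory" idea behind the resampling books concrete, here is a minimal sketch of a two-sided permutation test for a difference in means. It is my own illustration, not code taken from any of the books above; the function name `permutation_test`, its parameters, and the example measurements are all hypothetical.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.

    Returns the observed difference (mean of A minus mean of B) and an
    estimated p-value. All names here are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)

    def mean(values):
        return sum(values) / len(values)

    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)

    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are exchangeable,
        # so reshuffle the pooled data and split it into two new groups.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Small tolerance guards against floating-point rounding.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    # Add 1 to numerator and denominator so the estimate is never exactly 0.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    treated = [10, 11, 12, 13, 14, 15]
    control = [1, 2, 3, 4, 5, 6]
    diff, p = permutation_test(treated, control)
    print(f"observed difference = {diff:.2f}, permutation p-value = {p:.4f}")
```

The point for a beginner is that the reference distribution is built by reshuffling the group labels rather than by appealing to a formula, which is also a natural bridge to the classical t-test.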

Bayesian texts require a bit more math. I’ll post a few recommendations later.


I think we have to make a distinction between medical students and other biomedical students.

For medical students, the first step is to make them users of the literature, not creators of it, so what they need is a robust course in EBM that builds an intuitive understanding of statistical concepts. They need to understand key concepts in both epidemiology and biostatistics and grasp their meaning well, but they do not necessarily need an advanced mathematics and calculus course. For example, they should clearly understand what a P value from a test is, what it means, and how best to interpret it, but not necessarily its mathematical background. That is a useful skill for a physician researcher, who can then do a quantitative Masters or PhD, but not for a medical student.


Thank you for the response.
I’m sorry if I wasn’t clear.
These students are usually very engaged in clinical epidemiology/EBM/research methodology topics. It is through courses, disciplines and other initiatives that I meet them.
I think they are the ones who want to go a step further so that they can perform analyses themselves in the future. Perhaps this would be the first step before they start a master’s in the field.