Appropriate statistical analyses for predicting response to treatment



This is my first time posting here. Very excited!

My question… What are some appropriate statistical analyses for predicting response to treatment, other than regression? Can likelihood ratios be used, if study participants are classified as “responders” and “non-responders”? What are some options? I am especially interested in methods for analyzing data from a relatively small sample size.


It is not appropriate to classify participants as ‘responders’ or ‘non-responders’ unless this is how the fundamental data are generated, e.g., the outcome is all-or-nothing. As Stephen Senn has written about so extensively, it is usually impossible to really get a grip on ‘response’ for individual patients.

Changing methods doesn’t help with small sample size. In some cases, using methods that cross-classify the data results in an even smaller effective sample size, because cross-classification intrinsically allows for more interactions than we would usually have in a regression model.
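To make the cross-classification point concrete, here is a rough sketch that counts how thin the data get. It assumes binary predictors and a hypothetical sample of 60 participants; the function name and numbers are illustrative only. A full cross-classification of k binary predictors has 2^k cells (equivalent to a model with every interaction), while an additive regression model estimates only k + 1 parameters.

```python
def cross_classification_summary(n_participants, n_binary_predictors):
    """Compare a full cross-classification with an additive regression model."""
    # Every combination of predictor levels is its own cell (all interactions allowed).
    cells = 2 ** n_binary_predictors
    # Additive model: one intercept plus one coefficient per predictor.
    additive_params = n_binary_predictors + 1
    return {
        "cells": cells,
        "avg_per_cell": n_participants / cells,
        "additive_params": additive_params,
    }

# Hypothetical small study: 60 participants, 1 to 6 binary predictors.
summaries = {k: cross_classification_summary(60, k) for k in range(1, 7)}
for k, s in summaries.items():
    print(k, s["cells"], round(s["avg_per_cell"], 2), s["additive_params"])
```

With five binary predictors the 60 participants are spread over 32 cells, fewer than two per cell on average, whereas the additive model spends only six parameters.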

Likelihood ratios are mainly useful with a binary outcome and a binary test. There they are useful for better understanding the test, and have major advantages over sensitivity and specificity. For binary X and binary Y, the odds ratio in a binary logistic model is the ratio of the likelihood ratio positive to the likelihood ratio negative. But for the multivariable situation, statistical models are generally preferred over likelihood ratios.
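For a single binary test and a binary outcome, the odds ratio equals the positive likelihood ratio divided by the negative likelihood ratio. A small sketch can check this numerically. The 2x2 counts below are made up for illustration, and the function name is hypothetical.

```python
def likelihood_ratios(tp, fp, fn, tn):
    """Return (LR+, LR-, odds ratio) from a 2x2 table of test result vs. outcome."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = sensitivity / (1 - specificity)   # P(test+ | outcome) / P(test+ | no outcome)
    lr_neg = (1 - sensitivity) / specificity   # P(test- | outcome) / P(test- | no outcome)
    odds_ratio = (tp * tn) / (fp * fn)         # cross-product ratio of the table
    return lr_pos, lr_neg, odds_ratio

# Hypothetical counts: 50 participants with the outcome, 50 without.
lr_pos, lr_neg, odds_ratio = likelihood_ratios(tp=30, fp=10, fn=20, tn=40)
print(lr_pos, lr_neg, odds_ratio, lr_pos / lr_neg)
```

Up to floating-point rounding, `lr_pos / lr_neg` matches `odds_ratio`, which is also what exponentiating the slope of a logistic model with this one binary predictor would give.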

Hope we hear from others.


What about Bayesian regression?


The key is getting as perfect a dependent variable as possible. Then the choice of frequentist vs. Bayesian regression comes down to things such as

  • If you have prior information about one or more regression coefficients you can use that in Bayesian modeling
  • If you have too many predictors and need to discount their effects to reduce overfitting, Bayesian shrinkage priors are the best approach IMHO
  • Bayes also gives greater accuracy in cases where the log likelihood function is far from quadratic, i.e., far from what a normal distribution dictates

Bayesian modeling per se doesn’t float or sink the project.
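As a toy illustration of the first two points (prior information and shrinkage), here is a sketch of the textbook conjugate-normal update for a single slope. It assumes a no-intercept linear model with known residual variance, which is a simplification chosen so the posterior has a closed form. The data and prior settings are invented, and the function name is hypothetical.

```python
def posterior_slope(x, y, sigma2, prior_mean, prior_var):
    """Posterior mean and variance of b in y = b*x + noise, with b ~ Normal(prior_mean, prior_var).

    Assumes the noise variance sigma2 is known (a simplification for illustration).
    """
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    # Precisions (inverse variances) from the data and from the prior add up.
    precision = sxx / sigma2 + 1.0 / prior_var
    mean = (sxy / sigma2 + prior_mean / prior_var) / precision
    return mean, 1.0 / precision

# Invented data with a true slope near 2.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-3.9, -2.1, 0.2, 1.8, 4.1]

ols_slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# A skeptical prior centered at 0 pulls the estimate toward 0 (shrinkage).
shrunk_mean, shrunk_var = posterior_slope(x, y, sigma2=1.0, prior_mean=0.0, prior_var=0.1)

# A very diffuse prior leaves the estimate close to ordinary least squares.
diffuse_mean, _ = posterior_slope(x, y, sigma2=1.0, prior_mean=0.0, prior_var=1e6)

print(ols_slope, shrunk_mean, diffuse_mean)
```

The tighter the prior around zero, the more the slope is discounted. With many predictors the same idea is applied to every coefficient, usually through MCMC software rather than a closed form.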


My experience is using frequentist statistical approaches. But, I am very interested in learning more about Bayesian statistics. Do you have any suggestions for someone who wants to learn Bayesian methods?


Richard McElreath's Statistical Rethinking.


I would like to add to what @f2harrell wrote. The key is how you define response. A yes/no response is tricky, and most of the time a prediction model built on it will perform poorly. If you can use a quantitative outcome instead, you will do much better.
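A quick simulation can show how much information is lost by turning a quantitative outcome into yes/no. This is only a sketch under assumed conditions: two arms of 20 participants, a normally distributed outcome with known SD of 1, a treatment effect of 0.8 SD, and "response" defined as an outcome above 0. All of these numbers are hypothetical, and simple z-tests stand in for a full analysis.

```python
import math
import random

def simulate_power(n_per_arm=20, effect=0.8, n_sims=2000, seed=1):
    """Estimate power when analyzing the outcome as continuous vs. dichotomized."""
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided 5% critical value
    hits_continuous = 0
    hits_binary = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        treated = [rng.gauss(effect, 1.0) for _ in range(n_per_arm)]

        # Continuous analysis: z-test on the mean difference (SD assumed known = 1).
        diff = sum(treated) / n_per_arm - sum(control) / n_per_arm
        if abs(diff / math.sqrt(2.0 / n_per_arm)) > z_crit:
            hits_continuous += 1

        # Dichotomized analysis: "responder" if outcome > 0, two-proportion z-test.
        p_treated = sum(v > 0 for v in treated) / n_per_arm
        p_control = sum(v > 0 for v in control) / n_per_arm
        pooled = (p_treated + p_control) / 2.0
        se = math.sqrt(pooled * (1.0 - pooled) * 2.0 / n_per_arm)
        if se > 0 and abs(p_treated - p_control) / se > z_crit:
            hits_binary += 1
    return hits_continuous / n_sims, hits_binary / n_sims

power_continuous, power_binary = simulate_power()
print(power_continuous, power_binary)
```

Under these assumptions the continuous analysis detects the effect noticeably more often than the responder analysis, using exactly the same participants.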

But there is also another issue you need to consider: how do you define the exposure? If you are in a clinical trial setting and adherence is acceptable, then you are good. In observational studies, the exposure definition has a considerable impact on the quality of the model. I do not have a specific reference right now, but I recommend looking at recent work by M. Abrahamowicz from McGill University.