This is my first experience using datamethods.org. Very excited!
My question… What are some appropriate statistical analyses for predicting response to treatment, other than regression? Can likelihood ratios be used, if study participants are classified as “responders” and “non-responders”? What are some options? I am especially interested in methods for analyzing data from a relatively small sample size.
It is not appropriate to classify participants as ‘responders’ or ‘non-responders’ unless this is how the fundamental data are generated, e.g., the outcome is all-or-nothing. As Stephen Senn has written about so extensively, it is usually impossible to really get a grip on ‘response’ for individual patients.
Changing methods doesn’t help with small sample size. In some cases, using methods that cross-classify the data results in an even smaller effective sample size, because cross-classification intrinsically allows for more interactions than we would usually include in a regression model.
Likelihood ratios are mainly useful with a binary outcome and a binary test. There they are useful for better understanding the test, and have major advantages over sensitivity and specificity. For binary X and binary Y, the odds ratio in a binary logistic model is the ratio of the likelihood ratio positive to the likelihood ratio negative. But for the multivariable situation, statistical models are generally preferred over likelihood ratios.
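For binary X and binary Y, the link between the odds ratio and the two likelihood ratios can be checked numerically. The sketch below uses made-up counts for a 2x2 table (the function name and the numbers are illustrative assumptions, not data from any study) and shows that the odds ratio equals LR+ divided by LR-.

```python
import math

def two_by_two_summary(tp, fp, fn, tn):
    """Summarize a 2x2 table of binary test (X) vs. binary outcome (Y).

    tp, fp, fn, tn are hypothetical cell counts:
    true positives, false positives, false negatives, true negatives.
    """
    sens = tp / (tp + fn)            # P(test+ | outcome+)
    spec = tn / (tn + fp)            # P(test- | outcome-)
    lr_pos = sens / (1 - spec)       # likelihood ratio positive
    lr_neg = (1 - sens) / spec       # likelihood ratio negative
    odds_ratio = (tp * tn) / (fp * fn)
    return {"sens": sens, "spec": spec,
            "lr_pos": lr_pos, "lr_neg": lr_neg,
            "odds_ratio": odds_ratio}

# Illustrative counts only
s = two_by_two_summary(tp=30, fp=10, fn=20, tn=40)
print(s)

# The odds ratio matches LR+ / LR- up to floating-point error
print(math.isclose(s["lr_pos"] / s["lr_neg"], s["odds_ratio"]))
```

With a single binary predictor, the exponentiated slope of a binary logistic model is this same odds ratio, which is why the two views coincide only in the 2x2 case.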
My experience is with frequentist statistical approaches, but I am very interested in learning more about Bayesian statistics. Do you have any suggestions for someone who wants to learn Bayesian methods?
I would like to add to what @f2harrell wrote. The key is how you define response. A yes/no response is tricky, and most of the time a prediction model built on it will perform poorly. If you can use a quantitative outcome instead, you will do much better.
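One way to see the information lost by a yes/no response is a small simulation. The sketch below is only an illustration under stated assumptions: a single simulated predictor, a normally distributed outcome, and a "responder" label defined by a median split, all of which are hypothetical choices. It compares how strongly the predictor is associated with the quantitative outcome versus the dichotomized one.

```python
import random
import statistics

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

random.seed(1)   # fixed seed so the run is reproducible
n = 2000         # assumed sample size for the illustration

# Simulated predictor and quantitative outcome (assumed linear relationship)
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.5 * xi + random.gauss(0, 1) for xi in x]

# Hypothetical "responder" definition: outcome above the sample median
cut = statistics.median(y)
responder = [1.0 if yi > cut else 0.0 for yi in y]

r_continuous = pearson(x, y)
r_dichotomized = pearson(x, responder)

print(f"correlation with quantitative outcome: {r_continuous:.3f}")
print(f"correlation with responder yes/no:     {r_dichotomized:.3f}")
```

In this setup the association with the yes/no label comes out weaker than with the quantitative outcome, even though both are derived from the same data.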
But there is another issue you need to consider: how do you define the exposure? If you are in a clinical trial setting and adherence is acceptable, then you are good. In observational studies, the exposure definition has a considerable impact on the quality of the model. I do not have a specific reference right now, but I recommend looking at recent work by M. Abrahamowicz from McGill University.