BBR Session 9: Logistic Regression for Comparing Two Proportions

f2harrell · January 8, 2020, 5:35pm

This is a place for questions, answers, and discussion about session 9 of Biostatistics for Biomedical Research, airing 2020-01-10, which introduces logistic regression and uses it to compare two proportions. Session topics are listed here. The video will be in the YouTube BBRcourse Channel and will be directly available here after the broadcast. Bayesian logistic regression is also introduced.

ckogan · January 10, 2020, 4:31pm

Following up from a discussion during the video:
Question: I’ve run into confusion when trying to explain chunk tests to people who are familiar with ANOVA but have trouble understanding them. Has anyone else had this issue?

Response: Chunk test = joint test of more than one parameter; often best envisioned by seeing the damage done to the overall model likelihood ratio (LR) chi-square by deleting all the variables in the chunk. Worth more discussion.

We’ve ended up explaining chunk test in text as: We used likelihood ratio tests to assess the overall combined association (main effects + interaction) between a variable and the estimated tendency (expressed as log odds) …

I don’t remember the exact nature of the confusion, but I believe that in discussions it often gets confused with the test of the main effect. For example, “drop all terms in the model involving variable X” is frequently interpreted as a test of the main effect of X by people who are familiar with ANOVA but rusty on thinking about the actual model with parameters.

Any suggestions/thoughts as to whether this is a good way of explaining it in text?

f2harrell · January 11, 2020, 1:29pm

I think your statement about the likelihood ratio test is good. We just need to generalize that to handle Wald chunk tests. We could dispense with the phrases “chunk test” and “composite test” and just describe exactly what we are testing. Here are such descriptions for the two most commonly used types of chunk tests:

We did a combined test of the effects of weight and waist circumference with 4 d.f. (because we assumed a quadratic fit in both weight and waist) that tests the null hypothesis that neither weight nor waist is associated with Y.
We did an overall test of association between age and Y in the model containing an age x sex interaction, with 2 d.f. (age main effect + age x sex interaction effect). This tests whether age is associated with Y for either sex, allowing the slope of age to differ by sex.
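
To make the second description concrete, here is one way the 2 d.f. age test could be written out. This is only a sketch under assumptions not stated above: Y is binary, age enters linearly, sex is coded 0/1, and the symbols are illustrative.

```latex
\begin{align*}
\operatorname{logit} P(Y = 1 \mid \text{age}, \text{sex})
  &= \beta_0 + \beta_1\,\text{age} + \beta_2\,\text{sex}
     + \beta_3\,(\text{age} \times \text{sex}) \\
H_0 &: \beta_1 = \beta_3 = 0 \\
\text{LR } \chi^2 &= 2\,(\ell_{\text{full}} - \ell_{\text{reduced}})
  \quad \text{with 2 d.f.}
\end{align*}
```

Here the reduced model keeps only the intercept and the sex term. With this coding, the slope of age is \beta_1 when sex = 0 and \beta_1 + \beta_3 when sex = 1, so a test of \beta_1 alone is not an overall test of age, which may be where the ANOVA "main effect" reading goes astray.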

Chunk tests are more general than ANOVA as exemplified here:

For our study of 4 treatments (placebo, drug A, drug B, drug C) we did the ANOVA-style test with 3 d.f. to test whether there is a true difference between any two treatments, and also used the same statistical model to obtain the 2 d.f. test for differences between drugs A, B, and C.
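
The two tests in the four-treatment example can be sketched as likelihood ratio chunk tests. This is a minimal illustration, not the analysis from the session: it assumes a binary outcome (in keeping with the logistic regression topic), uses made-up counts, and relies on the fact that with treatment as the only predictor the maximized log-likelihoods have closed forms (observed proportions). The names `chi2_sf`, `max_loglik`, and `counts` are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square with integer d.f. (closed form)."""
    y = x / 2.0
    if df % 2 == 0:
        return math.exp(-y) * sum(y**j / math.factorial(j) for j in range(df // 2))
    extra = sum(y**(j - 0.5) / math.gamma(j + 0.5)
                for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * extra

def max_loglik(counts, blocks):
    """Maximized binomial log-likelihood when arms in the same block share one probability."""
    total = 0.0
    for block in blocks:
        events = sum(counts[arm][0] for arm in block)
        n = sum(counts[arm][1] for arm in block)
        p = events / n  # MLE of the shared probability
        if events > 0:
            total += events * math.log(p)
        if n - events > 0:
            total += (n - events) * math.log(1.0 - p)
    return total

# Hypothetical data: (events, patients) per arm. Not from any real study.
counts = {"placebo": (30, 100), "A": (20, 100), "B": (18, 100), "C": (22, 100)}

full = max_loglik(counts, [["placebo"], ["A"], ["B"], ["C"]])       # 4 separate probabilities
no_treatment = max_loglik(counts, [["placebo", "A", "B", "C"]])     # delete all 3 treatment terms
drugs_equal = max_loglik(counts, [["placebo"], ["A", "B", "C"]])    # force A = B = C

lr_any = 2 * (full - no_treatment)    # 3 d.f.: any difference among the 4 arms
lr_drugs = 2 * (full - drugs_equal)   # 2 d.f.: differences among drugs A, B, C

print(f"Any difference:  LR chi-square = {lr_any:.2f}, 3 d.f., P = {chi2_sf(lr_any, 3):.3f}")
print(f"Among A, B, C:   LR chi-square = {lr_drugs:.2f}, 2 d.f., P = {chi2_sf(lr_drugs, 2):.3f}")
```

Each statistic is the drop in the model's log-likelihood (times 2) caused by imposing the restriction, which is the "damage done" view of a chunk test. Both tests come from the same full model; only the restriction changes.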

Daniel_Brewer · January 17, 2020, 2:42pm

Why did you choose to use a normal distribution as a prior rather than a t distribution? https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations seems to recommend a t distribution. I really like the Bayesian versions of common models, but I feel really uncertain about how to decide on the priors.

f2harrell · January 17, 2020, 4:07pm

Good question. I start with P(effect > z), which I control through the prior variance. You could just as well control this probability with the variance and the degrees of freedom of a t distribution jointly, but that may be slight overkill.
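
As a rough illustration of controlling P(effect > z) through the variance, the sketch below solves for the standard deviation of a normal prior. It assumes the prior is centered at zero, which is not stated above, and the function name `normal_prior_sd` and the numbers used are hypothetical.

```python
import math
from statistics import NormalDist

def normal_prior_sd(z, tail_prob):
    """SD of a mean-zero normal prior (assumed) such that P(effect > z) = tail_prob."""
    if z <= 0 or not 0 < tail_prob < 0.5:
        raise ValueError("need z > 0 and 0 < tail_prob < 0.5")
    # P(effect > z) = 1 - Phi(z / sd)  =>  sd = z / Phi^{-1}(1 - tail_prob)
    return z / NormalDist().inv_cdf(1.0 - tail_prob)

# Hypothetical prior belief: the log odds ratio exceeds log(2) with probability 0.1
z = math.log(2)
sd = normal_prior_sd(z, 0.1)

# Verify the tail probability implied by the solved SD
check = 1.0 - NormalDist(0.0, sd).cdf(z)
print(f"prior SD = {sd:.3f}, implied P(effect > log 2) = {check:.3f}")
```

With a normal prior there is one free parameter, so a single tail probability pins it down. A t prior would need a second statement (or a fixed choice of degrees of freedom) to be determined.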