Is logistic regression with a cluster sandwich covariance estimator the same as mixed effects modeling?

Hi Dr Harrell,
I am trying to model very rare patient outcomes (e.g., revision surgery for dislocation in orthopaedics) using logistic regression. I believe that our high-volume surgeons play a role in whether the patient has an outcome, along with other variables: surgical approach (the primary variable of interest), age, BMI, sex, and comorbidities. I want to model surgeons (13 of them, with sample sizes ranging from 20 to 200 each) as a random effect instead of a fixed effect because I don’t really need to report how the probability of revision surgery differs between surgeons.

Question 1: Does logistic regression with a cluster sandwich covariance estimator imply the same thing as a mixed effects model with a random effect (specifically glmer() from the lme4 package)?

Question 2: How do I interpret the odds ratio from logistic regression with a cluster sandwich covariance estimator, knowing that doctors play a major part in whether the patient has an outcome?

Question 3: Because the outcome is rare, I am considering a Bayesian approach. When I use blrm(), I can include cluster(doctor) in the model. However, the cluster function doesn’t do anything in lrm(); I would need to run robcov() afterwards to account for the random effect. Is this observation correct?

Thank you!

When the random effects are not close to zero, i.e., when there is a good deal of outcome heterogeneity across patients, the random effects model estimates different fixed effects than a model with only fixed effects, which is marginal with respect to (averaged over) patient effects. This is related to non-collapsibility of the odds ratio. When there is large heterogeneity that is not accounted for, the remaining (fixed) effects will tend to be underestimated, i.e., attenuated toward zero on the log odds scale.
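To make the non-collapsibility point concrete, here is a minimal numeric sketch in Python (standard library only). It is not a model fit: it assumes a normally distributed random intercept and made-up values (a conditional odds ratio of 2, a baseline log odds of -3 for a rare outcome), averages the conditional probabilities over the random intercept on a grid, and then computes the odds ratio from the averaged probabilities. All function names are hypothetical.

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def marginal_prob(linear_predictor, sigma, n_grid=4001, width=8.0):
    """Average expit(linear_predictor + b) over b ~ Normal(0, sigma^2).

    Uses a normalized weighted sum on an evenly spaced grid spanning
    +/- `width` standard deviations of the random intercept.
    """
    if sigma == 0:
        return expit(linear_predictor)
    lo = -width * sigma
    step = 2.0 * width * sigma / (n_grid - 1)
    num = 0.0
    den = 0.0
    for i in range(n_grid):
        b = lo + i * step
        w = math.exp(-0.5 * (b / sigma) ** 2)  # unnormalized normal density
        num += w * expit(linear_predictor + b)
        den += w
    return num / den

def odds(p):
    return p / (1.0 - p)

def marginal_odds_ratio(intercept, log_or, sigma):
    """Odds ratio computed after averaging probabilities over the random intercept."""
    p0 = marginal_prob(intercept, sigma)
    p1 = marginal_prob(intercept + log_or, sigma)
    return odds(p1) / odds(p0)

# Assumed illustrative values: conditional OR = 2, baseline log odds = -3.
for sigma in (0.0, 1.0, 2.0):
    m = marginal_odds_ratio(-3.0, math.log(2.0), sigma)
    print(f"random-intercept SD {sigma}: marginal OR = {m:.3f}")
```

With no heterogeneity (SD of 0) the marginal and conditional odds ratios coincide; as the SD grows, the marginal odds ratio moves toward 1 even though the conditional odds ratio is 2 for every value of the random intercept.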

When you use an after-the-fact cluster sandwich adjustment for the variance-covariance matrix, the odds ratios are marginal with respect to patients and doctors.

Bayesian random effects models tend to work better than frequentist random effects models.

A separate issue is the form of the within-patient correlation for revision procedures. Random effects assume a constant correlation regardless of time gap. Other approaches may fit the correlation pattern a bit better.
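The constant-correlation assumption can be written out for the random-intercept logistic model. The notation here is an assumption made for illustration (cluster $i$, observation $j$, random intercept $b_i$ with variance $\sigma_b^2$), and the correlation is on the latent logistic scale, where the residual variance is $\pi^2/3$:

```latex
\operatorname{logit} P(Y_{ij} = 1 \mid b_i) = x_{ij}^{\top}\beta + b_i,
\qquad b_i \sim N(0, \sigma_b^2),
\qquad
\rho_{jk} = \frac{\sigma_b^2}{\sigma_b^2 + \pi^2/3}
\quad \text{for all } j \neq k .
```

Nothing in $\rho_{jk}$ depends on the time between observations $j$ and $k$, which is the sense in which the correlation is constant regardless of time gap.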

The descriptions of the difference between marginal and conditional effects that I have read so far say that conditional effects apply to the individual patient while marginal effects apply to the population. Presumably the two are equivalent when the estimand is collapsible (relative risk, linear regression coefficient).
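One way to check that presumption numerically is the small Python sketch below (standard library only). It assumes three equally likely strata of an unmeasured intercept shift that are independent of treatment, with made-up values throughout, and compares a log-link model (where the effect is a risk ratio) with a logit-link model (where the effect is an odds ratio). Both models use the same conditional effect of 2 in every stratum. All names are hypothetical.

```python
import math

# Hypothetical strata of an unmeasured intercept shift, equally likely and
# independent of treatment (assumptions made only for this illustration).
SHIFTS = (-1.5, 0.0, 1.5)
BASE = math.log(0.05)   # baseline linear predictor in the middle stratum
EFFECT = math.log(2.0)  # conditional treatment effect on the link scale

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def population_risk(inverse_link, treated):
    """Risk averaged over the strata for treated = 0 or 1."""
    risks = [inverse_link(BASE + EFFECT * treated + s) for s in SHIFTS]
    return sum(risks) / len(risks)

def population_risk_ratio(inverse_link):
    return population_risk(inverse_link, 1) / population_risk(inverse_link, 0)

def population_odds_ratio(inverse_link):
    p1 = population_risk(inverse_link, 1)
    p0 = population_risk(inverse_link, 0)
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))

# Log link: the conditional risk ratio is 2 in every stratum.
print("log link, population risk ratio:", population_risk_ratio(math.exp))
# Logit link: the conditional odds ratio is 2 in every stratum.
print("logit link, population odds ratio:", population_odds_ratio(expit))
```

Under these assumptions the population risk ratio from the log-link model matches the stratum-specific value, while the population odds ratio from the logit-link model falls below the stratum-specific odds ratio of 2.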

When you refer to a random effect, do you mean just random intercept or also random slopes?

We were originally talking about random intercepts but the discussion applies to both. Note that in the nonlinear model situation where there is strong patient heterogeneity, marginal estimates (of say the fixed treatment effect) do not apply even to a population.

Why not even to a population?

Because with non-collapsibility of the odds ratio, the fixed effect odds ratio is an underestimate. Let’s say you omit important random effects but adjust for age and compare two treatments. You want the treatment comparison to be for people of the same age, but it’s not quite that.
