Is conditional logistic regression necessary in a case-control study?

JiaqiLi · March 25, 2026, 2:07am

https://www.bmj.com/content/352/bmj.i969

Here are two key points in this BMJ article: first, conditional logistic regression is not essential unless the data are sparse; second, it remains necessary to control for confounding factors after matching.

I find these arguments quite persuasive, although they contradict the opinions of many researchers. I would like to hear further perspectives on this. Thank you very much.

f2harrell · March 25, 2026, 11:27am

There is an old literature on this, which the reference above captures fairly well. I asked Norm Breslow about it at a seminar he presented at UNC when I was a student, and his reply was clear: if your model is saturated, e.g., you model age with many nonlinear terms, you’ll get the same inference for the exposure from the conventional logistic model as you do from the conditional one. So the conditional model is best used when the adjustment factors are hard to model, e.g., you need to match on occupation (and there are hundreds of them in the data) or you’re doing studies on twins.
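As a rough illustration of why the conditional model suits matching factors that are hard to model, here is a minimal Python sketch for 1:1 matched pairs. It assumes a single binary exposure and uses made-up pair counts; the function name `conditional_logit_1to1` is hypothetical and not taken from the BMJ article or any library. Within a pair, the stratum-specific intercept (occupation, twin pair, and so on) cancels, so the conditional likelihood depends only on the within-pair exposure difference.

```python
import math


def conditional_logit_1to1(pairs, iterations=25):
    """Conditional logistic MLE for 1:1 matched pairs.

    pairs: list of (x_case, x_control) exposure values.
    Each pair contributes expit(beta * (x_case - x_control)) to the
    conditional likelihood; the pair-specific intercept has cancelled.
    Returns the log odds ratio found by Newton-Raphson.
    """
    diffs = [x_case - x_control for x_case, x_control in pairs]
    beta = 0.0
    for _ in range(iterations):
        score = 0.0  # first derivative of the conditional log-likelihood
        info = 0.0   # negative second derivative
        for d in diffs:
            p = 1.0 / (1.0 + math.exp(-beta * d))
            score += d * (1.0 - p)
            info += d * d * p * (1.0 - p)
        if info == 0.0:  # only concordant pairs: beta is not estimable
            break
        beta += score / info
    return beta


# Hypothetical counts: 30 pairs with only the case exposed, 15 with only
# the control exposed, and 100 concordant pairs that carry no information.
pairs = [(1, 0)] * 30 + [(0, 1)] * 15 + [(1, 1)] * 40 + [(0, 0)] * 60
beta_hat = conditional_logit_1to1(pairs)

print(f"conditional odds ratio: {math.exp(beta_hat):.3f}")
print(f"discordant-pair ratio:  {30 / 15:.3f}")
```

For a binary exposure the estimate should agree with the ratio of discordant pairs, and no term for the matching factor ever has to be written down. When the matching factor is something simple like age, an ordinary logistic model with a flexible age term is the alternative discussed above.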

There are lessons to be learned from propensity matching, where researchers have fallen into the bad habit of assuming that you don’t need to adjust for prognostic factors outside of the matching (pure confounding adjustment).

Ertan · March 30, 2026, 11:59am

@f2harrell Based on the article mentioned by @JiaqiLi and your explanation, could I draw the following conclusion? When confounding factors are expected to influence the outcome, rather than ‘forcing’ homogeneity between two groups (e.g., matching for age and gender) and performing a t-test or Mann-Whitney U test, it is statistically more robust to employ a mixed-effects model. By incorporating these confounders into the model as fixed effects, we can account for their influence more accurately while preserving the integrity of the data. Is this reasoning correct?

f2harrell · March 30, 2026, 7:50pm

I’m unclear where the random effects come from.

Ertan · March 30, 2026, 8:48pm

@f2harrell Pardon me, I am asking for clarification. Were you criticizing the logic and structure of the mixed-effects model analysis, or saying that it should not be used in the case mentioned?

f2harrell · March 30, 2026, 10:01pm

Random effects would be used when there is clustering or possibly longitudinal data. Remind me of the study setup that warrants this.

Ertan · March 31, 2026, 6:53am

You are right, @f2harrell. I should have been more specific about the source of the random effects. I was thinking of multi-center studies or scenarios where patients are nested within specific surgeons or clinics.

My main point was that, instead of traditional matching and simple tests, it seems more robust to use a model that accounts for both individual confounders (as fixed effects) and potential clustering by hospital or surgeon (as random effects), if the study setup warrants it. Would you agree that this approach preserves more information and provides a more accurate estimate of the exposure effect compared to traditional pair-matching analysis?

f2harrell · March 31, 2026, 11:23am

Yes, and thanks for clarifying. Matching techniques ignore outcome heterogeneity. When prognostic factors that are easy to account for are left out of the model, effects estimated on scales such as odds ratios or hazard ratios will be attenuated towards 1.0.
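The attenuation can be seen with a small exact calculation. The Python sketch below assumes a logistic outcome model with a binary exposure and one binary prognostic factor `z` that is independent of the exposure (so it is not a confounder); all coefficients are made up for illustration, and `marginal_odds_ratio` is a hypothetical helper name. The conditional odds ratio is fixed at 2.0, and the code computes the odds ratio one would get after ignoring `z`.

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def marginal_odds_ratio(beta_exposure, beta_z, intercept=-1.0, p_z=0.5):
    """Exposure odds ratio after averaging over an omitted factor z.

    Assumed model: logit P(Y=1) = intercept + beta_exposure*x + beta_z*z,
    with z binary, prevalence p_z, and independent of the exposure x.
    """
    def risk(x):
        # Average the outcome risk over the distribution of z.
        return ((1.0 - p_z) * expit(intercept + beta_exposure * x)
                + p_z * expit(intercept + beta_exposure * x + beta_z))

    p1, p0 = risk(1), risk(0)
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))


beta_exposure = math.log(2.0)  # conditional (z-adjusted) odds ratio of 2.0
for beta_z in (0.0, 1.0, 3.0):
    unadjusted = marginal_odds_ratio(beta_exposure, beta_z)
    print(f"log OR of z = {beta_z:.1f}: unadjusted exposure OR = {unadjusted:.3f}")
```

Under these assumptions the unadjusted odds ratio equals 2.0 only when `z` has no effect on the outcome, and it moves closer to 1.0 as `z` becomes more strongly prognostic, even though `z` is balanced across exposure groups.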