Conditional logistic regression analysis with some matched sets broken when stratified by another variable

Hui · July 23, 2021, 8:48pm

Dear all,
When I have 1:1 matched data and fit a conditional logistic regression, I always want a stratified analysis as well (for example, separate analyses for two subsets: smokers and non-smokers). Stratifying can break a matched set if smoking was not a matching factor (for example, within one matched set, smoking=1 for the case and smoking=0 for the control). What I do now is exclude the broken matched sets from the analysis if the proportion of broken sets is small. But I believe this is not a good method, especially when many sets are broken. I have googled this issue several times and did not find any related information. Is there any method to deal with this issue? Thanks!
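To make "broken" concrete, here is a minimal sketch of how one might flag the sets whose members fall into different strata and see what fraction they are. The column names (`set_id`, `case`, `smoking`), the helper `split_matched_sets`, and the toy rows are all made up for illustration; they are not from a real data set.

```python
from collections import defaultdict


def split_matched_sets(rows, set_key="set_id", stratum_key="smoking"):
    """Separate intact matched sets from broken ones.

    A set counts as broken when its members do not all share the
    same value of the stratification variable.
    """
    sets = defaultdict(list)
    for row in rows:
        sets[row[set_key]].append(row)

    intact, broken = {}, {}
    for set_id, members in sets.items():
        strata = {m[stratum_key] for m in members}
        target = intact if len(strata) == 1 else broken
        target[set_id] = members
    return intact, broken


# Hypothetical 1:1 matched data: one case and one control per set.
rows = [
    {"set_id": 1, "case": 1, "smoking": 1},
    {"set_id": 1, "case": 0, "smoking": 1},
    {"set_id": 2, "case": 1, "smoking": 1},  # case smokes ...
    {"set_id": 2, "case": 0, "smoking": 0},  # ... control does not: broken
    {"set_id": 3, "case": 1, "smoking": 0},
    {"set_id": 3, "case": 0, "smoking": 0},
]

intact, broken = split_matched_sets(rows)
broken_fraction = len(broken) / (len(intact) + len(broken))
print("intact sets:", sorted(intact))
print("broken sets:", sorted(broken))
print("fraction broken:", round(broken_fraction, 3))
```

Only the intact sets can contribute to a within-stratum conditional analysis, so the fraction broken shows how much of the matched data a stratified analysis would leave unused.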

s_doi · July 24, 2021, 6:56pm

If it is important to show the results by stratum, the only thing I can think of is to match on the stratum variable first (i.e. include the stratum variable among the matching variables). I am not sure whether this introduces any problems if this variable is not a confounder.

Hui · July 30, 2021, 9:51pm

Thank you, s_doi, for your input.
This is always an issue in epidemiological data analysis. Sometimes I just exclude the broken sets from the stratified analysis if the number of broken sets is small. But I don’t think this is a good approach.
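For reference, a sketch of why a broken set carries no information inside a stratum, using the standard 1:1 conditional likelihood. The notation is assumed here, not taken from the thread: $x_{i1}$ and $x_{i0}$ are the covariates of the case and the control in matched set $i$.

$$
L_i(\beta) = \frac{\exp(\beta^\top x_{i1})}{\exp(\beta^\top x_{i1}) + \exp(\beta^\top x_{i0})}
$$

If the stratum subset keeps only the case from set $i$, the denominator has a single term and $L_i(\beta) = 1$ for every $\beta$; if it keeps only the control, the set has no case and contributes no term at all. Either way the set drops out of the within-stratum estimate, so excluding broken sets by hand should give the same fit as leaving the leftover singletons in the subset.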

f2harrell · July 31, 2021, 12:22pm

How was the matching done in the first place? Matching is useful (saves resources) when it’s expensive to sample all individuals. Whenever I see matching done in a way that discards already paid-for data, I think there is a better way.

Hui · July 31, 2021, 7:16pm

Thank you, Frank! I totally agree. I think the matching itself was done in the usual way. But sometimes we want to explore how an unmatched factor affects the association, and a stratified analysis by levels of that unmatched factor can break matched sets.

f2harrell · July 31, 2021, 8:34pm

So just to confirm, the matching did not discard any data already collected?

Hui · July 31, 2021, 9:19pm

No. We may randomly select eligible controls, depending on the aims of the study.