Is there any scenario where using GEE is a must? I mean one where using a GLM is not “appropriate”?
Since GEE is a large-sample procedure, does not provide a log likelihood, and does not provide insight into various model structures (especially within-subject correlation), I avoid it. But in some hard-to-model situations, I use GEE. For example, in looking at heart rate variability (HRV) in neonates, HRV is assessed every minute for days. We summarized it on a moving 6-hour window and used the summary of HRV to predict an event by a fixed time. Each neonate had multiple HRV summaries that were highly correlated because each 6h summary was obtained from overlapping 6h periods, advancing 1h each time. The outcome event was always the same 0/1 code within a neonate, so it was redundant across that neonate's records. Such a correlation structure would be hard to handle otherwise, but GEE with a robust cluster sandwich covariance estimator did the trick.
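To make the "robust cluster sandwich" idea concrete, here is a minimal sketch in plain Python. It is not the analysis from the HRV study: it assumes a linear model with one predictor and an independence working correlation (so the point estimates are just OLS), and the data and the name `fit_ols_cluster` are invented for illustration. The logistic case follows the same bread-meat-bread pattern but needs iterative fitting. The toy data copy each subject's record five times, an extreme form of within-subject redundancy.

```python
import math


def fit_ols_cluster(x, y, cluster):
    """Fit y = b0 + b1*x by OLS; return coefficients, naive SEs, cluster-robust SEs.

    Point estimates match a GEE with identity link and independence working
    correlation. No small-sample correction is applied to the sandwich.
    """
    n = len(y)
    # "Bread": inverse of X'X for the design matrix with columns (1, x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]

    # OLS coefficients: (X'X)^-1 X'y
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

    # Naive variance sigma^2 (X'X)^-1 treats every row as independent
    sigma2 = sum(r * r for r in resid) / (n - 2)
    naive_se = [math.sqrt(sigma2 * inv[0][0]), math.sqrt(sigma2 * inv[1][1])]

    # "Meat": sum over clusters of (X_c' r_c)(X_c' r_c)'
    scores = {}
    for xi, ri, ci in zip(x, resid, cluster):
        s = scores.setdefault(ci, [0.0, 0.0])
        s[0] += ri
        s[1] += xi * ri
    meat = [[0.0, 0.0], [0.0, 0.0]]
    for s in scores.values():
        for j in range(2):
            for k in range(2):
                meat[j][k] += s[j] * s[k]

    # Sandwich: bread * meat * bread
    tmp = [[sum(inv[j][m] * meat[m][k] for m in range(2)) for k in range(2)]
           for j in range(2)]
    sand = [[sum(tmp[j][m] * inv[m][k] for m in range(2)) for k in range(2)]
            for j in range(2)]
    robust_se = [math.sqrt(sand[0][0]), math.sqrt(sand[1][1])]
    return (b0, b1), naive_se, robust_se


# Hypothetical toy data: six subjects, one record each
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9]
ids = [0, 1, 2, 3, 4, 5]

# Same subjects, but every record repeated 5 times (perfect within-subject correlation)
reps = 5
x_rep = [v for v in x for _ in range(reps)]
y_rep = [v for v in y for _ in range(reps)]
ids_rep = [c for c in ids for _ in range(reps)]

coef_one, naive_one, robust_one = fit_ols_cluster(x, y, ids)
coef_rep, naive_rep, robust_rep = fit_ols_cluster(x_rep, y_rep, ids_rep)

print("slope SE, naive :", naive_one[1], "->", naive_rep[1])
print("slope SE, robust:", robust_one[1], "->", robust_rep[1])
```

Repeating the records adds no information, and the cluster-robust standard error reflects that by staying the same, whereas the naive standard error shrinks as if the sample had grown fivefold. Real overlapping-window data are not exact copies, but the same logic applies.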
To add to Frank’s comments, when dealing with repeated (e.g. longitudinal) or clustered measurements on the same observational units, I tend to look at mixed effects (LMM/GLMM) models, even when I am primarily interested in marginal effects. In R, I then use the emmeans package to generate specific contrasts and tests for various comparisons over time and/or between groups.
Dimitris Rizopoulos has a nice overview of both GEE and LMM/GLMM models that you might find of interest vis-a-vis conceptual foundations and contrasts between the approaches:
It is an R-centric presentation, but it provides a conceptual framework that is of value generally.
In addition to mixed effects models, look at models that handle serial correlation patterns more accurately than the compound symmetry that traditional mixed effects models assume. Two types of models to consider are generalized least squares and Markov models; Markov models are the most flexible.
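As a rough illustration of the difference, the sketch below builds the within-subject correlation matrix under compound symmetry and under AR(1), which is one common serial pattern (chosen here as an example, not the only option). The function names and the values of `n` and `rho` are arbitrary choices for illustration.

```python
def compound_symmetry(n, rho):
    """Every pair of distinct time points has the same correlation rho."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def ar1(n, rho):
    """Correlation decays with time separation: corr(i, j) = rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


# Five equally spaced measurements per subject, illustrative rho
cs = compound_symmetry(5, 0.8)
serial = ar1(5, 0.8)

for name, mat in (("compound symmetry", cs), ("AR(1)", serial)):
    print(name)
    for row in mat:
        print("  " + "  ".join(f"{v:.3f}" for v in row))
```

Under compound symmetry, measurements one step apart and four steps apart are equally correlated. Under a serial pattern, the correlation fades as the gap grows, which is often closer to how longitudinal measurements behave.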