AUC of each subgroup is smaller than overall AUC

I have a validation data set of 29242 patients, with known labels/health outcomes and predictions that were generated by some model. 28626 patients had a negative outcome and 616 had a positive outcome. The overall AUC is 0.7134.

The reviewers asked me to divide the validation data into two subgroups, defined by a pre-existing medical condition, and then to evaluate the prediction model on each subgroup separately.

Out of the 29242 patients, 4832 had this condition and 24410 did not.
The outcome (rows) by subgroup (columns) split is:

        0     1
  0 24080  4546
  1   330   286

When I applied the same prediction model to each subgroup separately, the AUCs were 0.612 and 0.655. That is, the AUC within each subgroup is smaller than the overall AUC. How is that possible?

One explanation I can think of is that the pre-existing medical condition is an important predictor in the original model (it has the second-highest SHAP value). Another explanation may relate to the more balanced outcome within the subgroup with the medical condition.
What do you think?

Your first explanation is the correct one. In general, when you stratify on an important variable, that variable can no longer help separate patients within a stratum, so the within-subset assessment is for a more difficult discrimination task. Hence the c-index, like R^2, will be smaller.
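One way to see this is to count pairs. The c-index is the proportion of (positive, negative) patient pairs that the model ranks correctly, so the overall value is a weighted average over four kinds of pairs: two within-subgroup kinds and two cross-subgroup kinds. The sketch below uses only the counts from the table in the question; the variable names are my own. The cross-subgroup concordances themselves are not reported, so only the weights can be computed.

```python
# Outcome-by-condition counts taken from the table in the question.
neg = {0: 24080, 1: 4546}  # outcome 0, keyed by condition status
pos = {0: 330, 1: 286}     # outcome 1, keyed by condition status

# Every (positive, negative) pair contributes equally to the overall c-index.
total_pairs = sum(pos.values()) * sum(neg.values())

# Share of all pairs for each (positive's subgroup, negative's subgroup) combination.
weights = {
    (gp, gn): pos[gp] * neg[gn] / total_pairs
    for gp in pos
    for gn in neg
}

within = weights[(0, 0)] + weights[(1, 1)]   # both patients in the same subgroup
across = weights[(0, 1)] + weights[(1, 0)]   # patients in different subgroups

for key, w in sorted(weights.items()):
    print(f"positive in subgroup {key[0]}, negative in subgroup {key[1]}: {w:.3f}")
print(f"within-subgroup pairs: {within:.3f}, cross-subgroup pairs: {across:.3f}")
```

A little under half of all pairs cross subgroups, and most of those pair a positive patient who has the condition with a negative patient who does not. If the condition pushes predicted risk up, those pairs are easy to rank, which lifts the overall c-index above both within-subgroup values.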

The reviewer’s request is unreasonable for a variety of reasons. If the subgrouping variable is thought to “mess something up” you should look at interactions with that variable in the context of the over-arching model.
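The stratification effect is easy to reproduce in a small simulation. What follows is only a sketch under assumptions of my own: a binary condition with about 17% prevalence, one additional standard-normal predictor, and arbitrary logistic coefficients. The true linear predictor stands in for a fitted model, and none of the numbers are estimated from the data in the question.

```python
import math
import random


def auc(scores, labels):
    """AUC (c-index) via the Mann-Whitney rank-sum identity, using midranks for ties."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    n_pos = sum(labels)
    n_neg = n - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def simulate(n=40000, seed=0):
    """Simulate outcomes from a logistic model; all coefficients are illustrative."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        g = 1 if rng.random() < 0.17 else 0   # pre-existing condition (assumed prevalence)
        x = rng.gauss(0.0, 1.0)               # one other predictor
        lp = -4.5 + 1.7 * g + 0.5 * x         # linear predictor, used as the model score
        y = 1 if rng.random() < 1.0 / (1.0 + math.exp(-lp)) else 0
        rows.append((g, lp, y))

    def subset_auc(keep):
        sub = [r for r in rows if keep(r)]
        return auc([r[1] for r in sub], [r[2] for r in sub])

    return {
        "overall": subset_auc(lambda r: True),
        "without_condition": subset_auc(lambda r: r[0] == 0),
        "with_condition": subset_auc(lambda r: r[0] == 1),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

With a strong condition effect like the one assumed here, the overall AUC should come out above both within-subgroup AUCs even though the same scores are used throughout. Nothing is wrong with the model in either subgroup; the overall figure simply gets credit for separating the two subgroups from each other.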