Propensity score matching vs multivariable regression

f2harrell · June 4, 2021, 12:31pm

For the life of me I can’t imagine a situation where marginal ATE is useful. So in my world view PS without covariate adjustment is simply choosing an incorrect estimand and I don’t care about its robustness. In my current understanding marginal ATE is a function of the distribution of all the covariates in the sample, so it does not estimate a population parameter unless you apply sampling weights to reweight the data according to the study’s sampling scheme. Since we seldom have these sampling weights, it’s somewhat of a moot point. See this for related ideas.

albertoca · June 4, 2021, 3:39pm

After reading the example in the link I have two basic doubts. Sorry for being naive.
If adjusting for sex prevents the loss of statistical power of the logistic regression, why is the standard error of the Tx variable doubled in the multivariable model? What is the intuition that the coefficient of the therapeutic effect tends to zero if heterogeneity is not accounted for? Why can’t it tend elsewhere?

f2harrell · June 4, 2021, 5:35pm

This is covered in detail in the ANCOVA chapter in BBR. Briefly, $\hat{\beta}$ increases in absolute value faster than its SE increases upon covariate adjustment.
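A small numeric sketch may help make this concrete. The counts below are hypothetical (they are not from the thread or from BBR): a 1:1 randomized comparison within each level of a strongly prognostic binary covariate Z, with a conditional odds ratio of 3 in both strata. As a stand-in for fitting logistic regressions, the unadjusted estimate uses the collapsed 2x2 table with the Woolf standard error, and the adjusted estimate uses inverse-variance pooling of the stratum-specific log odds ratios. This is an approximation to the model-based analysis, not the method used in BBR.

```python
import math

# Hypothetical counts: (events, non-events) per arm within each stratum of Z.
# Both strata have odds ratio 3; Z strongly shifts the baseline risk.
strata = {
    0: {"treated": (50, 150), "control": (20, 180)},
    1: {"treated": (180, 20), "control": (150, 50)},
}

def log_or_and_var(treated, control):
    """Log odds ratio and its Woolf variance for one 2x2 table."""
    a, b = treated
    c, d = control
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def collapse(arm):
    """Sum (events, non-events) for one arm over the strata of Z."""
    return tuple(sum(s[arm][i] for s in strata.values()) for i in range(2))

# Unadjusted: ignore Z and analyse the collapsed table.
beta_marginal, var_marginal = log_or_and_var(collapse("treated"), collapse("control"))
se_marginal = math.sqrt(var_marginal)

# Adjusted: inverse-variance pooling of the stratum-specific log odds ratios.
per_stratum = [log_or_and_var(s["treated"], s["control"]) for s in strata.values()]
inv_var = [1 / v for _, v in per_stratum]
beta_adjusted = sum(w * b for w, (b, _) in zip(inv_var, per_stratum)) / sum(inv_var)
se_adjusted = math.sqrt(1 / sum(inv_var))

for label, beta, se in [("unadjusted", beta_marginal, se_marginal),
                        ("adjusted for Z", beta_adjusted, se_adjusted)]:
    print(f"{label}: log OR = {beta:.3f}, SE = {se:.3f}, z = {beta / se:.2f}")
```

With these counts the log odds ratio moves from about 0.60 to about 1.10 while the SE moves from about 0.14 to about 0.20, so the z statistic rises (roughly 4.2 to 5.4) even though the SE is larger. The unadjusted estimate is also the one closer to zero, which is the attenuation asked about above.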

s_doi · June 4, 2021, 6:44pm

Thanks, I had not thought about it this way. While I agree with Frank about the lack of utility of such a marginal effect, it does interest me for other reasons.

So to take this further and leave adjustment out for the moment, let us assume there is a binary intervention X, a non-confounding binary third variable Z and a binary outcome Y. With PS weights to balance Z, OR(XY) should also be different from OR(XY) without PS weights so long as Z is prognostic for Y. In other words, if Z was not a confounder, we get a marginal estimate that differs from the one without PS weights simply because the distribution of Z differs between the two scenarios. And if Z was a confounder, we get the unconfounded marginal estimate, and it would also differ from the previous two, right?

albertoca · June 4, 2021, 7:04pm

I think I understand.
It’s interesting because it’s counterintuitive.
Thank you, Frank.

ngreifer · June 4, 2021, 7:24pm

If Z is not a confounder (and assuming it’s not a mediator), OR(XY) with PS weights estimated from Z and OR(XY) with no weights are the same because X and Z are independent (i.e., there is no imbalance to correct). They will both differ from OR(XY|Z), i.e., the OR resulting from a logistic regression of Y on X and Z. Consider the following scenarios in which there is no confounding except possibly by Z:

Z is a confounder:
OR(XY), no weights → marginal OR, confounded
OR(XY), weights adjusting for Z → marginal OR, unconfounded*
OR(XY|Z), with or without weights → conditional OR, unconfounded*

Z is not a confounder or a mediator (but causes the outcome):
OR(XY), with or without weights → marginal OR, unconfounded
OR(XY|Z), with or without weights → conditional OR, unconfounded
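The second scenario can be sketched with a small table. The counts are hypothetical (not from the thread): X is perfectly balanced across Z, Z is strongly prognostic for Y, and the odds ratio is 3 within each stratum. The weights are assumed to be inverse probability of treatment weights targeting the ATE; the helper names `odds_ratio`, `propensity`, and `iptw` are made up for this illustration.

```python
# Hypothetical counts: cells[(z, x)] = (n with Y=1, n with Y=0).
cells = {
    (0, 1): (50, 150), (0, 0): (20, 180),
    (1, 1): (180, 20), (1, 0): (150, 50),
}

def odds_ratio(weight):
    """OR(XY) from the table collapsed over Z, each cell scaled by weight(z, x)."""
    totals = {1: [0.0, 0.0], 0: [0.0, 0.0]}
    for (z, x), (events, non_events) in cells.items():
        w = weight(z, x)
        totals[x][0] += w * events
        totals[x][1] += w * non_events
    return (totals[1][0] / totals[1][1]) / (totals[0][0] / totals[0][1])

def propensity(z):
    """P(X=1 | Z=z), estimated directly from the cell counts."""
    treated = sum(cells[(z, 1)])
    return treated / (treated + sum(cells[(z, 0)]))

def iptw(z, x):
    """Assumed weighting scheme: 1/ps for treated, 1/(1-ps) for controls."""
    p = propensity(z)
    return 1 / p if x == 1 else 1 / (1 - p)

or_crude = odds_ratio(lambda z, x: 1.0)
or_weighted = odds_ratio(iptw)
or_conditional = {
    z: (cells[(z, 1)][0] * cells[(z, 0)][1]) / (cells[(z, 1)][1] * cells[(z, 0)][0])
    for z in (0, 1)
}

print(propensity(0), propensity(1))  # equal, so every weight is the same constant
print(or_crude, or_weighted, or_conditional)
```

Because the propensity score is 0.5 in both strata, every unit gets the same weight and the weighted and unweighted marginal ORs coincide (about 1.83), while the conditional OR is 3 in each stratum.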

The motivation for PS weights in the first place is that the assumptions required for an unconfounded marginal weighted estimate are easier to satisfy than for an unconfounded conditional estimate. No assumptions are required about treatment effect heterogeneity to estimate the marginal OR with weights; whether there is heterogeneity or not (regardless of the scale of measurement), the procedure for estimating the marginal OR with weights is the same. Strong assumptions about heterogeneity are required for the conditional OR to be unbiased (although I think Dr. Harrell would argue that those assumptions are often warranted because it is plausible for the conditional OR to be constant). Causal inference people hate making assumptions about heterogeneity, so they often believe the marginal OR is the best we can do without strong assumptions and lots of data, which is why it is the target of inference in so many methodological papers.

s_doi · June 4, 2021, 8:03pm

Agreed. But what if Z is not a confounder because it is balanced with respect to Y and at the same time prognostic for Y? In this case would OR(XY) with and without PS weights (estimated from Z) not differ?

ngreifer · June 4, 2021, 9:21pm

I’m not exactly sure what you mean. It doesn’t make sense to talk about “balanced with respect to Y”. Balance refers to the distribution of a covariate (i.e., Z) between the treatment groups (i.e., strata of X). If Z is independent of Y, then Z is not prognostic of Y. Regardless, if there is no confounding (by any variable), the simple marginal OR(XY) is equal to the marginal OR(XY) with weights applied (because the weights don’t do anything since there is no confounding to correct). (Note this applies to the population; differences between them may arise due to chance in a sample.)

s_doi · June 4, 2021, 10:49pm

Yes, agree. I think the confusion was that I was looking at Z and Y being unassociated and ignoring the fact that this should be within strata of X for confounding to be absent. Let’s take an example:

Z=0
	Y=1	Y=0
X=1	25	10
X=0	75	90

Z=1
	Y=1	Y=0
X=1	75	50
X=0	25	50

Though Z and Y are marginally unassociated, there is confounding: OR(XY) with no weights is 2.33, OR(XY) with PS weights estimated from Z is 2.95, and OR(XY|Z) is 3. Why they all differ is clear now.
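For readers who want to check the arithmetic, here is a short sketch that recomputes the three odds ratios from the counts in the tables above. It assumes the PS weights are inverse probability of treatment weights targeting the ATE (1/ps for X=1, 1/(1-ps) for X=0), with the propensity score taken as the observed P(X=1 | Z); that assumption reproduces the 2.95 quoted in the post. The variable and function names are made up for the illustration.

```python
# Counts from the tables above: cells[(z, x)] = (n with Y=1, n with Y=0).
cells = {
    (0, 1): (25, 10), (0, 0): (75, 90),
    (1, 1): (75, 50), (1, 0): (25, 50),
}

def stratum_size(z):
    return sum(sum(cells[(z, x)]) for x in (0, 1))

# Z is unassociated with Y marginally, but strongly associated with X.
p_y_given_z = {z: (cells[(z, 1)][0] + cells[(z, 0)][0]) / stratum_size(z) for z in (0, 1)}
p_x_given_z = {z: sum(cells[(z, 1)]) / stratum_size(z) for z in (0, 1)}  # propensity score

def marginal_or(weight):
    """OR(XY) after collapsing over Z, with each cell scaled by weight(z, x)."""
    totals = {1: [0.0, 0.0], 0: [0.0, 0.0]}
    for (z, x), (events, non_events) in cells.items():
        w = weight(z, x)
        totals[x][0] += w * events
        totals[x][1] += w * non_events
    return (totals[1][0] / totals[1][1]) / (totals[0][0] / totals[0][1])

def iptw(z, x):
    """Assumed ATE weights built from the observed propensity score."""
    p = p_x_given_z[z]
    return 1 / p if x == 1 else 1 / (1 - p)

or_crude = marginal_or(lambda z, x: 1.0)
or_iptw = marginal_or(iptw)
or_by_stratum = {
    z: (cells[(z, 1)][0] * cells[(z, 0)][1]) / (cells[(z, 1)][1] * cells[(z, 0)][0])
    for z in (0, 1)
}

print(f"P(Y=1|Z): {p_y_given_z}")
print(f"P(X=1|Z): {p_x_given_z}")
print(f"crude OR = {or_crude:.2f}, weighted OR = {or_iptw:.2f}, stratum ORs = {or_by_stratum}")
```

The crude OR is 7/3 (2.33), the weighted marginal OR is 115/39 (2.95), and the OR is exactly 3 in each stratum of Z. P(Y=1|Z) is 0.5 in both strata, while P(X=1|Z) is 0.175 versus 0.625, which is the imbalance the weights correct.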