Propensity score matching vs multivariable regression

As far as I know (correct me if I am wrong), neither propensity score matching (PSM) nor multivariable regression has a clear advantage over the other in a survival analysis. I have read that PSM may be somewhat more appropriate when linearity cannot be assumed; however, I am not sure how important this really is in survival analysis. Finally, I have read that one problem with PSM is that collapsing the effect of multiple variables into a single score can generate unnoticed imbalances that affect the analysis. It would be really nice if someone with experience of this kind of problem could illustrate, in a practical way, the possible differences between the approaches.
For me, one of the advantages of multivariable regression is that I can easily work with databases that have missing values, e.g. with multiple imputation, but I’m not so clear on how to handle missing data with PSM. Thus I’m more comfortable with regression, but I’d like to ask the experts what they think about it.
Thank you.

In general, propensity score matching does not compete well with regression adjustment, for the reasons you gave and also because of the greatly reduced sample size that results from discarding good matches that are “unneeded”. I discuss this at some length in BBR - just search for propensity in the pdf.


I am also confused by a similar question. A statistician reviewer once suggested using IPTW PS adjustment instead of direct multivariable adjustment. What do you think of it?


So instead of stratifying by quintiles of ps, is something like this better? How do you select the non-linear terms, then?

cph (sur ~ treatment + log (ps/1-ps) + log (ps/1-ps)^2 + covariates…)???

Assuming ps is the estimated treatment probability, log(ps/(1-ps)) (note the parentheses around 1-ps) is the linear predictor (lp) of the propensity model, which you can get directly from predict(). One suggestion is to adjust for the lp using a restricted cubic spline:

cph(sur ~ treatment + rcs(lp) ...)

(although I would prefer covariate adjustment, if possible, for reasons given here)
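
As a minimal sketch of where lp comes from: the linear predictor of a logistic propensity model is the logit of the fitted treatment probability. The simulated data, the names x1 and x2, and the use of base R's glm() rather than an rms fit are all assumptions made for illustration.

```r
# Simulated data; x1, x2 and the coefficients are made up for illustration.
set.seed(1)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
treatment <- rbinom(n, 1, plogis(0.8 * x1 - 0.5 * x2))

# Propensity model: probability of treatment given the covariates.
ps_fit <- glm(treatment ~ x1 + x2, family = binomial)

ps <- fitted(ps_fit)                  # estimated treatment probability
lp <- predict(ps_fit, type = "link")  # linear predictor (log odds of treatment)

# The two routes to the logit should agree up to rounding error.
all.equal(unname(lp), unname(qlogis(ps)))
```

Here qlogis(ps) is just log(ps / (1 - ps)), so either form can be passed to rcs() in the outcome model.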


If you are seeking a low precision low power method then weighted estimates are for you :)


If you really, really want to use PS, meaning that your sample size is not large enough to support putting all of the covariates into the model, then adjust for key variables and a spline of the logit of PS:
cph(sur ~ treatment + x1 + rcs(x2, 4) + rcs(x3, 5) + rcs(qlogis(ps), 4))
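
A rough end-to-end sketch of that two-step approach, on simulated data, might look like the following. The data-generating model, the covariate names x1 to x5, and the choice to estimate the PS from all five covariates are assumptions for illustration, not something specified in the post.

```r
library(survival)  # Surv()
library(rms)       # lrm(), cph(), rcs()

# Simulated cohort; all names and effect sizes are hypothetical.
set.seed(2)
n <- 500
d <- data.frame(x1 = rbinom(n, 1, 0.4), x2 = rnorm(n), x3 = rnorm(n),
                x4 = rnorm(n), x5 = rnorm(n))
d$treatment <- rbinom(n, 1, plogis(0.6 * d$x1 + 0.5 * d$x2 - 0.4 * d$x4))
event_time  <- rexp(n, rate = 0.1 * exp(-0.5 * d$treatment + 0.4 * d$x2 + 0.3 * d$x5))
cens_time   <- runif(n, 5, 30)
d$time      <- pmin(event_time, cens_time)
d$status    <- as.integer(event_time <= cens_time)

# Step 1: propensity model (assumed here to use every measured covariate).
ps_fit <- lrm(treatment ~ x1 + x2 + x3 + x4 + x5, data = d)
d$lps  <- predict(ps_fit, type = "lp")  # logit of the estimated PS

# Step 2: Cox model with key covariates entered directly,
# plus a restricted cubic spline of the logit PS.
fit <- cph(Surv(time, status) ~ treatment + x1 + rcs(x2, 4) + rcs(x3, 5) + rcs(lps, 4),
           data = d)

coef(fit)["treatment"]  # adjusted log hazard ratio for treatment
```

Storing the logit as its own column (lps) rather than writing rcs(qlogis(ps), 4) inline is only a convenience for later plotting and subsetting.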


A good reference on this:

“Stratification for the propensity score compared with linear regression techniques to assess the effect of treatment or exposure”, Senn et al. (2007), Statistics in Medicine


I am not an expert but there are instances in which propensity score matching and multivariable regression adjustment provide very similar results. See here:

If you use propensity score matching I can understand why this is a problem, but what if you use propensity score adjustment or weighting?

One advantage of using propensity score matching or weighting is that the matching and weighting approaches let you deal with covariate imbalances during the “design phase” of the study. Once you are satisfied with the degree of balance in the matched or weighted sample, you can proceed to the analysis phase (which may or may not include further adjustment via multiple regression). Ho et al. (2007) and Harder et al. (2010) have nice discussions of this.
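
To make the “design phase” idea concrete, here is a small sketch that checks balance on one covariate before and after inverse probability of treatment weighting. The standardized mean difference as the balance metric, the simulated covariate x, and the helper smd() are assumptions for illustration; the post does not prescribe a particular diagnostic.

```r
# Simulated data with one confounder; everything here is hypothetical.
set.seed(3)
n <- 2000
x <- rnorm(n)
treatment <- rbinom(n, 1, plogis(x))
ps <- fitted(glm(treatment ~ x, family = binomial))

# Inverse probability of treatment weights (average treatment effect form).
w <- ifelse(treatment == 1, 1 / ps, 1 / (1 - ps))

# Standardized mean difference of x between arms, optionally weighted.
smd <- function(x, treatment, w = rep(1, length(x))) {
  m1 <- weighted.mean(x[treatment == 1], w[treatment == 1])
  m0 <- weighted.mean(x[treatment == 0], w[treatment == 0])
  # Pool the unweighted arm variances so both SMDs share one denominator.
  s  <- sqrt((var(x[treatment == 1]) + var(x[treatment == 0])) / 2)
  (m1 - m0) / s
}

smd_raw <- smd(x, treatment)     # imbalance in the original sample
smd_wtd <- smd(x, treatment, w)  # imbalance after weighting
c(raw = smd_raw, weighted = smd_wtd)
```

The outcome is never touched in this step, which is the point of separating design from analysis.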

“If you are seeking a low precision low power method then weighted estimates are for you”

Does IPTW PS analysis require observations to be discarded (like matching does)? If not, I don’t see why precision and power would be lower than with a multiple regression model.

Weighting, whether in a sample survey situation or in the present situation, worsens the precision of estimates. Think of it this way: to upweight a stratum means you have to downweight other strata, which is similar to lowering their sample size. For example, suppose your sample had 200 males and 800 females and you wanted (for some strange reason) an estimate of a population mean for a population with a sex ratio of 1:1. The weighting will effectively ignore some of the 800 perfectly good observations on females.
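
The loss in that example can be put into numbers with Kish's approximate effective sample size, (sum of weights)^2 / (sum of squared weights). This formula is not part of the post; it is used here as an assumption, and it presumes the outcome variance is the same in both sexes.

```r
# 200 males and 800 females, reweighted so each sex contributes half of the total.
n_male   <- 200
n_female <- 800
w <- c(rep(500 / n_male,   n_male),    # each male counts 2.5 times
       rep(500 / n_female, n_female))  # each female counts 0.625 times

# Kish's approximate effective sample size for a weighted mean.
n_eff <- sum(w)^2 / sum(w^2)
n_eff  # 640
```

Under these assumptions the 1000 observations carry about as much information as 640 equally weighted ones.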


I was thinking of an idea that I suppose is absurd, but I really don’t know the answer. When I’ve done propensity score matching, for example through R’s MatchIt package, you basically get a sample that is a subset of the original data, matched by the propensity score. After that, you can make an estimate for example:
cph(sur ~ treatment + x1 + rcs(x2, 4) + rcs(x3, 5) + rcs(qlogis(ps), 4))
My question, sorry if it’s stupid, is this. Since I use the propensity score as a covariate, and this variable “collapses” a large amount of information from many other covariates, how does the estimate of the treatment effect change if I use the entire original sample rather than just the subset matched by the propensity score?
That is, if I am really adjusting for the effect of confounding factors through the propensity score, I would like to know why matching on the propensity score is required in addition to that.
Please, Frank, forgive me if this question is utterly stupid.

It’s not stupid. If you are going to do regression adjustment using a spline of logit ps, matching plays no role. Just be sure to show high-resolution back-to-back histograms of ps stratified by treatment to see if there is a region of no overlap in terms of absolute frequencies. This might make you want to do the regression on a subset with adequate overlap. Even then, no overlap at all is OK if you are willing to make a strong no-interaction assumption.
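
A numeric counterpart to those histograms is a table of absolute frequencies per narrow bin of logit ps and treatment arm. In this sketch the simulated data, the bin width of 0.25 and the threshold min_n are arbitrary choices for illustration.

```r
# Simulated data with strong confounding so that overlap is limited in the tails.
set.seed(5)
n <- 400
x <- rnorm(n)
treatment <- rbinom(n, 1, plogis(1.5 * x))
ps  <- fitted(glm(treatment ~ x, family = binomial))
lps <- qlogis(ps)

# Narrow bins on the logit scale give a high-resolution view.
breaks <- seq(floor(min(lps)), ceiling(max(lps)), by = 0.25)
bins   <- cut(lps, breaks = breaks, include.lowest = TRUE)

# Absolute frequencies per bin and treatment arm.
counts <- table(bins, treatment)

# Flag bins where both arms have at least min_n patients (min_n is arbitrary).
min_n   <- 5
overlap <- counts[, "0"] >= min_n & counts[, "1"] >= min_n
keep    <- unname(overlap[as.integer(bins)])

range(lps[keep])  # region of adequate overlap on the logit scale
```

For the picture itself, Hmisc::histbackback(split(lps, treatment)) draws the back-to-back histogram; the keep vector could then define the subset used for the regression.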

I think it’s beautiful, even aesthetically.

And what happens if we don’t make that strong assumption but instead fit
cph(sur ~ treatment + x1 + x2 + rcs(qlogis(ps)) + treatment * rcs(qlogis(ps)))
in a setting with no overlap of propensity scores and no matching at all?

That’s a great question. Minor point: the correct formula would be sur ~ treatment * rcs(qlogis(ps)) + x1 + x2, since the * operator already includes both main effects. Thinking this through really helps with discovering the truly important assumptions in play. If there is little overlap in the ps distribution across treatments, the treatment effect as a function of ps will have huge confidence bands. If there is no overlap at all, the confidence bands will be “huger”.
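
One way to look at those confidence bands is to estimate the treatment log hazard ratio at several values of logit ps with rms::contrast(). The sketch below uses simulated data with strong confounding; the covariate names, the evaluation points and the data-generating model are assumptions for illustration.

```r
library(survival)  # Surv()
library(rms)       # lrm(), cph(), rcs(), datadist(), contrast()

# Simulated cohort; all names and effect sizes are hypothetical.
set.seed(4)
n <- 600
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$treatment <- rbinom(n, 1, plogis(1.5 * d$x1 + 1.0 * d$x3))
event_time  <- rexp(n, rate = exp(-0.5 * d$treatment + 0.4 * d$x1 + 0.3 * d$x2))
cens_time   <- runif(n, 0.5, 6)
d$time      <- pmin(event_time, cens_time)
d$status    <- as.integer(event_time <= cens_time)

# Logit of the estimated propensity score.
d$lps <- predict(lrm(treatment ~ x1 + x2 + x3, data = d), type = "lp")

# rms needs the predictor distributions to fill in x1 and x2 for contrast().
dd <- datadist(d[, c("treatment", "lps", "x1", "x2")])
options(datadist = "dd")

# Treatment effect allowed to vary smoothly with logit PS.
fit <- cph(Surv(time, status) ~ treatment * rcs(lps, 4) + x1 + x2, data = d)

# Log hazard ratio (treated vs. untreated) at low, median and high logit PS.
grid <- unname(quantile(d$lps, c(0.02, 0.5, 0.98)))
k <- contrast(fit, list(treatment = 1, lps = grid),
                   list(treatment = 0, lps = grid))
data.frame(lps = grid, log_hr = k$Contrast, ci_width = k$Upper - k$Lower)
```

With data like these, one would expect the intervals at the extremes of logit ps, where one arm is sparse, to be wider than at the median, though the exact widths depend on the simulated sample.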

Screwed no matter what we do with propensity scores?
Then better recruit more patients and use multivariable regressions, if possible.
Intuitively it seems better to me because PSM is something conceptually more abstract, difficult to explain to my neighbor in the elevator, for example…

My own opinion: PS is a useful supplemental adjustment variable when the effective sample size is not at least 5 times the number of possible measured confounders. Use direct covariate adjustment for the k pre-specified important predictors, where k is the maximum allowable for the effective sample size, and adjust for a spline of the logit of PS for the remaining potential confounders.