Propensity score model for observational study with few samples

ckogan · August 24, 2018, 6:09pm

I have an observational study with a binary outcome (death in clinic vs. survival until release), and I want to assess the effect of one treatment vs. a control. There are about 25 cases of the treatment (fewer than the number of cases in the control). If I follow the rule of thumb that the number of predictors in a logistic regression should be no more than m / 10 = 25/10, I can consider 2 predictors. Using a propensity score would let me include both the propensity score and the treatment.
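The arithmetic behind that rule of thumb can be sketched as follows. This is only an illustration: the function name `max_predictors` and the default of 10 per predictor are assumptions made here, and m is taken to be the limiting sample size (for a binary outcome, usually the smaller of the number of events and non-events).

```python
def max_predictors(m, per_predictor=10):
    """Rough predictor budget under the m / 10 rule of thumb.

    m is the limiting sample size. per_predictor=10 is the conventional
    divisor, not a hard limit.
    """
    return m // per_predictor  # round down to a whole number of predictors


budget = max_predictors(25)  # 25 // 10 = 2

# The reduced model Y = treat + log(PS/(1-PS)) uses two predictors,
# so it just fits within this budget.
reduced_model_terms = ["treat", "logit_ps"]
fits_budget = len(reduced_model_terms) <= budget
```

Under this rule, any nonlinear terms in the logit of the propensity score or extra prognostic variables would push the outcome model past the budget.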

However, the book Biostatistics for Biomedical Research (Harrell, Slaughter) suggests modeling:

Y = treat + log(PS/(1-PS)) + nonlinear functions of log(PS/(1-PS)) + important prognostic variables

In cases where there is not all that much data, is it better to just use the following model?

Y = treat + log(PS/(1-PS))

PaulBrownPhD · August 24, 2018, 6:39pm

I would just mention this recent paper by Maarten van Smeden et al. on events per variable: "Sample size for binary logistic prediction models: Beyond events per variable criteria".

ADAlthousePhD · August 25, 2018, 9:38am

Two things confuse me about the initial question:

1. When you say 25 “cases” in the treatment, do you mean 25 total patients or 25 deaths among XX total patients? Please state more clearly the total number of patients in the treatment vs. control group, as well as the number of deaths in each.

2. You’ve kinda mentioned propensity scores without much information as to why you’re using them or how you’re computing the PS. A little more detail on what you’re doing would be useful.

ckogan · August 27, 2018, 4:51pm

When I say 25 cases, I mean 25 deaths; 67 did not die. There are 69 control cases and 23 treatment cases.
I’m looking at section 17.1 of Biostatistics for Biomedical Research (Harrell, Slaughter) (hbiostat.org/doc/bbr.pdf). Harrell suggests that propensity scores can be used to adjust for nonrandom treatment selection when there is limited data compared with the number of covariates.

The propensity score would be computed via a logistic regression model of Treatment ~ Risk_Variables. The logit of the propensity score would then be included in a logistic regression of Survival_Outcome ~ treatment + log(PS/(1-PS)).
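The two-stage procedure described above could be sketched roughly as follows. Everything specific here is an assumption for illustration: the data are simulated (only the total of 92 subjects mirrors the thread), the three risk variables and their coefficients are hypothetical, and `fit_logistic` is a small hand-rolled Newton-Raphson fitter with a tiny ridge term for numerical stability, not the routine of any particular statistics package. It returns point estimates only, with no standard errors.

```python
import numpy as np


def fit_logistic(X, y, n_iter=50, ridge=1e-4):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p) - ridge * beta
        hess = (X * w[:, None]).T @ X + ridge * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta


rng = np.random.default_rng(0)
n = 92                          # same total as in the thread; values are simulated
risk = rng.normal(size=(n, 3))  # three hypothetical risk variables

# Simulated treatment assignment and outcome (hypothetical coefficients).
p_treat = 1 / (1 + np.exp(-(-1.1 + 0.8 * risk[:, 0])))
treat = rng.binomial(1, p_treat)
p_death = 1 / (1 + np.exp(-(-1.0 + 0.7 * risk[:, 0] + 0.5 * risk[:, 1])))
death = rng.binomial(1, p_death)

# Stage 1: Treatment ~ Risk_Variables gives the propensity score (PS).
X_ps = np.column_stack([np.ones(n), risk])
ps = 1 / (1 + np.exp(-X_ps @ fit_logistic(X_ps, treat)))
logit_ps = np.log(ps / (1 - ps))

# Stage 2: Survival_Outcome ~ treatment + log(PS/(1-PS)).
X_out = np.column_stack([np.ones(n), treat, logit_ps])
beta_out = fit_logistic(X_out, death)
treat_log_odds_ratio = beta_out[1]  # coefficient on treatment
```

The outcome model has only two predictors (treatment and the logit of the PS), however many risk variables enter stage 1. A real analysis would use an established statistics package to get standard errors and a test of the treatment coefficient.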

The main purpose of the study is to test for a treatment effect.