Hi there,

I’m trying to analyse an RCT. It’s commercially sensitive, so I’ll have to talk abstractly, I’m afraid. My outcomes are two psychometric scores (0-100), but I’ll just talk about one; call it Y. Y is measured before the intervention (Y_1) and after the intervention (Y_2). We expect changes to be small (\Delta Y = 5-10).

The research question is ‘does the intervention (C) affect the increase in Y relative to the placebo?’

There are two problems I’m having.

**Problem 1: floor effects**

There are floor effects (i.e. a large density of observations at 0), which *I suspect* are due to censoring: the measurement scale is insufficient to capture the true variation in the underlying construct being measured.

Obviously a linear model will not suffice. I’ve tried a Tobit model

(X = other covariates, C is placebo/treatment), but that only deals with a censored outcome. I need to at least adjust for Y_1, which is itself censored. The residuals are also very highly associated with Y_1. For similar reasons, I suspect quantile regression is out as well.
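For concreteness, here is a minimal sketch of a left-censored Tobit log-likelihood, assuming censoring at 0 and normal errors. It is not the model actually fitted in the study: the function name `tobit_loglik`, the coefficient values and the toy rows are all made up for illustration. Note that Y_1 enters as an ordinary covariate at its observed (floored) value, which is exactly the limitation described above.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for its pdf and cdf


def tobit_loglik(beta, sigma, rows, lower=0.0):
    """Log-likelihood of a Tobit model left-censored at `lower`.

    rows: iterable of (y, x), where x is a list of covariate values
    (including a leading 1.0 for the intercept) matching `beta`.
    """
    total = 0.0
    for y, x in rows:
        mu = sum(b * xi for b, xi in zip(beta, x))  # linear predictor
        if y <= lower:
            # Censored observation: contributes P(latent Y <= lower).
            total += log(STD_NORMAL.cdf((lower - mu) / sigma))
        else:
            # Uncensored observation: contributes the normal density.
            total += log(STD_NORMAL.pdf((y - mu) / sigma) / sigma)
    return total


# Toy rows: (Y_2, [intercept, C, Y_1]); values are invented, not study data.
rows = [
    (0.0, [1.0, 0.0, 0.0]),
    (12.0, [1.0, 1.0, 8.0]),
    (30.0, [1.0, 0.0, 27.0]),
    (0.0, [1.0, 1.0, 3.0]),
]
ll = tobit_loglik(beta=[1.0, 2.0, 1.0], sigma=10.0, rows=rows)
print(ll)
```

In practice this function would be handed to a numerical optimiser to estimate `beta` and `sigma`; the sketch only shows how censored and uncensored observations contribute differently.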

The only other way I can see is to impute the censored values of Y_1 and then use the imputed values in Y_2 \sim Y_1 ...
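The imputation idea could look roughly like the sketch below, assuming the latent Y_1 is normal with mean `mu` and standard deviation `sigma` (hypothetical values here; in practice they would come from a censored model fitted to Y_1). Each floored value is replaced by a draw from that normal truncated above at 0, using the inverse-CDF method. The function name and the toy scores are illustrative only.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and inv_cdf


def impute_left_censored(y1, mu, sigma, lower=0.0, rng=None):
    """Replace values at or below `lower` with draws from
    Normal(mu, sigma) truncated to lie at or below `lower`."""
    rng = rng or random.Random()
    # Probability mass of the latent distribution below the floor.
    p_cens = STD_NORMAL.cdf((lower - mu) / sigma)
    imputed = []
    for y in y1:
        if y > lower:
            imputed.append(y)  # observed value, keep as-is
        else:
            # Uniform draw on (0, p_cens], mapped through the inverse CDF.
            u = p_cens * (1.0 - rng.random())
            imputed.append(mu + sigma * STD_NORMAL.inv_cdf(u))
    return imputed


# Toy baseline scores; zeros are treated as censored at the floor.
y1 = [0.0, 7.0, 0.0, 23.0]
imputed = impute_left_censored(y1, mu=5.0, sigma=10.0, rng=random.Random(1))
print(imputed)
```

This is a single stochastic draw per censored value; it ignores covariates and the uncertainty in `mu` and `sigma`, so it is only a starting point.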

*Question: Do you have any suggestions beyond imputing Y_1?*

**Problem 2: high drop-off**

The study was done online with a target sample size of 4000. The recruiter kept recruiting until that number was met, but a large number (2000) only partially completed the study. So 6000 were recruited but only 4000 completed. The reasons for drop-off aren’t clear, but it is definitely not random.

With so many partial cases, imputing the outcomes is not feasible.

My question is: do you think this recruitment procedure biases the results, and how can I contextualize the results? My thoughts are yes: the complete cases are no longer a random sample, and this will heavily influence the result. Just reporting the drop-off in a flow chart seems disingenuous given its size.

Thanks in advance.