Hi there,

I’m currently working on a research project in which we’re developing a predictive algorithm for the **Health Utilities Index Mark 3** (HUI3). HUI3 is a continuous health-related quality-of-life measure whose range is the closed interval [-0.36, 1]. A value of 1 represents perfect health, a value of 0 represents death, and negative values down to -0.36 represent states worse than death. A couple of points to note when modelling HUI3:

- It’s highly left-skewed (most of the mass sits near the top of the scale, with a long tail towards lower values), with peaks at 1 and 0.97
- Its range is bounded between -0.36 and 1

I was thinking of using a beta regression model for the conditional mean of HUI3, fitted with the **betareg** package in R.

- I was going to use the transformation mentioned in the betareg paper to map HUI3 into the open interval (0, 1). Since that formula assumes values already in [0, 1], I would first rescale HUI3 as y′ = (y + 0.36)/1.36 and then apply (y′(n − 1) + 0.5)/n, where n is the sample size (Reference: Smithson M, Verkuilen J (2006). "A Better Lemon Squeezer? Maximum-Likelihood Regression with Beta-Distributed Dependent Variables." Psychological Methods, 11(1), 54–71.)
- I was going to use the log-log link function because of the large number of values close to 1 (they recommend this in the paper)
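
To make the transformation concrete, here is a minimal sketch of the arithmetic as I understand it. It is in Python purely for illustration (the modelling itself would be done in R), and the function names are my own, not part of any package. It assumes the squeeze formula expects values already in [0, 1], so HUI3 is first rescaled from [-0.36, 1], and it includes a back-transformation for getting predictions back onto the HUI3 scale.

```python
# Hypothetical helper names; bounds taken from the HUI3 range stated above.
HUI3_MIN, HUI3_MAX = -0.36, 1.0


def rescale_to_unit(y, lo=HUI3_MIN, hi=HUI3_MAX):
    """Linearly map a raw score from [lo, hi] onto the closed interval [0, 1]."""
    if not lo <= y <= hi:
        raise ValueError(f"score {y} is outside [{lo}, {hi}]")
    return (y - lo) / (hi - lo)


def squeeze(u, n):
    """Smithson-Verkuilen squeeze: pull [0, 1] into the open interval (0, 1)."""
    return (u * (n - 1) + 0.5) / n


def transform_hui3(scores):
    """Rescale, then squeeze, using the sample size n = len(scores)."""
    n = len(scores)
    return [squeeze(rescale_to_unit(y), n) for y in scores]


def back_transform(z, n, lo=HUI3_MIN, hi=HUI3_MAX):
    """Invert the squeeze and the rescaling (needs n > 1)."""
    if n <= 1:
        raise ValueError("back-transformation needs a sample size greater than 1")
    u = (z * n - 0.5) / (n - 1)
    return lo + u * (hi - lo)


if __name__ == "__main__":
    raw = [-0.36, 0.0, 0.97, 1.0]  # made-up scores, for illustration only
    transformed = transform_hui3(raw)
    print(transformed)
    print([back_transform(z, len(raw)) for z in transformed])
```

One thing this makes visible: the amount of squeezing depends on n, so the same raw score maps to slightly different values in samples of different sizes.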

Some points that I’m concerned/thinking about:

- What assumptions do I need to test for in a beta regression model and how do I test them?
- Should I use a variable or a fixed precision parameter, and how would I go about deciding between them?

Do you have any thoughts, or have you worked with HUI3 before?

Thanks for your help!