First approach to the rmsb package - issues

The simulation result would do nothing more than reflect the true model that was used to simulate its data.

Can’t you simulate random continuous versus binary variables and test whether there are more deviations from PO in one case or the other? My guess is that when you use quadratic terms or splines, the PO assumption fails much more often.

No, please consider my earlier point: whatever bias you put into the simulation model will be revealed by such a simulation. Not really worth doing IMHO.

So how can you investigate whether continuous variables modeled as splines or quadratic terms generate more PO violations? Don’t you share this intuition?

You would have to assemble lots of real datasets to study that properly. My intuition is that it does not matter whether a variable is continuous or categorical, in terms of likelihood of a PO violation. What matters more is the strength of the predictor, as violations of assumptions typically matter more for stronger predictors.

I have read the entire Chapter 7. I loved it; a whole new world to learn. I guess the compareBmods function gives you weights based on the WAIC difference of each model from the best model (the formula on page 229 of McElreath’s book), is that it? These weights indicate the probability that each model is the best, is that right?
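
For reference, the page-229 formula gives model $i$ the weight

$$w_i = \frac{\exp\!\left(-\tfrac{1}{2}\,\Delta\mathrm{WAIC}_i\right)}{\sum_{j=1}^{m} \exp\!\left(-\tfrac{1}{2}\,\Delta\mathrm{WAIC}_j\right)}$$

where $\Delta\mathrm{WAIC}_i$ is the difference between model $i$’s WAIC and the smallest WAIC among the $m$ models.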

That is my understanding. But look at the code underneath compareBmods to check.
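
For example, since the function is exported you can print its source at the console:

rmsb::compareBmods   # typing a function name in R prints its source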

In the previous example I compared two models: an additive one with a treatment + age formula, versus one with a treatment * age interaction. In principle, compareBmods favours the additive model without the interaction. However, WAIC compares predictive accuracy, and here I do not want to predict; I want to make inferences about the therapeutic effect in each age range. For that I necessarily have to include the interaction, even though the most likely data-generating process probably does not contain it. How do you resolve this dissonance?
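
Concretely, what I ran looked roughly like this (y, treatment, age and the data frame d are illustrative names, not my actual data):

require(rmsb)
f_add <- blrm(y ~ treatment + age, data=d)   # additive model
f_int <- blrm(y ~ treatment * age, data=d)   # adds the treatment x age interaction
compareBmods(f_add, f_int)                   # relative weights for the two fits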

I have a question.
When I use the predict function I get negative probabilities for some levels of the endpoint.
As a result the cumulative probability, for example the probability that quality of life is at least y, Pr(Y >= y), does not decrease monotonically as y increases. For example:

> predict(bs_final, newdata, type='fitted.ind')
          y x   Mean  Lower  Upper
1     HE6=0 1  0.624  0.131  0.985
2   HE6=8.3 1 -0.094 -0.177 -0.005
3  HE6=16.7 1 -0.053 -0.174  0.068
4    HE6=25 1 -0.118 -0.202 -0.013
5  HE6=33.3 1  0.035 -0.060  0.144
6  HE6=41.7 1 -0.118 -0.212 -0.020
7    HE6=50 1  0.172  0.061  0.290
8  HE6=58.3 1 -0.050 -0.136  0.045
9  HE6=66.7 1  0.043 -0.044  0.129
10   HE6=75 1 -0.024 -0.106  0.062
11 HE6=83.3 1  0.180  0.063  0.289
12 HE6=91.7 1 -0.009 -0.094  0.071
13  HE6=100 1  0.411  0.054  0.774
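
Summing the Mean column from the highest level downwards shows that the implied Pr(Y >= y) is not monotone (quick check on the values above):

m <- c(0.624, -0.094, -0.053, -0.118, 0.035, -0.118, 0.172,
       -0.050, 0.043, -0.024, 0.180, -0.009, 0.411)
rev(cumsum(rev(m)))   # implied Pr(Y >= y); should be non-increasing in y but rises at several levels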

Provide the simplest possible reproducible example using self-generated data and I’ll get to work quickly on this worrisome problem.
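
Something as small as this would do (purely illustrative; not the actual reproducible example):

require(rmsb)
set.seed(1)
n <- 200
x <- rnorm(n)
y <- cut(x + rnorm(n), 13, labels=FALSE)   # 13-level ordinal outcome, like HE6
d <- data.frame(x, y)
f <- blrm(y ~ x, data=d)
predict(f, data.frame(x=1), type='fitted.ind')   # look for negative probabilities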

Done on GitHub, thanks.

Quick question about the rmsb package. Is it possible to get the partial effect in logistic regression as an estimated probability (i.e., ggplot(Predict(model, fun=plogis))) as in rms?

Yes, it is also possible to do that.

I am reproducing the binary logistic regression example at https://hbiostat.org/R/rms/blrm.html and I keep getting this error:

Error in predict.blrm(fit, settings, type = "lp", fun = fun, funint = funint, :
  specifying fun= makes no sense with a binary response

Let’s wait for Professor Harrell’s response.

In that context the fun argument is only for complex functions that are not simple transformations (e.g., the expit, or inverse logit), i.e., for converting a proportional odds model prediction to a predicted mean or quantile. Posterior distributions of simple functions such as the expit are already very easy to get.
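
For instance, something along these lines (a sketch only; it assumes the blrm fit stores its posterior draws in fit$draws with the intercept in the first column, and that predict(..., type='x') returns the design matrix as in rms; check str(fit) in your rmsb version):

X  <- predict(fit, newdata, type='x')             # design matrix row(s) for newdata
lp <- fit$draws[, 1] + fit$draws[, -1] %*% t(X)   # posterior draws of the linear predictor
p  <- plogis(lp)                                  # posterior draws of Pr(Y = 1)
quantile(p, c(0.025, 0.5, 0.975))                 # posterior median and 0.95 interval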

Thank you. It’s just that I saw it used in the Bayesian analysis of the Titanic case study, so I was wondering if I was doing something wrong.

Predict() takes fun=plogis etc. State the command you are trying to use.

I was reproducing the binary logistic regression example at
https://hbiostat.org/R/rms/blrm.html.

require(rmsb)   # also loads rms and Hmisc
getHdata(support)
dd <- datadist(support); options(datadist='dd')
bsup <- blrm(hospdead ~ dzgroup + rcs(crea, 5) + rcs(meanbp, 5),
             data=support, file='bsup.rds')
ggplot(Predict(bsup, fun=plogis))

I get this error:

Error in predict.blrm(fit, settings, type = "lp", fun = fun, funint = funint, :
  specifying fun= makes no sense with a binary response

I don’t see Predict(bsup, fun=plogis) in the script, but you are right that this is a bug. I’ll fix it in the next release to CRAN. Once that fix is made you’ll need to use Predict(..., fun=..., funint=FALSE) to handle simple functions (functions other than Mean, Quantile, ExProb). If you use Linux (only) you can get a new binary rmsb package here.
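
So after the fix, the call above becomes:

ggplot(Predict(bsup, fun=plogis, funint=FALSE))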