First approach to the rmsb package - issues

f2harrell · September 21, 2020, 6:43pm

The simulation result would do nothing more than reflect the true model that was used to simulate its data.

albertoca · September 21, 2020, 7:39pm

Can’t you simulate random continuous versus binary variables and test whether there are more deviations from the PO in one case or the other? My guess is that when you use quadratic terms or splines, the PO fails much more.

f2harrell · September 22, 2020, 12:49pm

No please consider my earlier point. Whatever bias you put into the simulation model will be revealed by such simulation. Not really worth doing IMHO.

albertoca · September 22, 2020, 12:59pm

So how can you investigate if continuous variables modeled as splines or quadratic terms generate more PO violations? Don’t you have this intuition?

f2harrell · September 22, 2020, 1:09pm

You would have to assemble lots of real datasets to study that properly. My intuition is that it does not matter whether a variable is continuous or categorical, in terms of likelihood of a PO violation. What matters more is the strength of the predictor,as violations of assumptions typically matter more for stronger predictors.

albertoca · October 4, 2020, 7:41pm

I have read the entire chapter 7. I loved it, a new world to learn these years. I guess the compareBmods function gives you weights based on the WAIC differences of each model with the best model (formula in page 229 in the McElreath’s book), is that it? These weights indicate the probability that each model is the best, is that right?

f2harrell · October 5, 2020, 11:47am

That is my understanding. But look at the code underneath compareBmods to check.

albertoca · October 5, 2020, 11:58am

In the case of the previous example, I have compared two models, one additive with a treatment + age formula, versus one with an interaction with a treatment * age formula. In principle, compareBmods favours the additive model without interaction. However, WAICs compare the accuracy of the prediction. Here I do not want to predict, but make inferences about the values of the therapeutic effect in each age range. For this I have to include the interaction necessarily even though the most likely data generation process likely does not. How do you solve this dissonance?

albertoca · October 21, 2020, 8:33pm

I have a doubt.
When I use the predict function I get negative probabilities for some levels of the end point.
This determines that the cumulative probability, for example, the probability of quality of life of at least y (Pr >=y), does not decrease monotonically when increasing y. For example:

> predict(bs_final,newdata,  type='fitted.ind') 
          y x   Mean  Lower  Upper
1     HE6=0 1  0.624  0.131  0.985
2   HE6=8.3 1 -0.094 -0.177 -0.005
3  HE6=16.7 1 -0.053 -0.174  0.068
4    HE6=25 1 -0.118 -0.202 -0.013
5  HE6=33.3 1  0.035 -0.060  0.144
6  HE6=41.7 1 -0.118 -0.212 -0.020
7    HE6=50 1  0.172  0.061  0.290
8  HE6=58.3 1 -0.050 -0.136  0.045
9  HE6=66.7 1  0.043 -0.044  0.129
10   HE6=75 1 -0.024 -0.106  0.062
11 HE6=83.3 1  0.180  0.063  0.289
12 HE6=91.7 1 -0.009 -0.094  0.071
13  HE6=100 1  0.411  0.054  0.774

f2harrell · October 22, 2020, 11:32am

Provide the simplest possible reproducible example using self-generated data and I’ll get to work quickly on this worrisome problem.

albertoca · October 22, 2020, 1:36pm

Done in github, thanks.

pabloavila · December 30, 2020, 6:07pm

Quick question about rmsb package. Is it possible to get the partial effect in logistic regression as estimated probability (i.e: ggplot(Predict(model, fun=plogis) ) as in rms?

albertoca · December 30, 2020, 6:09pm

Yes, it is also possible to do that.

pabloavila · December 30, 2020, 6:14pm

I am reproducing the example of binary logistic regression in https://hbiostat.org/R/rms/blrm.html
I keep getting this error:

Error in predict.blrm(fit, settings, type = “lp”, fun = fun, funint = funint, :
specifying fun= makes no sense with a binary response

albertoca · December 30, 2020, 6:15pm

Let’s wait to professor Harrell response.

f2harrell · December 30, 2020, 7:50pm

In that context the fun argument is only for complex functions that are not just simple transformations (e.g., expit or inverse logit), i.e., for converting a proportional odds model prediction to a predicted mean or quantile. Posterior distributions for simple functions such as expit are already very easy to get.

pabloavila · December 30, 2020, 9:02pm

Thank you. It’s just that I saw it used in the bayesian analysis of the titanic case study so I was wondering if I was doing something wrong

f2harrell · December 30, 2020, 10:01pm

Predict() takes fun=plogis etc. State the command you are trying to use.

pabloavila · December 30, 2020, 10:27pm

I was reproducing the binary logistic regression example at
https://hbiostat.org/R/rms/blrm.html.

getHdata(support)
dd <- datadist(support); options(datadist=‘dd’)
bsup <- blrm(hospdead ~ dzgroup + rcs(crea, 5) + rcs(meanbp, 5), data=support, file=‘bsup.rds’)
ggplot(Predict(bsup, fun=plogis))

I get this error:
“Error in predict.blrm(fit, settings, type = “lp”, fun = fun, funint = funint, :
specifying fun= makes no sense with a binary response”

f2harrell · December 31, 2020, 8:14pm

I don’t see Predict(bsup, fun=plogis) in the script but you are right this is a bug. I’ll fix this in the next release to CRAN. When that fix is made you’ll need to use Predict(..., fun=..., funint=FALSE) to handle simple functions (functions other than Mean, Quantile, ExProb). If you use Linux (only) you can get a new binary rmsb package here.