This is the place for questions and answers regarding the R rmsb package.

Earlier questions may be found here.

I just read this great article by you and your colleagues, where you used the `rmsb` package and stated:

> Missing baseline covariate data were multiply imputed using predictive mean matching, and posterior stacking was used to combine Bayesian posterior draws of each separately analyzed completed dataset. (ref 34)

There is some R code in the Supplement (thanks!) showing the `rmsb::stackMI()` function, described on CRAN as:

> Runs an rmsb package Bayesian fitting function such as blrm separately for each completed dataset given a multiple imputation result such as one produced by Hmisc::aregImpute. Stacks the posterior draws and diagnostics across all imputations, and computes parameter summaries on the stacked posterior draws.

In your article, reference 34 is Using Stacking to Average Bayesian Predictive Distributions (with Discussion) by Yao et al., where they recommend "stacking of predictive distributions". Aki Vehtari explained in the Stan Forum (here and here) that this method is implemented through `loo::loo_model_weights(..., method = "stacking")` (or, at a lower level, `loo::stacking_weights`) and then applying the estimated weights when generating draws from the posterior predictive distribution (with, e.g., `rstanarm::posterior_predict`).
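For concreteness, here is a small sketch of that weighting step with the `loo` package. The pointwise log predictive density matrix is simulated here purely for illustration; with real fits you would pass `loo` objects to `loo::loo_model_weights(..., method = "stacking")` instead.

```r
library(loo)

# Hypothetical pointwise log predictive densities for two candidate models;
# loo::stacking_weights() optimizes the Yao et al. weights directly on this.
set.seed(4)
n <- 100
lpd_point <- cbind(model1 = rnorm(n, -1.0, 0.3),
                   model2 = rnorm(n, -1.2, 0.3))
w <- stacking_weights(lpd_point)
w  # non-negative weights summing to 1
```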

However, when looking into the `rmsb::stackMI` source code, I don't see any reference to these functions from the R package `loo`.

Two questions:

- Did you apply the method of stacking to average predictive distributions in your article?
- If so, how can one apply it through `rmsb`?

As far as I understand, it seems `rmsb::stackMI` applies a very similar approach to what `brms::brm_multiple` does:

> Here, pooling results of multiple imputed data sets is simply achieved by combining the posterior draws of the submodels.

which would be a little bit different from what Yao et al. proposed.

My understanding of this version of "stacking" is as an alternative to model averaging, which would be used in a situation where you have multiple candidate models and want to weight them based on their estimated out-of-sample predictive performance.

In the MI case we have a single model and just pass the k multiply imputed datasets in order to propagate uncertainty in the imputed data. In a frequentist framework you would combine your parameters and SEs with Rubin's rules, but for Bayes you can simply stack (as in `rbind`) the posteriors together. You can also do this in a model-based way (e.g. place a model on the missing data), but this can be challenging for many reasons.
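A minimal base-R sketch of that rbind-style pooling; the draws below are simulated stand-ins for the per-imputation posterior samples a Bayesian fit (e.g. `rmsb::blrm`) would produce on each completed dataset:

```r
# Posterior stacking across multiply imputed datasets: fit the same model to
# each completed dataset, then rbind the per-imputation posterior draws.
set.seed(1)
n_draws <- 1000
draws_per_imputation <- lapply(1:5, function(i) {
  # hypothetical 2-parameter posterior from imputation i
  cbind(beta0 = rnorm(n_draws, 0, 1),
        beta1 = rnorm(n_draws, 0.5, 0.2))
})
stacked <- do.call(rbind, draws_per_imputation)  # 5000 x 2 matrix of draws
# parameter summaries are then computed on the stacked draws
colMeans(stacked)
apply(stacked, 2, quantile, probs = c(0.025, 0.975))
```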

Overall I think it's just an unfortunate convergence of similar terms describing very different end goals.

I see, it makes sense.

To corroborate this, I think there is no function to extract draws from the posterior predictive distribution for Bayesian models fitted with `rmsb`. I believe the reason for this is that there isn't a `generated quantities` block in rmsb's Stan code.

If so, it wouldn't even be possible to apply Yao et al.'s method with `rmsb`.

I think you could use the log posterior density + loo to get the stacking weights, and write a function to draw from the posterior predictive yourself based on the posterior draws provided by rmsb if you really wanted, so it's probably not fair to say it wouldn't be possible.
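A sketch of that do-it-yourself route, assuming the stacking weights have already been obtained from `loo`; the weights and predictive draws below are hypothetical:

```r
# Apply stacking weights w to K models' posterior predictive draws by
# sampling, for each draw, which model it comes from with probability w[k].
set.seed(2)
w <- c(0.7, 0.3)                    # hypothetical stacking weights (sum to 1)
n_draws <- 4000
ppd <- list(rnorm(n_draws, 0, 1),   # predictive draws, model 1
            rnorm(n_draws, 1, 1))   # predictive draws, model 2
pick <- sample(seq_along(w), n_draws, replace = TRUE, prob = w)
stacked_ppd <- vapply(seq_len(n_draws),
                      function(i) ppd[[pick[i]]][i], numeric(1))
mean(stacked_ppd)  # close to the weighted mixture mean 0.7*0 + 0.3*1
```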

Yes, good point. The `generated quantities` approach is the one recommended by the Stan team, but not the only one. Can I say that your approach would be the *"Predict outside of Stan"* one, as stated here?

Yup, that sounds right! Just don't ask me how to do it with ordinal models. You might be able to steal some ideas from the ordered probit meta-analysis models here: https://www.ncbi.nlm.nih.gov/books/NBK310366/pdf/Bookshelf_NBK310366.pdf by looking at how they model the actual likelihood.
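For what it's worth, here is a base-R sketch of the ordinal case, assuming the rms-style exceedance parameterization P(Y >= k | x) = plogis(alpha_k + beta*x); the posterior draws are simulated stand-ins for draws extracted from a fitted `rmsb::blrm` object:

```r
# Posterior predictive for a 3-category proportional odds model, computed
# outside of Stan from posterior draws of the intercepts and slope.
set.seed(3)
n_draws <- 2000
x_new <- 1
alpha <- cbind(rnorm(n_draws,  1.5, 0.1),   # draws of alpha_2: logit P(Y >= 2)
               rnorm(n_draws, -0.5, 0.1))   # draws of alpha_3: logit P(Y >= 3)
beta  <- rnorm(n_draws, 0.8, 0.1)
exceed <- plogis(alpha + beta * x_new)      # P(Y >= 2 | x), P(Y >= 3 | x)
probs  <- cbind(1 - exceed[, 1],            # P(Y = 1 | x)
                exceed[, 1] - exceed[, 2],  # P(Y = 2 | x)
                exceed[, 2])                # P(Y = 3 | x)
y_rep <- vapply(seq_len(n_draws),
                function(i) sample(1:3, 1, prob = probs[i, ]), integer(1))
table(y_rep) / n_draws  # posterior predictive distribution at x_new
```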

For multiple imputation, the stacking just involves combining draws from posterior distributions. For the COVID-19 prone positioning paper, here are code excerpts:

```
set.seed(21)
# Multiple imputation of missing baseline covariates
mi <- aregImpute(~ total_outcome + Month + Baseline + Treatment + Age +
                   Sex + Race + BMI + Elix,
                 data = day5_full, n.impute = 5, pr = FALSE)
# Fit the Bayesian model on each completed dataset and stack the draws
fitb_mi <- stackMI(total_outcome ~ Treatment + Baseline + Month + Age +
                     Sex + Site + Race + BMI + Elix,
                   blrm, mi, iter = 4000, warmup = 2000, chains = 4,
                   loo = TRUE, cores = 4, data = day5_full)
```

Thanks. Similar to `brms::brm_multiple()` then.

Should I use the -2 log likelihood from `method = "optimizing"` to determine the most appropriate standard deviation prior? In your vignette example, SD = 100 did not have the lowest -2 LL compared to SD = 10 (rmsb Package Examples). Why did you still use SD = 100?

Priors represent prior information and are not typically chosen to fit data at hand. Priors are chosen to represent the current state of knowledge and sometimes to make procedures perform well, e.g., converge quickly.

I would have to reprogram some things. I think it's an attractive option. I looked at it once and found a restriction that would affect me, but I can't remember what that restriction was.

It would be a nice addition. For example, McElreath has shifted his rethinking package entirely to cmdstanr.

I think I remember the issue: I don't think you can save the compiled Stan code, which makes the start-up time much longer. If I understand it, `cmdstanr` is for passing "new" Stan source code to Stan. That's a fatal problem, if I have it right.

Hi, I am having issues with adequate model sampling in a proportional odds model with a random intercept.

Whenever I add the `cluster(...)` argument, these warnings show up:

> Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
>
> Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.

Interestingly, I have no issues when fitting the same model with `brms`. Please see the full code here.

How can I solve this problem?

Ha! Found the culprit.

`rmsb` uses a "Dirichlet distribution for the cell probabilities, which induce a more complex prior distribution for the intercepts (`iprior=0`, the default)".

Yet `brms` applies a `student_t(3, 0, 2.5)` distribution for the intercepts.

When I mimic this in rmsb, sampling runs smoothly:

```
PO_rmsb <- rmsb::blrm(y ~ x + cluster(study),
                      # intercept prior: t distribution with 3 d.f.
                      iprior = 2,
                      # scale parameter of the t distribution for the intercepts
                      ascale = 2.5,
                      # normal(0, 1.5) prior for "x"
                      priorsd = 1.5,
                      # random-effects prior: half-t distribution with 4 d.f.
                      psigma = 1,
                      # assumed mean of the prior distribution of the RE SD
                      rsdmean = 0,
                      # assumed scale for the half-t distribution of the RE SD
                      rsdsd = 2.5,
                      seed = 123,
                      data = d)
```

However, I wouldn't expect this behavior with the Dirichlet distribution.

What do you think @f2harrell?

Great detective work! I never connected this sampling problem (which I've experienced myself) with the Dirichlet default. The Dirichlet default makes posterior modes equal MLEs when there are no random effects. I would love for someone to figure out why the Dirichlet prior interferes with random effects, which should just be added on top of the induced logit-scale intercepts created by the Dirichlet.

I wish I had the knowledge to help you with that.