This is the place for questions and answers regarding the R rmsb package.
Earlier questions may be found here.
I just read this great article by you and your colleagues, where you used the rmsb package and stated:
Missing baseline covariate data were multiply imputed using predictive mean matching, and posterior stacking was used to combine Bayesian posterior draws of each separately analyzed completed data set. (ref. 34)
There is some R code in the Supplement (thanks!) showing the rmsb::stackMI() function, described on CRAN as:
Runs an rmsb package Bayesian fitting function such as blrm separately for each completed dataset given a multiple imputation result such as one produced by Hmisc::aregImpute. Stacks the posterior draws and diagnostics across all imputations, and computes parameter summaries on the stacked posterior draws.
In your article, reference 34 is Using Stacking to Average Bayesian Predictive Distributions (with Discussion) by Yao et al., where they recommend "stacking of predictive distributions". Aki Vehtari explained in the Stan Forum (here and here) that this method is implemented through loo::loo_model_weights(..., method = "stacking") (or even loo::stacking_weights) and then applying the estimated weights when generating draws from the posterior predictive distribution (with, e.g., rstanarm::posterior_predict).
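If it helps, here is a minimal sketch of that loo-based workflow. The three rstanarm fits fit1, fit2, fit3 are hypothetical stand-ins for competing models of the same outcome:
library(loo)
library(rstanarm)
# Stacking weights from approximate leave-one-out cross-validation
loo_list <- list(loo(fit1), loo(fit2), loo(fit3))
wts <- loo_model_weights(loo_list, method = "stacking")
# Mix posterior predictive draws across models in proportion to the weights
yreps <- list(posterior_predict(fit1), posterior_predict(fit2), posterior_predict(fit3))
pick  <- sample(1:3, nrow(yreps[[1]]), replace = TRUE, prob = as.numeric(wts))
yrep_stacked <- do.call(rbind, lapply(1:3, function(k) yreps[[k]][pick == k, , drop = FALSE]))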
However, when looking into the rmsb::stackMI source code, I don't see any reference to these functions from the loo package.
Two questions:
1. Is Yao et al.'s stacking method what is actually implemented in rmsb?
2. If not, what does the "stacking" in stackMI refer to?
As far as I understand, it seems rmsb::stackMI applies a very similar approach to what brms::brm_multiple does:
Here, pooling results of multiple imputed data sets is simply achieved by combining the posterior draws of the submodels.
which would be a little bit different from what Yao et al. proposed.
My understanding of this version of "stacking" is as an alternative to model averaging, which would be used in a situation where you have multiple candidate models and want to weight them based on their estimated out-of-sample predictive performance.
In the MI case we have a single model and just pass the k multiply imputed datasets in order to propagate uncertainty in the imputed data. In a frequentist framework you would combine your parameters and SEs with Rubin's rules, but for Bayes you can simply stack (as in rbind) the posteriors together. You can also do this in a model-based way (e.g., place a model on the missing data), but this can be challenging for many reasons.
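As a toy illustration of the rbind point, suppose draws is a (hypothetical) list of m posterior draw matrices, one per completed dataset, each with one row per draw and one column per parameter:
# "Stacking" in the MI sense is literally row-binding the posterior draws
stacked <- do.call(rbind, draws)
colMeans(stacked)                                # pooled posterior means
apply(stacked, 2, quantile, c(0.025, 0.975))     # pooled credible intervals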
Overall I think it's just an unfortunate convergence of similar terms describing very different end goals.
I see, it makes sense.
To corroborate this, I think there is no function to extract draws from the posterior predictive distribution for Bayesian models fitted with rmsb. I believe the reason is that there isn't a generated quantities block in rmsb's Stan code. If so, it wouldn't even be possible to apply Yao et al.'s method with rmsb.
I think you could use the log posterior density + loo to get the stacking weights and write a function to draw from the posterior predictive yourself based on the posterior draws provided by rmsb if you really wanted, so it's probably not fair to say it wouldn't be possible.
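A rough sketch of that idea for the simplest (binary outcome) case; I'm assuming the blrm fit keeps its posterior draws in f$draws as a draws-by-coefficients matrix, so treat the object names as assumptions:
f    <- rmsb::blrm(y ~ x, data = d)   # hypothetical fit with a binary y
beta <- f$draws                       # assumed: matrix of posterior draws
X    <- cbind(1, d$x)                 # design matrix with an intercept column
p    <- plogis(beta %*% t(X))         # draws x n matrix of event probabilities
yrep <- matrix(rbinom(length(p), 1, p), nrow = nrow(p))  # posterior predictive draws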
Yes, good point. The generated quantities approach is the one recommended by the Stan team, but it is not the only one. Can I say that your approach would be the "Predict outside of Stan" one, as stated here?
Yup, that sounds right! Just don't ask me how to do it with ordinal models. You might be able to steal some ideas from the ordered probit meta-analysis models here: https://www.ncbi.nlm.nih.gov/books/NBK310366/pdf/Bookshelf_NBK310366.pdf by looking at how they model the actual likelihood.
For multiple imputation the stacking just involves draws from posterior distributions. For the COVID-19 prone positioning paper, here are code excerpts:
set.seed(21)
# Multiple imputation of missing baseline covariates
mi <- aregImpute(~ total_outcome + Month + Baseline + Treatment + Age + Sex +
                   Race + BMI + Elix,
                 data = day5_full, n.impute = 5, pr = FALSE)
# Fit the Bayesian model to each completed dataset and stack the posterior draws
fitb_mi <- stackMI(total_outcome ~ Treatment + Baseline + Month + Age + Sex +
                     Site + Race + BMI + Elix,
                   blrm, mi, data = day5_full,
                   iter = 4000, warmup = 2000, chains = 4, loo = TRUE, cores = 4)
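If I read the stackMI documentation right, downstream summaries then operate on the combined draws, e.g. (assuming the fit above; whether every rmsb method accepts a stackMI result is my assumption):
print(fitb_mi)    # parameter summaries computed on the stacked posterior draws
stanDx(fitb_mi)   # rmsb's Stan diagnostics, collected across the imputation fits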
Thanks. Similar to brms::brm_multiple() then.
Should I use the -2 log likelihood from method="optimizing" to determine the most appropriate standard deviation prior? In your vignette example (rmsb Package Examples), SD=100 did not give the lowest -2 log likelihood compared to SD=10. Why did you still use SD=100?
Priors represent prior information and are not typically chosen to fit data at hand. Priors are chosen to represent the current state of knowledge and sometimes to make procedures perform well, e.g., converge quickly.
I would have to reprogram some things. I think it's an attractive option. I looked at it once and found a restriction that would affect me, but I can't remember what that restriction was.
It would be a nice addition. For example, McElreath has shifted his rethinking package entirely to cmdstanr.
I think I remember the issue: I don't think you can save the compiled Stan code, which makes the start-up time much longer. If I understand it, cmdstanr is for passing "new" Stan source code to Stan. That's a fatal problem, if I have it right.
Hi, I am having issues with adequate model sampling in a proportional odds model with a random intercept.
Whenever I add the cluster(...) argument, this warning shows up:
Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Interestingly, I have no issues when fitting the same model with brms. Please see full code here.
How can I solve this problem?
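(For anyone reproducing this, one way to surface the underlying sampler diagnostics for an rmsb fit; the fit object and data here are hypothetical:)
fit_re <- rmsb::blrm(y ~ x + cluster(study), data = d, seed = 123)
stanDx(fit_re)    # prints effective sample size and Rhat style diagnostics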
Ha! Found the culprit.
rmsb uses a "Dirichlet distribution for the cell probabilities, which induce a more complex prior distribution for the intercepts (iprior=0, the default)".
Yet, brms applies a student_t(3, 0, 2.5) distribution for the intercepts.
When I mimic this in rmsb, sampling runs smoothly:
PO_rmsb <- rmsb::blrm(y ~ x + cluster(study),
# intercept parameters prior: t-distribution with 3 d.f.
iprior = 2,
# scale parameter for the t-distribution for priors for the intercepts
ascale = 2.5,
# normal(0, 1.5) prior for "x"
priorsd = 1.5,
# random effect prior: half-t distribution with 4 d.f.
psigma = 1,
# assumed mean of the prior distribution of the RE SD
rsdmean = 0,
# assumed scale for the half t distribution for the RE SD
rsdsd = 2.5,
seed = 123,
data=d)
However, I wouldn't have expected this behavior from the Dirichlet distribution.
What do you think @f2harrell?
Great detective work! I had never connected this sampling problem (which I've experienced myself) with the Dirichlet default. The Dirichlet default makes posterior modes equal MLEs when there are no random effects. I would love for someone to figure out why the Dirichlet prior interferes with random effects, which should just be added on top of the induced logit-scale intercepts that the Dirichlet creates.
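A quick way to check the modes-equal-MLEs property in the no-random-effects case might be something like this (extraction details are an assumption; blrm's method="optimizing" is the option mentioned above):
f_mle  <- rms::lrm(y ~ x, data = d)                           # frequentist MLEs
f_mode <- rmsb::blrm(y ~ x, data = d, method = "optimizing")  # posterior modes
cbind(MLE = coef(f_mle), mode = coef(f_mode))                 # should agree under iprior=0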
I wish I had the knowledge to help you with that.