Software for Bayesian data analysis

I had a discussion with a statistician today, and both of us would like to go Bayesian. Kindly guide us on software such as Stan, BUGS, or JAGS, on the best resources for a practical approach, and on how to set the prior. I would like to hear from you. We approached a few statisticians in our state, but all of them discouraged us and wanted us to stick with the frequentist approach.

Suggestions for starters:

  1. Learn R and its statistical formula syntax. For general advice on learning R, see Chapter 1 of BBR.
  2. Use the brms package in R, which is a friendly general linear modeling front-end to Stan.
  3. Later, explore the rstanarm package in R.

For more useful software and Bayesian methods links go here.


Warning: If you go the rstan or brms route, or indeed the PyStan route, and if you work for a company that is wary of GPL licensing, and GPL-3 in particular, you’re out of luck unless this is for an internal product only (and even then some companies want to avoid GPL-3 in case they want to commercialize internal code). For that reason, you might also want to look into PyMC3 and ScalaStan. There is also TensorFlow Probability, a fairly new entrant. If one of you is a decent software engineer, you could use CmdStan with your own wrappers. I think the next version of PyStan won’t be GPL-3. I hope, one day, nothing is GPL-3. No no, I kid.

Also, if you are going to be estimating discrete parameters and you aren’t comfortable with marginalizing them out, then Stan, or any other HMC-based sampler, isn’t going to be your preferred choice, and you might want to go the JAGS route. That said, Bob Carpenter, who is smarter than me, has argued that marginalizing, if done right, is better for a variety of reasons.
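To make "marginalizing out" a bit more concrete, here is a minimal sketch in plain Python (not Stan, and not taken from Bob Carpenter's write-up). It assumes a two-component normal mixture in which each observation has a discrete indicator z in {0, 1}. Instead of sampling z, the likelihood sums over both of its values on the log scale. All function names and the model itself are illustrative assumptions.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def marginal_loglik(ys, w, mu0, mu1, sigma):
    """Log-likelihood with each discrete indicator z_i summed out.

    Assumes 0 < w < 1, where w = P(z_i = 1). For every observation:
    p(y_i) = (1 - w) * N(y_i | mu0, sigma) + w * N(y_i | mu1, sigma).
    """
    total = 0.0
    for y in ys:
        lp0 = math.log(1.0 - w) + normal_logpdf(y, mu0, sigma)
        lp1 = math.log(w) + normal_logpdf(y, mu1, sigma)
        total += log_sum_exp(lp0, lp1)
    return total


def membership_prob(y, w, mu0, mu1, sigma):
    """P(z = 1 | y), recovered after the fact from the same two terms."""
    lp0 = math.log(1.0 - w) + normal_logpdf(y, mu0, sigma)
    lp1 = math.log(w) + normal_logpdf(y, mu1, sigma)
    return math.exp(lp1 - log_sum_exp(lp0, lp1))
```

The continuous parameters (w, mu0, mu1, sigma) are all a gradient-based sampler needs to see. The probabilities for the discrete indicators can still be computed afterwards, as in `membership_prob`.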

IMHO any company that worries about that has bigger problems.

You might be right. Nevertheless, for practitioners seeking to do the science correctly in a corporate environment under these preferred-license constraints, it’s something we have no choice but to deal with.

I saw a Bayesian analysis this week that used JASP: https://jasp-stats.org/

This made me want to play around with it.


Personally, I love rstan in R (there’s of course also PyStan and plugins for Stan in other environments) because of the flexibility it offers: you can specify anything you can write a likelihood and a prior for, including integrals and differential equations if you want. If you fit standard models that brms or rstanarm already cover, of course, you’d be crazy to re-implement them. There are some clever tricks you may need for parametrizing models and keeping things numerically stable, and these two packages implement all of that for you. brms is more flexible, while rstanarm is pretty nice for an easy, out-of-the-box default analysis. By the way, if you go with R, look into the tidybayes and bayesplot packages for looking at the results.

SAS tends to have good samplers for Bayesian versions of standard models, if you are fine with restricted prior options. Their general MCMC sampler in PROC MCMC is mostly useful for toy examples but tends to struggle with more complex things. I still don’t understand why they did not turn the procedure into a front-end for Stan.


Yes, rstan and pystan use GPL. But you can use Stan (by itself or with R or Python) with the business-friendly BSD license! Just use CmdStan or the interfaces CmdStanPy or CmdStanR.
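For anyone curious what the CmdStanPy route looks like, here is a minimal sketch. It assumes CmdStanPy and a recent CmdStan are already installed (the `array[N] int` syntax needs a reasonably new Stan). The Bernoulli model, the `fit_bernoulli` helper, and the file name are illustrative assumptions, not anything from this thread.

```python
import tempfile
from pathlib import Path

# A toy Stan program (illustrative): estimate a success probability theta.
STAN_PROGRAM = """
data {
  int<lower=0> N;
  array[N] int<lower=0, upper=1> y;
}
parameters {
  real<lower=0, upper=1> theta;
}
model {
  theta ~ beta(1, 1);   // flat prior, chosen only for the example
  y ~ bernoulli(theta);
}
"""


def fit_bernoulli(y, chains=4, seed=1):
    """Compile the toy model with CmdStan and return posterior draws of theta.

    Hypothetical helper. cmdstanpy is imported lazily so that this file can
    be loaded on a machine where CmdStanPy is not installed.
    """
    from cmdstanpy import CmdStanModel

    with tempfile.TemporaryDirectory() as tmp:
        stan_file = Path(tmp) / "bernoulli.stan"
        stan_file.write_text(STAN_PROGRAM)
        model = CmdStanModel(stan_file=str(stan_file))
        fit = model.sample(
            data={"N": len(y), "y": list(y)}, chains=chains, seed=seed
        )
        # Read the draws before the temporary directory is cleaned up.
        return fit.stan_variable("theta")
```

Calling `fit_bernoulli([0, 1, 0, 0, 1, 1, 0, 1, 0, 0])` should return an array of posterior draws for `theta`. The first call takes a while because CmdStan has to compile the model.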


A good resource is of course McElreath’s book, Statistical Rethinking (2nd edition to appear sometime in 2020). There’s a bookdown page that “translates” the examples into brms. McElreath uses his own package (which uses Stan for computation), but rstanarm and brms are far more widely used, I would say. Thus, the bookdown page is also good for learning: https://bookdown.org/connect/#/apps/1850/access
