Site icon R-bloggers

Comparing Stan to JAGS for Bayesian Inference (Part 1?)

[This article was first published on The Ubuntu R Blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Stan is a new, open source, Bayesian inference tool. Stan is based on the the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo. In order to compare Stan with JAGS, a gibbs sampler approach to Bayesian inference, I used the classic WinBUGS rats example. The rats example uses a hierarchical model to look at the growth curves of 30 rats.

In this example, I used R with RStan and R2jags packages. I used the code on the RStan Getting Started and the examples from the JAGS sourceforge page as starting points. I used a very large number of samples to test run times. Data and model files needed are below.

rats.csv

rats.bug

rats.txt

rats.stan

Both RStan and R2jags use the same list structure for the model data. In in R2jags, you need to specify which variables you wish to monitor, while RStan defaults to all of the variables. Both require an external model file and have very similar commands to fit the model. So, within R, both packages are very similar.

When you run the code, you will note that JAGS starts the sampling immediately, while Stan requires a lengthy compilation step. You will also see that I used two different calls to “stan”.

  stan.fit <- stan(file = 'rats.stan', data = rats_dat, verbose = FALSE,
             warmup=1000,iter = 11000, n_chains = 3)
  stan.fit2 <- stan(fit=stan.fit, data = rats_dat, verbose = FALSE,
              warmup=1000,iter = 11000, n_chains = 3)

The first compiles the model and runs the sampler, while the second uses the model from the first command. This eliminates the need to recompile the model, saving a good chunk of time. Notice the first argument has changed from “file=” to “fit=”. I also discovered that adding the “thin” option to either “stan” call creates a segfault, which I have reported the the Stan maintainers.

The answers that each program produced were identical (to the first decimal, at least). Both gave means, standard deviations, and key quantiles of the posterior distributions of the parameters, as well as R-hat, a measure of convergence you would like close to 1. The plots are slightly different, with RStan color coding the R-hat value vs. reporting it for R2jags. The output from RStan does not include a DIC value, at least by default.

R2jags Output

Inference for Bugs model at "rats.bug", fit using jags,
 3 chains, each with 10000 iterations (first 1000 discarded), n.thin = 9
 n.sims = 3000 iterations saved
          mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat
alpha.c   242.390   3.712 237.001 240.654 242.431 244.237 247.839 1.129
alpha0    106.316   4.508  99.444 103.915 106.331 108.864 113.246 1.043
beta.c      6.185   0.109   5.968   6.116   6.185   6.258   6.394 1.002
sigma       6.136   1.182   5.266   5.776   6.065   6.395   7.088 1.001
tau.alpha   0.005   0.001   0.003   0.004   0.005   0.006   0.008 1.001
tau.beta    4.137   1.538   1.904   3.073   3.902   4.904   7.857 1.001
tau.c       0.027   0.004   0.020   0.024   0.027   0.030   0.036 1.001
deviance  967.610  23.168 941.121 957.316 966.043 976.319 998.467 1.001
          n.eff
alpha.c    3000
alpha0     3000
beta.c     1400
sigma      3000
tau.alpha  3000
tau.beta   3000
tau.c      3000
deviance   3000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 268.5 and DIC = 1236.1
DIC is an estimate of expected predictive error (lower deviance is better).

RStan output

Inference for Stan model: anon_model.
3 chains: each with iter=11000; warmup=1000; thin=1; 11000 iterations saved.

                mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff
alpha[1]       239.9     0.0  2.7  234.6  238.1  239.9  241.7  245.1 15672
alpha[2]       247.8     0.0  2.7  242.6  246.0  247.8  249.6  253.1 16865
alpha[3]       252.4     0.0  2.7  247.1  250.6  252.4  254.2  257.7 20956
alpha[4]       232.6     0.0  2.7  227.3  230.8  232.5  234.3  237.8 21050
alpha[5]       231.6     0.0  2.7  226.4  229.8  231.6  233.4  236.8 20250
alpha[6]       249.7     0.0  2.7  244.5  247.9  249.7  251.5  255.1 18570
alpha[7]       228.7     0.0  2.7  223.4  226.9  228.7  230.5  234.0 21257
alpha[8]       248.4     0.0  2.7  243.0  246.6  248.4  250.2  253.7 16278
alpha[9]       283.3     0.0  2.7  278.0  281.5  283.3  285.1  288.6 19244
alpha[10]      219.2     0.0  2.7  213.9  217.4  219.2  221.0  224.6 17471
alpha[11]      258.2     0.0  2.7  252.9  256.4  258.2  260.1  263.6 21935
alpha[12]      228.2     0.0  2.7  222.9  226.4  228.2  230.0  233.5 19933
alpha[13]      242.4     0.0  2.7  237.1  240.6  242.4  244.2  247.7 14641
alpha[14]      268.3     0.0  2.7  262.9  266.4  268.3  270.1  273.7 20241
alpha[15]      242.8     0.0  2.7  237.6  241.0  242.8  244.6  248.0 14321
alpha[16]      245.3     0.0  2.7  239.9  243.5  245.3  247.1  250.5 15314
alpha[17]      232.2     0.0  2.7  227.0  230.4  232.2  234.0  237.4 20874
alpha[18]      240.5     0.0  2.7  235.2  238.6  240.5  242.3  245.6 16301
alpha[19]      253.8     0.0  2.7  248.5  252.0  253.8  255.6  259.1 22001
alpha[20]      241.6     0.0  2.7  236.2  239.8  241.6  243.4  246.9 14330
alpha[21]      248.5     0.0  2.7  243.3  246.8  248.5  250.3  253.8 19178
alpha[22]      225.2     0.0  2.7  220.0  223.4  225.2  227.1  230.5 19358
alpha[23]      228.5     0.0  2.7  223.2  226.7  228.5  230.3  233.8 19742
alpha[24]      245.1     0.0  2.7  239.9  243.3  245.1  246.9  250.2 14794
alpha[25]      234.5     0.0  2.7  229.2  232.7  234.5  236.2  239.7 18778
alpha[26]      254.0     0.0  2.7  248.7  252.2  254.0  255.8  259.2 19957
alpha[27]      254.4     0.0  2.7  249.1  252.6  254.4  256.2  259.7 21171
alpha[28]      243.0     0.0  2.7  237.7  241.2  243.0  244.8  248.3 14643
alpha[29]      217.9     0.0  2.7  212.6  216.1  217.9  219.7  223.3 18663
alpha[30]      241.4     0.0  2.6  236.3  239.6  241.4  243.2  246.6 16161
beta[1]          6.1     0.0  0.2    5.6    5.9    6.1    6.2    6.5  1717
beta[2]          7.1     0.0  0.3    6.6    6.9    7.1    7.2    7.5  1052
beta[3]          6.5     0.0  0.2    6.0    6.3    6.5    6.6    7.0  1218
beta[4]          5.3     0.0  0.3    4.8    5.2    5.3    5.5    5.8  1187
beta[5]          6.6     0.0  0.2    6.1    6.4    6.6    6.7    7.0  1936
beta[6]          6.2     0.0  0.2    5.7    6.0    6.2    6.3    6.7  1651
beta[7]          6.0     0.0  0.2    5.5    5.8    6.0    6.1    6.4  1702
beta[8]          6.4     0.0  0.2    5.9    6.3    6.4    6.6    6.9  1926
beta[9]          7.1     0.0  0.3    6.6    6.9    7.1    7.2    7.6  1444
beta[10]         5.8     0.0  0.2    5.4    5.7    5.8    6.0    6.3  1514
beta[11]         6.8     0.0  0.2    6.3    6.6    6.8    7.0    7.3   780
beta[12]         6.1     0.0  0.2    5.7    6.0    6.1    6.3    6.6  1924
beta[13]         6.2     0.0  0.2    5.7    6.0    6.2    6.3    6.6  1283
beta[14]         6.7     0.0  0.3    6.2    6.5    6.7    6.9    7.2  1523
beta[15]         5.4     0.0  0.3    4.9    5.2    5.4    5.6    5.9  1000
beta[16]         5.9     0.0  0.2    5.4    5.8    5.9    6.1    6.4  1920
beta[17]         6.3     0.0  0.2    5.8    6.1    6.3    6.4    6.8  2247
beta[18]         5.8     0.0  0.2    5.4    5.7    5.8    6.0    6.3  1694
beta[19]         6.4     0.0  0.2    5.9    6.3    6.4    6.6    6.9  1184
beta[20]         6.1     0.0  0.2    5.6    5.9    6.1    6.2    6.5  1666
beta[21]         6.4     0.0  0.2    5.9    6.2    6.4    6.6    6.9  1618
beta[22]         5.9     0.0  0.2    5.4    5.7    5.9    6.0    6.3  2248
beta[23]         5.8     0.0  0.2    5.3    5.6    5.8    5.9    6.2  1597
beta[24]         5.9     0.0  0.2    5.4    5.7    5.9    6.0    6.4  2203
beta[25]         6.9     0.0  0.2    6.4    6.7    6.9    7.1    7.4  1778
beta[26]         6.5     0.0  0.2    6.1    6.4    6.5    6.7    7.0  1912
beta[27]         5.9     0.0  0.2    5.4    5.7    5.9    6.1    6.4  1546
beta[28]         5.8     0.0  0.2    5.4    5.7    5.8    6.0    6.3  1776
beta[29]         5.7     0.0  0.2    5.2    5.5    5.7    5.8    6.2  1514
beta[30]         6.1     0.0  0.2    5.7    6.0    6.1    6.3    6.6  1630
mu_alpha       242.4     0.0  2.8  236.9  240.6  242.4  244.3  247.9 12061
mu_beta          6.2     0.0  0.1    6.0    6.1    6.2    6.3    6.4  2855
sigmasq_y       37.3     0.1  5.7   27.8   33.2   36.7   40.7   50.0  2879
sigmasq_alpha  219.1     1.2 64.6  125.7  173.1  208.1  252.0  374.3  2934
sigmasq_beta     0.3     0.0  0.1    0.1    0.2    0.3    0.3    0.5   408
sigma_y          6.1     0.0  0.5    5.3    5.8    6.1    6.4    7.1  2864
sigma_alpha     14.7     0.0  2.1   11.2   13.2   14.4   15.9   19.3  2966
sigma_beta       0.5     0.0  0.1    0.4    0.5    0.5    0.6    0.7   402
alpha0         106.3     0.1  3.7   99.0  103.9  106.3  108.8  113.6  5365
lp__          -438.4     0.1  6.9 -453.0 -442.7 -437.9 -433.5 -426.2  2614
              Rhat
alpha[1]         1
alpha[2]         1
alpha[3]         1
alpha[4]         1
alpha[5]         1
alpha[6]         1
alpha[7]         1
alpha[8]         1
alpha[9]         1
alpha[10]        1
alpha[11]        1
alpha[12]        1
alpha[13]        1
alpha[14]        1
alpha[15]        1
alpha[16]        1
alpha[17]        1
alpha[18]        1
alpha[19]        1
alpha[20]        1
alpha[21]        1
alpha[22]        1
alpha[23]        1
alpha[24]        1
alpha[25]        1
alpha[26]        1
alpha[27]        1
alpha[28]        1
alpha[29]        1
alpha[30]        1
beta[1]          1
beta[2]          1
beta[3]          1
beta[4]          1
beta[5]          1
beta[6]          1
beta[7]          1
beta[8]          1
beta[9]          1
beta[10]         1
beta[11]         1
beta[12]         1
beta[13]         1
beta[14]         1
beta[15]         1
beta[16]         1
beta[17]         1
beta[18]         1
beta[19]         1
beta[20]         1
beta[21]         1
beta[22]         1
beta[23]         1
beta[24]         1
beta[25]         1
beta[26]         1
beta[27]         1
beta[28]         1
beta[29]         1
beta[30]         1
mu_alpha         1
mu_beta          1
sigmasq_y        1
sigmasq_alpha    1
sigmasq_beta     1
sigma_y          1
sigma_alpha      1
sigma_beta       1
alpha0           1
lp__             1

Sample were drawn using NUTS2 at Wed Sep  5 21:55:07 2012.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

One interesting difference between the two outputs is the R-hat value for the alpha0 variable. Alpha0 is generated quantity, representing the value of the intercept when non-centered data is used. For R2jags, the value of R-hat is 1.228, while R-hat is 1 for RStan. A quick look at the output indicates that R2jags used a thin value of 9, while RStan defaults to 1 (and creates a segfault when changed).

There is a major speed difference between the two methods. Now, it should be noted that JAGS can be faster on some problems and Stan can be faster on others. On my machine, the JAGS sampler took 7.1 seconds, while the Stan sampler took 25 seconds, which was a bit shocking to me. And this was timed using the second “stan” call, when the Stan model was pre-compiled. This is something that needs to be explored further.

To leave a comment for the author, please follow the link and comment on their blog: The Ubuntu R Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.