Fitting a distribution in Stan from scratch
[This article was first published on mages' blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Last week the French National Institute of Health and Medical Research (Inserm) organised with the Stan Group a training programme on Bayesian Inference with Stan for Pharmacometrics in Paris. Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Daniel Lee and Michael Betancourt, who run the course over three days, are not only members of Stan’s development team, but also excellent teachers. Both were supported by Eric Novik, who gave an Introduction to Stan talk at the Paris Dataiku User Group last week as well.
![]() |
Eric Kramer (Dataiku), Daniel Lee, Eric Novik & Michael Betancourt (Stan Group) |
I have been playing around with Stan on and off for some time, but as Eric pointed out to me, Stan is not that kind of girl(boy?). Indeed, having spent three days working with Stan has revitalised my relationship. Getting down to the basics has been really helpful and I shall remember, Stan is not drawing samples from a distribution. Instead, it is calculating the joint distribution function (in log space), and evaluating the probability distribution function (in log space).
Thus, here is a little example of fitting a set of random numbers in R to a Normal distribution with Stan. Yet, instead of using the built-in functions for the Normal distribution, I define the log probability function by hand, which I will use in the model block as well, and even generate a random sample, starting with a uniform distribution. However, I do use pre-defined distributions for the priors.
Why do I want to do this? This will be a template for the day when I have to use a distribution, which is not predefined in Stan, e.g. the actuar package has some interesting candidates.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
functions { | |
// Define log probability density function | |
real myNormal_lpdf(real y, real mu, real sigma) { | |
return -log(2 * pi()) / 2 - log(sigma) | |
- square(mu - y) / (2 * sigma^2); | |
} | |
} | |
data { | |
int N; | |
real y[N]; | |
} | |
parameters { | |
real mu; | |
real<lower = 0> sigma; | |
} | |
model { | |
// Priors | |
mu ~ normal(0, 10); | |
sigma ~ cauchy(0,10); | |
// Likelihood | |
for(n in 1:N){ | |
target += myNormal_lpdf(y[n] | mu, sigma); | |
} | |
} | |
generated quantities{ | |
real y_ppc; | |
{ | |
real x; | |
x = uniform_rng(0, 1); | |
// Phi is the probit function in Stan, the CDF of the standardised Normal N(0,1) | |
// inv_Phi is the quantile function of the standardised Normal N(0,1) | |
y_ppc = mu + sigma * inv_Phi(x); | |
} | |
} |
Testing
I start off by generating fake data, a sample of 100 random numbers drawn from a Normal distribution with a mean of 4 and a standard deviation of 2. Note, the sample mean of the 100 figures is 4.2 and not 4.![]() |
Histogram of 100 random numbers drawn from N(4,2). |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
library(rstan) | |
library(MASS) | |
set.seed(1) | |
y <- rnorm(100, 4, 2) | |
truehist(y, col="#B2001D") | |
lines(density(y), col="skyblue", lwd=2) | |
summary(y) | |
## Min. 1st Qu. Median Mean 3rd Qu. Max. | |
## -0.4294 3.0120 4.2280 4.2180 5.3830 8.8030 | |
ret <- stanc(file="NormaLDistribution.stan") # Check Stan file | |
ret_sm <- stan_model(stanc_ret = ret) # Compile Stan code | |
fit <- sampling(ret_sm, warmup=100, iter=600, seed=1, | |
data=list(y, N=length(y))) | |
stan_trace(fit, inc_warmup = TRUE) | |
stan_hist(fit) | |
print(fit, probs=c(0.025, 0.5, 0.975)) | |
## Inference for Stan model: NormaLDistribution. | |
## 4 chains, each with iter=600; warmup=100; thin=1; | |
## post-warmup draws per chain=500, total post-warmup draws=2000. | |
## | |
## mean se_mean sd 2.5% 50% 97.5% n_eff Rhat | |
## mu 4.22 0.01 0.17 3.88 4.22 4.53 771 1 | |
## sigma 1.82 0.00 0.13 1.57 1.81 2.10 2000 1 | |
## y_ppc 4.22 0.04 1.86 0.70 4.21 7.88 2000 1 | |
## lp__ -200.32 0.03 0.97 -203.02 -200.02 -199.41 1016 1 | |
## | |
## Samples were drawn using NUTS(diag_e) at Mon Sep 26 07:25:09 2016. | |
## For each parameter, n_eff is a crude measure of effective sample size, | |
## and Rhat is the potential scale reduction factor on split chains (at | |
## convergence, Rhat=1). | |
summary(extract(fit, "y_ppc")[["y_ppc"]]) | |
## Min. 1st Qu. Median Mean 3rd Qu. Max. | |
## -1.483 2.959 4.213 4.222 5.451 10.540 | |
plot(ecdf(y), main="Posterior predictive check") | |
lines(ecdf(extract(fit, "y_ppc")[["y_ppc"]]), col="#B2001D") |
![]() |
Traceplot of 4 chains, including warm-up phase |
![]() |
Histograms of posterior parameter and predictive samples |
![]() |
Comparison of the emperical distributions |
y_ppc
) is wider, taking into account the uncertainty of the parameters. The interquartile range and mean of my initial fake data and the sample of the posterior predictive distribution look very similar. That’s good, my model generates data, which looks like the original data.Bayesian Mixer Meetup
Btw, tonight we have the 4th Bayesian Mixer Meetup in London.Session Info
R version 3.3.1 (2016-06-21) Platform: x86_64-apple-darwin13.4.0 (64-bit) Running under: OS X 10.12 (Sierra) locale: [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] MASS_7.3-45 rstan_2.12.1 StanHeaders_2.12.0 ggplot2_2.1.0 loaded via a namespace (and not attached): [1] Rcpp_0.12.7 codetools_0.2-14 digest_0.6.10 grid_3.3.1 [5] plyr_1.8.4 gtable_0.2.0 stats4_3.3.1 scales_0.4.0 [9] labeling_0.3 tools_3.3.1 munsell_0.4.3 inline_0.3.14 [13] colorspace_1.2-6 gridExtra_2.2.1
To leave a comment for the author, please follow the link and comment on their blog: mages' blog.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.