Bayesian Estimation Using the rjags Package
[This article was first published on K & L Fintech Modeling, and kindly contributed to R-bloggers.]
This post shows how to use the rjags R package to perform Bayesian MCMC estimation. As an example we use a multiple linear regression, but rjags can handle many highly non-linear models, so the approach extends to a wide range of specifications, including hierarchical models.
This post uses the rjags R package to estimate a multiple linear regression model by Bayesian MCMC. Before installing the rjags R package, JAGS itself must be installed, since rjags is essentially a wrapper around JAGS.
Likelihood and Prior
The likelihood of a multiple linear regression model (k = 2 regressors) is constructed from the following model specification for \(t = 1, 2, \dots, n\):
\[\begin{align} Y_t &\sim \text{Normal}(\mu_t, \tau) \\ \mu_t &= \beta_0+\beta_1 X_{1t}+\beta_2 X_{2t} \\ \sigma &= 1/\sqrt{\tau} \end{align}\]
Here, \(\tau\) in \(\text{Normal}(\mu_t, \tau)\) is not a variance but a precision, the reciprocal of the variance (\(\tau = \frac{1}{\sigma^2}\)).
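Since this precision parameterization differs from R's own dnorm() (which takes a standard deviation), it is worth keeping the conversion explicit. A minimal sketch with an illustrative value:

```r
# JAGS's dnorm() takes a precision, not a variance or standard deviation
sigma <- 2                   # illustrative standard deviation
tau   <- 1 / sigma^2         # precision: tau = 1/sigma^2 = 0.25
sigma_back <- 1 / sqrt(tau)  # recover sigma from tau, as in the model's transform
```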
Priors are placed on the regression coefficients and on the precision of the error term. \(\hat{\beta_0}, \hat{\beta_1}, \hat{\beta_2}\) are the OLS estimates; the coefficients receive normal priors centered at these estimates, while \(\tau\), being non-negative, receives a gamma prior. As in the likelihood, the second argument of each normal prior is a precision, so these are diffuse priors.
\[\begin{align} \beta_0 &\sim \text{Normal}(\hat{\beta_0}, 0.1) \\ \beta_1 &\sim \text{Normal}(\hat{\beta_1}, 0.1) \\ \beta_2 &\sim \text{Normal}(\hat{\beta_2}, 0.1) \\ \tau &\sim \text{Gamma}(0.01, 0.01) \end{align}\]
To use rjags, we only have to write down the model and a prior for each parameter; no lengthy derivations are required.
R code
As an example, we fit an AR(2) model to the U.S. real GDP growth rate. In the R code below, jags.script is the main part: it specifies the likelihood and the priors as a string, and it is essentially the same as the equations above. When the model or the priors change, only this model string (jags.script) needs to be modified.
```r
jags.script <- "
model{
  # likelihood
  for (i in 1:n) {
    y[i] ~ dnorm(mu[i], tau)
    mu[i] <- beta0 + beta1*x[i,1] + beta2*x[i,2]
  }
  # priors
  beta0 ~ dnorm(b_guess[1], 0.1)
  beta1 ~ dnorm(b_guess[2], 0.1)
  beta2 ~ dnorm(b_guess[3], 0.1)
  tau ~ dgamma(0.01, 0.01)
  # transform
  sigma <- 1 / sqrt(tau)
}
"
```
```r
#========================================================#
# Quantitative ALM, Financial Econometrics & Derivatives
# ML/DL using R, Python, Tensorflow by Sang-Heon Lee
#
# https://kiandlee.blogspot.com
#--------------------------------------------------------#
# rjags example for Bayesian multiple regression
#========================================================#

graphics.off(); rm(list = ls())

library(readxl)
library(rjags)
library(coda)

setwd("D:/SHLEE/blog/bayesian/rjags")

#-----------------------------------------------------
# read data and transform yoy
#-----------------------------------------------------

# read data
df.data <- read_excel("US_GDP.xlsx", "gdpc", "A11:B300", col_names = TRUE)

# YOY transform
df.yoy <- log(df.data$GDPC1[5:nrow(df.data)] /
              df.data$GDPC1[1:(nrow(df.data)-4)])
nyoy <- length(df.yoy)

# Y, constant, Y(-1), Y(-2)
df.yx <- data.frame(GDP      = df.yoy[3:nyoy],
                    GDP_lag1 = df.yoy[2:(nyoy-1)],
                    GDP_lag2 = df.yoy[1:(nyoy-2)])

#-----------------------------------------------------
# regression by using lm
#-----------------------------------------------------
lm_est <- lm(GDP ~ ., data = df.yx)

#-----------------------------------------------------
# bayesian regression by using rjags
#-----------------------------------------------------

# prepare data and additional information
lt.Data <- list(x = df.yx[,-1], y = df.yx[,1], n = nrow(df.yx))

# to use additional information inside the model
lt.Data$b_guess <- lm_est$coefficients

# model string
jags.script <- "
model{
  # likelihood
  for (i in 1:n) {
    y[i] ~ dnorm(mu[i], tau)
    mu[i] <- beta0 + beta1*x[i,1] + beta2*x[i,2]
  }
  # priors
  beta0 ~ dnorm(b_guess[1], 0.1)
  beta1 ~ dnorm(b_guess[2], 0.1)
  beta2 ~ dnorm(b_guess[3], 0.1)
  tau ~ dgamma(0.01, 0.01)
  # transform
  sigma <- 1 / sqrt(tau)
}
"

# compile
mod1 <- jags.model(textConnection(jags.script),
                   data = lt.Data, n.chains = 4, n.adapt = 1000)

# burn-in
update(mod1, 4000)

# posterior sampling
mod1.samples <- coda.samples(
    model = mod1,
    variable.names = c("beta0", "beta1", "beta2", "sigma"),
    n.iter = 2000)

# plot trace and posterior density for each parameter
x11(); plot(mod1.samples)

# descriptive statistics of posterior densities
sm <- summary(mod1.samples); sm

# correlation
cor(mod1.samples[[1]])
```
```r
#-----------------------------------------------------
# mean parameters
#-----------------------------------------------------

# means of the posteriors of the parameters
beta0.post.mean <- sm$statistics["beta0", "Mean"]
beta1.post.mean <- sm$statistics["beta1", "Mean"]
beta2.post.mean <- sm$statistics["beta2", "Mean"]
sigma.post.mean <- sm$statistics["sigma", "Mean"]

beta_post <- rbind(beta0.post.mean, beta1.post.mean,
                   beta2.post.mean, sigma.post.mean)

# real data and fitted values
ypred <- as.matrix(cbind(1, df.yx[,-1])) %*% beta_post[-4]

x11(width = 16/2, height = 9/2)
matplot(cbind(ypred, df.yx[,1]), type = c("l","p"), pch = 16,
        lty = c(1,1), lwd = 3, col = c("blue", "green"))

# Frequentist and Bayesian coefficients
coefs <- rbind(c(lm_est$coefficients, sigma(lm_est)), t(beta_post))
colnames(coefs) <- c("const", "beta1", "beta2", "sigma")
rownames(coefs) <- c("Frequentist", "Bayesian")
coefs
```
Except for jags.script, the remaining code stays the same across models. In particular, it is worth noting that every variable referenced in jags.script must be present in lt.Data, as in
```r
lt.Data$b_guess <- lm_est$coefficients
```
Therefore, when additional variables are used in the model string (jags.script), we must add those variables to the data list (lt.Data).
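As a sketch, the required correspondence between the data list and the model string looks like this (the values here are simulated placeholders; in the post they come from the GDP data and the lm() fit):

```r
set.seed(123)
# every symbol treated as data in the JAGS model string (x, y, n, b_guess)
# must appear as a named element of the data list passed to jags.model()
lt.Data <- list(x = matrix(rnorm(20), ncol = 2),  # placeholder regressors
                y = rnorm(10),                    # placeholder response
                n = 10)                           # number of observations
lt.Data$b_guess <- c(0.0, 0.5, 0.3)               # e.g. OLS estimates as prior means
```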
The estimation results contain the following posterior distributions for each parameter.
Summary statistics and the means of the posteriors can be obtained in the following way.
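The same quantities can also be computed directly from the raw draws. A sketch using simulated draws as a stand-in for one chain of mod1.samples:

```r
set.seed(1)
# simulated draws standing in for one chain of MCMC output
draws <- matrix(rnorm(2000, mean = 0.5, sd = 0.1), ncol = 1,
                dimnames = list(NULL, "beta1"))
post.mean <- mean(draws[, "beta1"])                    # posterior mean
cred95 <- quantile(draws[, "beta1"], c(0.025, 0.975))  # 95% credible interval
```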
We can generate fitted values of the response variable using the posterior means of the parameters.
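The fitted values are just the design matrix times the posterior-mean coefficient vector. A toy sketch with made-up numbers (in the post, X is cbind(1, df.yx[,-1]) and beta is beta_post[-4]):

```r
# y_hat = X %*% beta with an intercept column and two regressors
X    <- cbind(1, matrix(c(1, 2, 3, 4), ncol = 2))  # 2 x 3 design matrix
beta <- c(0.1, 0.5, 0.2)                           # illustrative posterior means
yhat <- X %*% beta                                 # fitted values
```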
Concluding Remarks
This post showed how to use the rjags R package to perform Bayesian MCMC estimation, with a multiple linear regression as the example. rjags is an easy tool for Bayesian analysis once we understand the meaning of the hyper-parameters of the various prior distributions and can set them with confidence. \(\blacksquare\)