Post 3: Setting up the sampler and visualizing its output
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In the chapter, we argue that a useful way to develop a Metropolis-Hastings (MH) within Gibbs sampler is to split the code into two levels. The top level is the “shell” of the MH within Gibbs algorithm, which sets up the data structures, implements burn-in and thinning of the chain, and calls a lower-level function to do the actual sampling from the complete conditionals. In this way, we can test the shell of the algorithm separately from testing the complete conditional samplers.
This post quotes the code presented in the chapter, discusses a data structure choice, gives an example of how to run the “empty” shell, and gives an example of how to visualize output from the shell.
MH within Gibbs 2PL shell
The code that we present in the chapter for the shell is:
## Define a complete run of the MH within Gibbs sampler run.chain.2pl <- function(M.burnin, M.keep, M.thin, U.data, hyperpars, th.init, a.init, b.init, s2.init, MH.th, MH.a, MH.b, verbose=FALSE) { ## Define and initialize the list of things to keep track of in ## the "current state" of the chain cur <- list(th=th.init, a=a.init, b=b.init, s2=s2.init, hyperpars=hyperpars, MH = list(th=MH.th, a=MH.a, b=MH.b), ACC= list(th=0,th.n=0, a=0,a.n=0, b=0,b.n=0 )) ## Define matrix to store MCMC results... chain.keep <- matrix( NA, ncol = M.keep, nrow = length(th.init) + length(a.init) + length(b.init) + length(s2.init)) rownames(chain.keep) <- c( paste('theta.abl', 1:length(th.init)), paste('a.disc', 1:length(a.init)), paste('b.diff', 1:length(b.init)), 'sig2.theta') # Burn-in phase: do not keep these results if( M.burnin > 0 ) { for(ii in 1:M.burnin ) { cur <- blocked.mcmc.update(U.data, cur) }} # Converged phase: Keep these results after thinning for (m in 1:M.keep) { # Skip the "thinned" pieces of the chain if( M.thin > 1 ) { for( ii in 1:(M.thin-1) ) { cur <- blocked.mcmc.update(U.data, cur) }} # Generate a "kept" update and save its results cur <- blocked.mcmc.update(U.data, cur) chain.keep[,m] <- c( cur$th, cur$a, cur$b, cur$s2 ) if ( m %% 100 == 0) { print(m) # Adaptive tuning would go here. } } if (verbose) { cat(paste("Average acceptance rates:", "n theta.abl:", round(cur$ACC$th / cur$ACC$th.n,3), "n a.disc: ", round(cur$ACC$a / cur$ACC$a.n,3), "n b.diff: ", round(cur$ACC$b / cur$ACC$b.n,3), "n")) } return( chain.keep ) }
Where the blocked.mcmc.update
is defined with respect to the complete conditional samplers
## Define a single iteration of the MH within Gibbs sampler blocked.mcmc.update <- function( U.data, cur){ cur <- sample.th(U.data, cur) cur <- sample.a(U.data, cur) cur <- sample.b(U.data, cur) cur <- sample.s2(U.data, cur) return(cur) }
which are, for now, defined as dummy samplers
## Define the dummy complete conditional sampler dummy.cc.sampler <- function(U.data, cur){ return(cur) } ## Set all complete conditional samplers to the dummy sampler sample.th <- dummy.cc.sampler sample.a <- dummy.cc.sampler sample.b <- dummy.cc.sampler sample.s2 <- dummy.cc.sampler
Please see the chapter for details on the setup.
A technical aside about cur
Users of programming languages like C or C++ may be confused as to why we chose to pass around the omnibus object cur
, instead of passing the output matrix by reference to the complete conditional samplers and having them update it directly.
For technical and historical reasons, R implements a hybrid between pass by reference and pass by value. In R, an object is passed by reference if the function does not change the object, and passed value if it does. Therefore we use a small, omnibus object (cur
) to be copied about as the chain moves through a single iteration of blocked.mcmc.update
. This is much faster than passing the whole output matrix.
We could improve speed by mimicking pass by reference behavior with a package like R.oo, but at a large sacrifice to code readability. Therefore we define the cur
object to contain only the current state of the chain and the minimal additional information to calculate updates and monitor their success.
Running the shell
In Post 1 we discussed specific choices for the hyper priors. We can encode them in R as follows:
## Define the hyper parameters to values from Post 1 hyperpars <- list( mu.a = 1.185, s2.a = 1.185, s2.b = 100, alpha.th = 1, beta.th = 1 )
In Post 2 we generated fake data. We can run the sampler on the fake data as follows:
## Load the necessary code from this and previous posts source("http://mcmcinirt.stat.cmu.edu/setup/post-3.R") ## Set the seed to keep results reproducible set.seed(314159) ## Run the sampler at the true, simulated values ## for 1000 iterations run.A <- run.chain.2pl( ## No burnin, keep 1000 iterations, do not thin M.burnin = 0, M.keep = 1000, M.thin = 1, ## Use the fake data simulated in Post #2 U.data = U, ## Pass in the hyperpriors from Post #1 hyperpars = hyperpars, ## Use the true values from Post #2 for the initial values th.init = theta.abl, a.init = a.disc, b.init = b.diff, s2.init = sig2.theta, ## These parameters are not yet defined. Pass in NULL. MH.th=NULL, MH.a=NULL, MH.b=NULL)
Exploring the chain
Since run.A
is a large matrix, printing it to the screen is not helpful. We can explore it as follows:
## run.A is a large matrix dim(run.A) ## [1] 2061 1000 ## We can select a single parameter, by selecting a ## row of the matrix. For example: run.A['theta.abl 1',] ## selects the row for the first person ability parameter ## However, since there are 1000 columns, even one ## row is too much to print to the screen. Here ## we select columns 1 through 5 run.A['theta.abl 1', 1:5] ## [1] -1.682354 -1.682354 -1.682354 -1.682354 -1.682354 ## ## Which is more manageable. Note, that as expected, they ## are all equal. ## We can also define a vector of names first.names <- c('theta.abl 1', 'a.disc 1', 'b.diff 1', 'sig2.theta') ## and use it to look at the output run.A[first.names, 1:5] ## theta.abl 1 -1.6823542 -1.6823542 -1.6823542 -1.6823542 -1.6823542 ## a.disc 1 0.7120615 0.7120615 0.7120615 0.7120615 0.7120615 ## b.diff 1 -3.0000000 -3.0000000 -3.0000000 -3.0000000 -3.0000000 ## sig2.theta 1.5625000 1.5625000 1.5625000 1.5625000 1.5625000 ## ## Note, that as expected each parameter stays constant.
Specifying a different vector for each set of outputs can be tedious. We define a function get.2pl.params
to build the selection vector:
## get.2pl.params takes ranges of parameters and appends the ## appropriate variable names. get.2pl.params <- function( theta.range, a.range, b.range, sig2.range) { ## Append the variable names to the ranges if( !is.null(theta.range) ) { theta.range <- paste('theta.abl', theta.range) } if( !is.null(a.range ) ) { a.range <- paste('a.disc' , a.range ) } if( !is.null(b.range ) ) { b.range <- paste('b.diff' , b.range ) } if( !is.null(sig2.range ) ) { sig2.range <- 'sig2.theta' } return( c(theta.range, a.range, b.range, sig2.range )) }
We test that the
get.2pl.params
function is working by comparing its output to the first.names
variable we constructed earlier.## N.B. get.2pl.params can give the same values as first.names all.equal( get.2pl.params( 1, 1, 1, 1 ), first.names) ## [1] TRUE ## So now we can look at the first five samples of the first ## three item discrimination parameters run.A[ get.2pl.params(NULL, 1:3, NULL, NULL), 1:5] ## [,1] [,2] [,3] [,4] [,5] ## a.disc 1 0.7120615 0.7120615 0.7120615 0.7120615 0.7120615 ## a.disc 2 1.1440699 1.1440699 1.1440699 1.1440699 1.1440699 ## a.disc 3 0.7326655 0.7326655 0.7326655 0.7326655 0.7326655
Visualizing sampler output
We can use coda to analyze and visualize the output of the sampler. Because we chose a matrix for the output, converting run.A
to coda
's format is trivial:
require(coda) ## If the above command fails, you must install the ## coda library. To do so, un-comment the install.packages ## line below, run it, and follow the directions it gives. ## ## install.packages('coda') ## N.B. t(...) takes the transpose of a matrix run.A.mcmc <- mcmc( t(run.A) )
A coda mcmc
object behaves like a matrix. We can use the get.2pl.params
function to extract out the variables we want. Note that for the mcmc
object, the variables are in the columns.
run.A.mcmc[1:5, get.2pl.params(1,1,1,1)] ## theta.abl 1 a.disc 1 b.diff 1 sig2.theta ## [1,] -1.682354 0.7120615 -3 1.5625 ## [2,] -1.682354 0.7120615 -3 1.5625 ## [3,] -1.682354 0.7120615 -3 1.5625 ## [4,] -1.682354 0.7120615 -3 1.5625 ## [5,] -1.682354 0.7120615 -3 1.5625
A coda mcmc
object has a default plotting method. Therefore it is trivial to produce this graph
with one line of R code
plot( run.A.mcmc[, get.2pl.params(1,1,1,1)], density=FALSE )
Run
help(plot.mcmc)
for the help page, which gives details on how to customize the graph.
Note that the graph is not particularly impressive because the sampler is not doing anything yet.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.