SAS PROC MCMC example 12 in R: Change point model
I restarted working my way through the PROC MCMC examples. The SAS manual describes this example: consider the data set from Bacon and Watts (1971), where y is the logarithm of the height of the stagnant surface layer and the covariate x is the logarithm of the flow rate of water.
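For reference, the model fitted by all three samplers (as implemented in the code at the bottom of the post) is a two-segment linear regression that is continuous at the change point cp, with the slope depending on which side of cp an observation falls:

y = alpha + beta1*(x - cp) + e   for x < cp
y = alpha + beta2*(x - cp) + e   for x >= cp,   with e ~ N(0, s2)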
It is a simple example. It posed no problems at all for Stan and JAGS. LaplacesDemon, on the other hand, gave me some trouble; it took quite some effort to obtain samples that seemed to be behaving. I did not try this in MCMCpack, but noted that its function MCMCregressChange() uses a slightly different model. The section below shows the results first; the code is given at the bottom.
Previous posts in the series of PROC MCMC examples programmed in R were: example 61.1: sampling from a known density, example 61.2: Box Cox transformation, example 61.5: Poisson regression, example 61.6: Non-Linear Poisson Regression, example 61.7: Logistic Regression Random-Effects Model, and example 61.8: Nonlinear Poisson Regression Multilevel Random-Effects Model.
Data
Data are read as below.

observed <-
'1.12 -1.39 1.12 -1.39 0.99 -1.08 1.03 -1.08
0.92 -0.94 0.90 -0.80 0.81 -0.63 0.83 -0.63
0.65 -0.25 0.67 -0.25 0.60 -0.12 0.59 -0.12
0.51 0.01 0.44 0.11 0.43 0.11 0.43 0.11
0.33 0.25 0.30 0.25 0.25 0.34 0.24 0.34
0.13 0.44 -0.01 0.59 -0.13 0.70 -0.14 0.70
-0.30 0.85 -0.33 0.85 -0.46 0.99 -0.43 0.99
-0.65 1.19'
observed <- scan(text=gsub('[[:space:]]+',' ',observed),
    what=list(y=double(),x=double()),
    sep=' ')
stagnant <- as.data.frame(observed)
LaplacesDemon
I have been playing around with LaplacesDemon. There is actually a function LaplacesDemon.hpc which can use multiple cores. However, on my computer it seems more efficient to just run LaplacesDemon via mclapply() from the parallel package and give the result the class demonoid.hpc, as LaplacesDemon.hpc would. Having said that, I again had quite some trouble getting LaplacesDemon to work well. In the end I used a combination of two calls to LaplacesDemon. The plot below shows samples of a selected parameter after the first run: not good enough yet, but I do like this way of displaying the results of chains. The corresponding plots for the other parameters looked similar, but showing them all gave too much output for this blog. After the second call the results looked acceptable.
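Condensed from the full code at the bottom of the post, the two-call pattern works like this: pool the four chains of the first run, then start the second run at the pooled posterior medians, using the pooled posterior covariance as proposal covariance.

cc1 <- Combine(Fit1, MyData)            # pool the four chains of run 1
Fit2 <- LaplacesDemon(Model, Data=MyData,
    Initial.Values = apply(cc1$Posterior1, 2, median),  # start at posterior medians
    Covar = var(cc1$Posterior1),        # proposal covariance taken from run 1
    Iterations = 1e5, Algorithm = 'RWM')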
Call:
LaplacesDemon(Model = Model, Data = MyData, Initial.Values = apply(cc1$Posterior1,
2, median), Covar = var(cc1$Posterior1), Iterations = 1e+05,
Algorithm = "RWM")
Acceptance Rate: 0.2408
Algorithm: Random-Walk Metropolis
Covariance Matrix: (NOT SHOWN HERE; diagonal shown instead)
alpha beta1 beta2 cp s2
4.920676e-04 2.199525e-04 3.753738e-04 8.680339e-04 6.122862e-08
Covariance (Diagonal) History: (NOT SHOWN HERE)
Deviance Information Criterion (DIC):
All Stationary
Dbar -144.660 -144.660
pD 7.174 7.174
DIC -137.486 -137.486
Initial Values:
alpha beta1 beta2 cp s2
0.5467048515 -0.4100040451 -1.0194586232 0.0166405998 0.0004800931
Iterations: 4e+05
Log(Marginal Likelihood): 56.92606
Minutes of run-time: 1.32
Model: (NOT SHOWN HERE)
Monitor: (NOT SHOWN HERE)
Parameters (Number of): 5
Posterior1: (NOT SHOWN HERE)
Posterior2: (NOT SHOWN HERE)
Recommended Burn-In of Thinned Samples: 0
Recommended Burn-In of Un-thinned Samples: 0
Recommended Thinning: 5
Specs: (NOT SHOWN HERE)
Status is displayed every 100 iterations
Summary1: (SHOWN BELOW)
Summary2: (SHOWN BELOW)
Thinned Samples: 40000
Thinning: 1
Summary of All Samples
                  Mean            SD          MCSE       ESS            LB         Median            UB
alpha     5.348239e-01  0.0244216567  2.999100e-04  11442.06  4.895229e-01     0.53339024  5.842152e-01
beta1    -4.196180e-01  0.0142422533  1.661658e-04  12726.60 -4.469688e-01    -0.41996859 -3.903572e-01
beta2    -1.013882e+00  0.0164892337  1.681833e-04  15191.59 -1.046349e+00    -1.01387256 -9.815650e-01
cp        2.855852e-02  0.0308177765  3.649332e-04  11945.66 -3.406306e-02     0.03110570  8.398674e-02
s2        4.472644e-04  0.0001429674  1.383748e-06  16571.94  2.474185e-04     0.00042101  8.006666e-04
Deviance -1.446602e+02  3.7879060637  4.940488e-02  10134.91 -1.496950e+02  -145.46896682 -1.352162e+02
LP        4.636511e+01  1.8939530321  2.470244e-02  10134.91  4.164313e+01    46.76949458  4.888251e+01
Summary of Stationary Samples
                  Mean            SD          MCSE       ESS            LB         Median            UB
alpha     5.348239e-01  0.0244216567  2.999100e-04  11442.06  4.895229e-01     0.53339024  5.842152e-01
beta1    -4.196180e-01  0.0142422533  1.661658e-04  12726.60 -4.469688e-01    -0.41996859 -3.903572e-01
beta2    -1.013882e+00  0.0164892337  1.681833e-04  15191.59 -1.046349e+00    -1.01387256 -9.815650e-01
cp        2.855852e-02  0.0308177765  3.649332e-04  11945.66 -3.406306e-02     0.03110570  8.398674e-02
s2        4.472644e-04  0.0001429674  1.383748e-06  16571.94  2.474185e-04     0.00042101  8.006666e-04
Deviance -1.446602e+02  3.7879060637  4.940488e-02  10134.91 -1.496950e+02  -145.46896682 -1.352162e+02
LP        4.636511e+01  1.8939530321  2.470244e-02  10134.91  4.164313e+01    46.76949458  4.888251e+01
STAN
Stan did not give many problems for this analysis.

Inference for Stan model: smodel.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
Beta[1] -0.42 0.00 0.01 -0.45 -0.43 -0.42 -0.41 -0.39 1017 1.00
Beta[2] -1.01 0.00 0.02 -1.05 -1.02 -1.01 -1.00 -0.98 1032 1.00
Alpha 0.54 0.00 0.03 0.49 0.52 0.53 0.55 0.59 680 1.00
s2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1361 1.00
cp 0.03 0.00 0.03 -0.04 0.00 0.03 0.05 0.09 636 1.00
lp__ 90.63 0.06 1.78 86.17 89.71 91.00 91.91 93.03 935 1.01
Samples were drawn using NUTS(diag_e) at Fri Jun 19 21:17:54 2015.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
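The fit object can be inspected further; for instance, a quick look at the change point posterior (fstan is created in the code at the bottom of the post):

cp_draws <- rstan::extract(fstan, pars='cp')$cp
quantile(cp_draws, c(0.025, 0.5, 0.975))  # should reproduce the cp row above
hist(cp_draws, breaks=40)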
JAGS
Again no problems for JAGS.

Inference for Bugs model at "/tmp/Rtmpy4a6C5/modeld4f6e9c9055.txt", fit using jags,
4 chains, each with 10000 iterations (first 5000 discarded), n.thin = 5
n.sims = 4000 iterations saved
          mu.vect sd.vect     2.5%      25%      50%      75%    97.5%  Rhat n.eff
alpha       0.534   0.027    0.479    0.518    0.533    0.552    0.586 1.040   160
beta[1]    -0.420   0.015   -0.450   -0.429   -0.420   -0.410   -0.389 1.013   380
beta[2]    -1.014   0.017   -1.049   -1.024   -1.014   -1.003   -0.980 1.023   290
cp          0.029   0.035   -0.038    0.006    0.032    0.051    0.100 1.037   160
s2          0.000   0.000    0.000    0.000    0.000    0.001    0.001 1.004   710
deviance -144.501   3.986 -149.634 -147.378 -145.432 -142.584 -134.378 1.021   290
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = var(deviance)/2)
pD = 7.9 and DIC = -136.6
DIC is an estimate of expected predictive error (lower deviance is better).
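The rule can be checked against the table above: the sd of the deviance is 3.986, so pD = 3.986^2/2 ≈ 7.9, and DIC = mean deviance + pD ≈ -144.5 + 7.9 ≈ -136.6, matching the reported values.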
CODE used
# http://support.sas.com/documentation/cdl/en/statug/67523/HTML/default/viewer.htm#statug_mcmc_examples18.htm
# Example 61.12 Change Point Models
##############
#Data
##############
observed <-
'1.12 -1.39 1.12 -1.39 0.99 -1.08 1.03 -1.08
0.92 -0.94 0.90 -0.80 0.81 -0.63 0.83 -0.63
0.65 -0.25 0.67 -0.25 0.60 -0.12 0.59 -0.12
0.51 0.01 0.44 0.11 0.43 0.11 0.43 0.11
0.33 0.25 0.30 0.25 0.25 0.34 0.24 0.34
0.13 0.44 -0.01 0.59 -0.13 0.70 -0.14 0.70
-0.30 0.85 -0.33 0.85 -0.46 0.99 -0.43 0.99
-0.65 1.19'
observed <- scan(text=gsub('[[:space:]]+',' ',observed),
    what=list(y=double(),x=double()),
    sep=' ')
stagnant <- as.data.frame(observed)
#plot(y~x,data=stagnant)
##############
#LaplacesDemon
##############
library('LaplacesDemon')
library(parallel)
mon.names <- "LP"
parm.names <- c('alpha',paste('beta',1:2,sep=''),'cp','s2')
PGF <- function(Data) {
  # parameter-generating function: random initial values for the sampler
  x <- c(rnorm(5,0,1))        # alpha, beta1, beta2 from standard normals
  x[4] <- runif(1,-1.3,1.1)   # cp drawn within its prior range
  x[5] <- runif(1,0,2)        # s2 must be positive
  x
}
MyData <- list(mon.names=mon.names,
parm.names=parm.names,
PGF=PGF,
x=stagnant$x,
y=stagnant$y)
#N<-1
Model <- function(parm, Data)
{
  alpha <- parm[1]
  beta <- parm[2:3]
  cp <- parm[4]
  s2 <- parm[5]
  # the slope switches from beta[1] to beta[2] where x crosses the change point
  yhat <- alpha+(Data$x-cp)*beta[1+(Data$x>=cp)]
  LL <- sum(dnorm(Data$y,yhat,sd=sqrt(s2),log=TRUE))
  prior <- sum(dnorm(parm[1:3],0,1e3,log=TRUE))+
      dunif(cp,-1.3,1.1,log=TRUE)+
      dunif(s2,0,5,log=TRUE)
  LP <- LL+prior
  Modelout <- list(LP=LP, Dev=-2*LL, Monitor=LP,
      yhat=yhat,
      parm=parm)
  return(Modelout)
}
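# quick sanity check: Model() can be called directly and should return a
# finite log-posterior at the initial values used for Fit1 below
Model(c(.5,-.4,-1,.05,.001), MyData)$LP  # finite, so the sampler has a valid start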
Fit1 <- mclapply(1:4,function(i)
    LaplacesDemon(Model,
        Data=MyData,
        Iterations=100000,
        Algorithm='RWM',
        Covar=c(rep(.01,4),.00001),
        Initial.Values = c(.5,-.4,-1,.05,.001))
)
class(Fit1) <- 'demonoid.hpc'
plot(Fit1,Data=MyData,Parms=c('alpha'))
cc1 <- Combine(Fit1,MyData)
#
Fit2 <- mclapply(1:4,function(i)
    LaplacesDemon(Model,
        Data=MyData,
        Iterations=100000,
        Algorithm='RWM',
        Covar=var(cc1$Posterior1),
        Initial.Values = apply(cc1$Posterior1,2,median))
)
class(Fit2) <- 'demonoid.hpc'
#plot(Fit2,Data=MyData,Parms=c('alpha'))
cc2 <- Combine(Fit2,MyData)
cc2
##############
#STAN
##############
stanData <- list(
N=nrow(stagnant),
x=stagnant$x,
y=stagnant$y)
library(rstan)
smodel <- '
data {
  int<lower=1> N;
  vector[N] x;
  vector[N] y;
}
parameters {
  real Beta[2];
  real Alpha;
  real<lower=0,upper=1e3> s2;
  real<lower=-1.3,upper=1.1> cp;
}
transformed parameters {
  vector[N] yhat;
  for (i in 1:N)
    yhat[i] <- Alpha+(x[i]-cp)*Beta[1+(x[i]>cp)];
}
model {
  y ~ normal(yhat,sqrt(s2));
  s2 ~ uniform(0,1e3);
  cp ~ uniform(-1.3,1.1);
  Alpha ~ normal(0,1000);
  Beta ~ normal(0,1000);
}
'
fstan <- stan(model_code = smodel,
data = stanData,
pars=c('Beta','Alpha','s2','cp'))
fstan
##############
#Jags
##############
library(R2jags)
jagsdata <- list(
N=nrow(stagnant),
x=stagnant$x,
y=stagnant$y)
jagsm <- function() {
for(i in 1:N) {
yhat[i] <- alpha+(x[i]-cp)*beta[1+(x[i]>cp)]
y[i] ~ dnorm(yhat[i],tau)
}
tau <- 1/s2   # dnorm in JAGS is parameterized by precision, not variance
s2 ~ dunif(0,1e3)
cp ~ dunif(-1.3,1.1)
alpha ~ dnorm(0,0.001)
beta[1] ~ dnorm(0,0.001)
beta[2] ~ dnorm(0,0.001)
}
params <- c('alpha','beta','s2','cp')
myj <- jags(model=jagsm,
    data = jagsdata,
    n.chains=4,
    n.iter=10000,
    parameters=params)
myj