[This article was first published on Wiekvoet, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Two weeks ago I showed rain data from six stations in Netherlands years 1906 till now. Last week I showed that frequency of days with and without rain differed between December 1906-1915 and December 2003-2012. This week I am considering the same data as t-distributed with a left truncation at 0 mm rain. This can be modeled very easy in JAGS. This model gives the posterior distribution of the amount of rain moved 1 mm of rain upwards and half a mm narrower, but both these intervals include 0.Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The data
As described before; the data are from KNMI; ROYAL NETHERLANDS METEOROLOGICAL INSTITUTE. The actual page for the data is http://www.knmi.nl/klimatologie/monv/reeksen/select_rr.html. The data selected were from 6 locations; De Bilt, Groningen, Heerde, Kerkwerve, Ter Apel and Winterswijk (Sibinkweg). The blog starts with data entered in a data.frame named all which was derived two weeks ago. The data selected is again every 5th day in January 1906-195 against January 2004-2013.
sel1 <- all[(all$year <1916 | all$year>2003)
& all$Month==’Jan’
& all$location==’DE-BILT’
& all$day %in% seq(1,31,by=5),]
sel1$decade <- factor(c(‘1906-1915′,’2004-2013’)[1+(sel1$year>1950)])
datain <- list(
isCensored = 1-as.numeric(sel1$RD==0),
N = nrow(sel1),
RD = ifelse(sel1$RD==0,NA,sel1$RD/10),
period = 1+(sel1$year>1950)
)
model1 <- function() {
for ( i in 1:N ) {
isCensored[i] ~ dinterval( RD[i] , 0 )
RD[i] ~ dt( mu[period[i]] , tau[period[i]],nu )
}
for (i in 1:2) {
tau[i] <- 1/pow(sigma[i],2)
sigma[i] ~ dlnorm(sigmamu,sigmatau)
mu[i] ~ dnorm(mumu,mutau)
}
mumu ~ dnorm(0,1e-6)
mutau <- 1/pow(musigma,2)
musigma ~ dunif(0,100)
sigmamu ~ dnorm(0,1e-6)
sigmatau <- 1/pow(sigmasigma,2)
sigmasigma ~ dunif(1,100)
mudif <- mu[2]-mu[1]
sigmadif <- sigma[2]-sigma[1]
nu <- 1/nuinv
nuinv ~ dunif(0,.5)
}
params <- c(‘mu’,’sigma’,’mudif’,’sigmadif’,’nu’)
inits <- function() {
list(mumu = rnorm(1,0),
musigma = runif(1,1,2),
sigmamu = rnorm(1,0),
sigmasigma = runif(1,1,2),
nuinv = runif(1,0,.5))
}
jagsfit <- jags(datain, model=model1, inits=inits,
parameters=params,progress.bar=”gui”,
n.chains=1,n.iter=10000,n.burnin=3000,n.thin=3)
jagsfit
The model
The model is almost off the shelf. Truncation, two t-distributions with different means and standard deviations. Priors there taken from Data Analysis Using Regression and Multilevel/Hierarchical Models, Gelman and Hill.sel1 <- all[(all$year <1916 | all$year>2003)
& all$Month==’Jan’
& all$location==’DE-BILT’
& all$day %in% seq(1,31,by=5),]
sel1$decade <- factor(c(‘1906-1915′,’2004-2013’)[1+(sel1$year>1950)])
datain <- list(
isCensored = 1-as.numeric(sel1$RD==0),
N = nrow(sel1),
RD = ifelse(sel1$RD==0,NA,sel1$RD/10),
period = 1+(sel1$year>1950)
)
model1 <- function() {
for ( i in 1:N ) {
isCensored[i] ~ dinterval( RD[i] , 0 )
RD[i] ~ dt( mu[period[i]] , tau[period[i]],nu )
}
for (i in 1:2) {
tau[i] <- 1/pow(sigma[i],2)
sigma[i] ~ dlnorm(sigmamu,sigmatau)
mu[i] ~ dnorm(mumu,mutau)
}
mumu ~ dnorm(0,1e-6)
mutau <- 1/pow(musigma,2)
musigma ~ dunif(0,100)
sigmamu ~ dnorm(0,1e-6)
sigmatau <- 1/pow(sigmasigma,2)
sigmasigma ~ dunif(1,100)
mudif <- mu[2]-mu[1]
sigmadif <- sigma[2]-sigma[1]
nu <- 1/nuinv
nuinv ~ dunif(0,.5)
}
params <- c(‘mu’,’sigma’,’mudif’,’sigmadif’,’nu’)
inits <- function() {
list(mumu = rnorm(1,0),
musigma = runif(1,1,2),
sigmamu = rnorm(1,0),
sigmasigma = runif(1,1,2),
nuinv = runif(1,0,.5))
}
jagsfit <- jags(datain, model=model1, inits=inits,
parameters=params,progress.bar=”gui”,
n.chains=1,n.iter=10000,n.burnin=3000,n.thin=3)
jagsfit
Inference for Bugs model at “C:/Users/…/Local/Temp/Rtmp4ydV3G/modele0c40226a9c.txt”, fit using jags,
1 chains, each with 10000 iterations (first 3000 discarded), n.thin = 3
n.sims = 2334 iterations saved
mu.vect sd.vect 2.5% 25% 50% 75% 97.5%
mu[1] 0.026 0.480 -0.925 -0.284 0.042 0.332 0.951
mu[2] 0.917 0.361 0.250 0.665 0.900 1.152 1.658
mudif 0.891 0.615 -0.231 0.449 0.880 1.315 2.109
nu 2.331 0.348 2.006 2.085 2.212 2.465 3.290
sigma[1] 2.980 0.597 2.018 2.555 2.913 3.327 4.361
sigma[2] 2.309 0.395 1.642 2.024 2.274 2.552 3.197
sigmadif -0.671 0.689 -2.108 -1.098 -0.622 -0.214 0.588
deviance 509.412 9.431 492.173 502.886 508.990 515.454 528.740
DIC info (using the rule, pD = var(deviance)/2)
pD = 44.5 and DIC = 553.9
DIC is an estimate of expected predictive error (lower deviance is better).
Interpretation
The means shown in the first decade is very close to 0, with an increase to 1 mm in the last decade. The difference is just 0.9 mm a very small increase, with 0 just within the posterior 95% interval.
The standard deviation decreases from first to last decade. It is almost like the rain becomes more predictable. Again the 95% posterior interval of the difference does include 0.
The degrees of freedom of the t distribution is two to three, quite heavy tailed. Hence where almost half of a the days don’t have rain, half of the days have a little rain (a few mm, certainly not more than 6 mm), a few dais have more rain.
To leave a comment for the author, please follow the link and comment on their blog: Wiekvoet.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.