[This article was first published on Wiekvoet, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In my previous post I ignored the fact that some data are below the detection level. I would not know how to handle those in a mixed model from lme4 or nlme. However, JAGS can handle these values. Next to that I kept the usual independent variables, such as random effects for location, machine etc.Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Data
Data has been acquired and processed similar to last week. For script at the bottom of the page.
detection level
In the spreadsheet a ‘4’ is used to indicate a value below detection limit. (* een ” 4 “-teken in kolom C geeft aan dat de analyseresultaten beneden de detectielimiet liggen). There is something strange there. The detection limit seems to be 0.4, but there is data under 0.4 which is not flagged. In addition, some of the data is below 0, which seems improbable. In the end I flagged all data under 0.4 as NA.
Fe <- rf2[rf2$Compound==levels(rf2$Compound)[9],]
library(lattice)
densityplot( ~ Amount | machine + Status,
groups=Location,data=Fe[!is.na(Fe$Status)
& Fe$Amount<.8,],
panel=function(…) {
panel.densityplot(…)
panel.abline(v=c(0,0.4))
},adjust=.5
)
Analysis
It is a reasonable standard JAGS model. The only tricky thing is initialization. When the parameters are too far away an error is thrown. Hence ‘intercept’ is initialized as -1. In text the model and inits are given, at the end the full code.
FeModel <- function() {
for (i in 1:nobs) {
BLZ[i] ~ dinterval( LAmount[i] , -0.4034029 )
ExpLAmount[i] <- intercept +
rate*time[i] +
rateL[Location[i]]*time[i]+
FMach[Machine[i]] +
FLoc[Location[i]]
LAmount[i] ~ dnorm(ExpLAmount[i],tau[Machine[i]])
}
for (i in 1:2) {
tau[i] <- pow(sd[i],-2)
sd[i] ~ dunif(0,1000)
}
intercept ~ dnorm(0,.0001)
FMach[1] ~dnorm(0,tauMachShift)
FMach[2] <- -FMach[1]
tauMachShift <- pow(sdMachShift,-2)
sdMachShift ~ dunif(0,10)
for (iL in 1:nloc) {
FLoc[iL] ~ dnorm(0,tauLoc)
rateL[iL] ~dnorm(0,tauRate)
yearlyL[iL] <- pow(pow(10,rateL[iL]+rate),365)
}
tauLoc <- pow(sdLoc,-2)
sdLoc ~ dunif(0,100)
tauRate <- pow(sdRate,-2)
sdRate ~ dunif(0,100)
rate ~ dnorm(0,.0001)
dailydec <- pow(10,rate)
yearlydec <- pow(dailydec,365)
}
inits <- function() {
list(intercept = -1,#rnorm(1,.45,.04),
rate=rnorm(0,.01),
sd=runif(2,0,50),
sdMachShift=runif(1,0,1),
sdLoc=runif(1,0,1))
}
Results
The model shows the decrease of 5% per year which was expected.
Machine 1 (Eigenbrodt NSA 181 KHT) is slightly less accurate (was also observed last week) and has slightly lower values than 2 (Van Essen/ECN vanger). Variation between locations is less than variation between machines.
Inference for Bugs model at “C:/…/AppData/Local/Temp/Rtmpma5UEx/modelc40761211ba.txt”, fit using jags,
4 chains, each with 5000 iterations (first 2500 discarded), n.thin = 2
n.sims = 5000 iterations saved
mu.vect sd.vect 2.5% 25% 50% 75% 97.5%
FLoc[1] 0.023 0.070 -0.110 -0.022 0.021 0.065 0.169
FLoc[2] 0.139 0.081 -0.038 0.091 0.147 0.195 0.280
FLoc[3] -0.041 0.069 -0.180 -0.083 -0.041 0.002 0.098
FLoc[4] -0.103 0.080 -0.289 -0.147 -0.094 -0.048 0.032
FLoc[5] -0.072 0.072 -0.209 -0.117 -0.075 -0.027 0.078
FLoc[6] -0.036 0.070 -0.173 -0.079 -0.036 0.008 0.102
FLoc[7] -0.016 0.069 -0.154 -0.058 -0.016 0.027 0.117
FLoc[8] 0.278 0.079 0.127 0.229 0.274 0.323 0.453
FLoc[9] -0.061 0.089 -0.243 -0.118 -0.060 0.000 0.109
FLoc[10] 0.052 0.069 -0.091 0.009 0.052 0.095 0.186
FLoc[11] 0.040 0.079 -0.131 -0.008 0.048 0.096 0.172
FLoc[12] -0.102 0.077 -0.245 -0.153 -0.106 -0.054 0.064
FLoc[13] -0.054 0.082 -0.210 -0.107 -0.056 -0.005 0.117
FLoc[14] 0.027 0.067 -0.112 -0.013 0.028 0.070 0.155
FLoc[15] -0.031 0.070 -0.164 -0.076 -0.033 0.011 0.116
FLoc[16] 0.088 0.070 -0.050 0.045 0.087 0.131 0.227
FLoc[17] -0.154 0.076 -0.293 -0.203 -0.157 -0.106 0.006
FMach[1] -0.053 0.017 -0.087 -0.065 -0.053 -0.041 -0.018
FMach[2] 0.053 0.017 0.018 0.041 0.053 0.065 0.087
intercept 0.401 0.081 0.240 0.345 0.402 0.455 0.559
sd[1] 0.436 0.018 0.402 0.423 0.436 0.448 0.471
sd[2] 0.386 0.007 0.372 0.381 0.386 0.391 0.401
sdLoc 0.135 0.036 0.077 0.110 0.130 0.154 0.216
sdMachShift 1.854 2.456 0.033 0.170 0.675 2.587 8.820
yearlyL[1] 0.944 0.007 0.931 0.940 0.944 0.948 0.957
yearlyL[2] 0.950 0.007 0.938 0.945 0.949 0.954 0.965
yearlyL[3] 0.945 0.006 0.932 0.941 0.945 0.949 0.958
yearlyL[4] 0.949 0.007 0.936 0.944 0.948 0.953 0.963
yearlyL[5] 0.943 0.007 0.929 0.939 0.944 0.948 0.956
yearlyL[6] 0.945 0.006 0.932 0.941 0.945 0.949 0.957
yearlyL[7] 0.946 0.006 0.933 0.941 0.946 0.950 0.958
yearlyL[8] 0.945 0.006 0.932 0.941 0.945 0.949 0.957
yearlyL[9] 0.944 0.007 0.929 0.940 0.945 0.949 0.958
yearlyL[10] 0.946 0.006 0.933 0.942 0.946 0.950 0.958
yearlyL[11] 0.950 0.006 0.938 0.946 0.950 0.954 0.964
yearlyL[12] 0.942 0.007 0.928 0.938 0.943 0.947 0.955
yearlyL[13] 0.944 0.007 0.930 0.939 0.944 0.948 0.956
yearlyL[14] 0.947 0.006 0.935 0.943 0.947 0.951 0.959
yearlyL[15] 0.944 0.006 0.931 0.940 0.944 0.948 0.956
yearlyL[16] 0.946 0.006 0.934 0.942 0.946 0.950 0.958
yearlyL[17] 0.942 0.007 0.926 0.938 0.943 0.947 0.956
yearlydec 0.945 0.005 0.936 0.942 0.945 0.949 0.955
deviance 1729.603 52.705 1627.020 1693.551 1728.751 1765.143 1831.422
Rhat n.eff
FLoc[1] 1.002 2200
FLoc[2] 1.001 4000
FLoc[3] 1.001 5000
FLoc[4] 1.001 5000
FLoc[5] 1.001 5000
FLoc[6] 1.002 1500
FLoc[7] 1.001 5000
FLoc[8] 1.001 4400
FLoc[9] 1.002 1600
FLoc[10] 1.001 5000
FLoc[11] 1.001 2800
FLoc[12] 1.002 2200
FLoc[13] 1.001 5000
FLoc[14] 1.001 5000
FLoc[15] 1.001 2800
FLoc[16] 1.001 4800
FLoc[17] 1.004 740
FMach[1] 1.004 760
FMach[2] 1.004 760
intercept 1.001 3100
sd[1] 1.007 380
sd[2] 1.002 1500
sdLoc 1.003 1600
sdMachShift 1.001 5000
yearlyL[1] 1.001 3000
yearlyL[2] 1.001 4100
yearlyL[3] 1.002 1600
yearlyL[4] 1.001 2800
yearlyL[5] 1.002 1400
yearlyL[6] 1.002 2300
yearlyL[7] 1.001 5000
yearlyL[8] 1.001 5000
yearlyL[9] 1.001 4000
yearlyL[10] 1.001 3500
yearlyL[11] 1.001 5000
yearlyL[12] 1.002 1400
yearlyL[13] 1.001 5000
yearlyL[14] 1.002 1400
yearlyL[15] 1.001 3100
yearlyL[16] 1.002 1500
yearlyL[17] 1.004 850
yearlydec 1.002 1800
deviance 1.003 900
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = var(deviance)/2)
pD = 1385.1 and DIC = 3114.7
DIC is an estimate of expected predictive error (lower deviance is better).
Code
library(R2jags)
# http://www.lml.rivm.nl/data/gevalideerd/data/lmre_1992-2005.xls
# https://www.r-bloggers.com/read-excel-files-from-r/
library(XLConnect)
readsheet <- function(wb,cursheet) {
df = readWorksheet(wb, sheet = cursheet , header = TRUE)
topline <- 8
test <- which(df[6,]==’C’)+1
numin <- df[topline:nrow(df),test]
names(numin) <- make.names( gsub(‘ ‘,”,df[6,test]))
for(i in 1:ncol(numin)) numin[,i] <- as.numeric(gsub(‘,’,’.’,numin[,i]))
status = df[topline:nrow(df),test-1]
names(status) <- paste(‘C’,make.names( gsub(‘ ‘,”,df[6,test])),sep=”)
numin$StartDate <- as.Date(substr(df[topline:nrow(df),1],1,10))
numin$EndDate <- as.Date(substr(df[topline:nrow(df),2],1,10))
numin <- cbind(numin,status)
tall <- reshape(numin,
varying=list(make.names( gsub(‘ ‘,”,df[6,test])),paste(‘C’,make.names( gsub(‘ ‘,”,df[6,test])),sep=”)),
v.names=c(‘Amount’,’Status’),
timevar=’Compound’,
idvar=c(‘StartDate’,’EndDate’),
times=paste(df[6,test],'(‘,df[5,test],’)’,sep=”),
direction=’long’)
tall$Compound <- factor(gsub(‘ )’,’)’,gsub(‘ +’,’ ‘,tall$Compound)))
row.names(tall)<- NULL
tall$Location <- cursheet
tall
}
readfile <- function(fname) {
wb <- loadWorkbook(fname)
print(wb)
sheets <- getSheets(wb)[-1]
tt <- lapply(sheets,readsheet,wb=wb)
tt2 <- do.call(rbind,tt)
tt2$Location <- factor(tt2$Location)
tt2$fname <- fname
tt2
}
fnames <- dir(pattern=’*.xls’) #.\\. ?
rf <- lapply(fnames,readfile)
rf2 <- do.call(rbind,rf)
rf2$machine <- factor(ifelse(rf2$fname==”lmre_1992-2005.xls”,
‘Van Essen/ECN vanger’,
‘ Eigenbrodt NSA 181 KHT’))
Fe <- rf2[rf2$Compound==levels(rf2$Compound)[9],]
library(lattice)
densityplot( ~ Amount | machine + Status,
groups=Location,data=Fe[!is.na(Fe$Status)
& Fe$Amount<.8,],
panel=function(…) {
panel.densityplot(…)
panel.abline(v=c(0,0.4))
},adjust=.5
)
Fe$LAmount <- log10(Fe$Amount)
Fe$LAmount[Fe$Amount<0.4] <- NA
Fe2 <- Fe[!is.na(Fe$Status),]
datain <- list(
LAmount = Fe2$LAmount,
BLZ = !is.na(Fe2$LAmount),
Machine=(Fe2$machine==’Van Essen/ECN vanger’)+1,
Location=(1:nlevels(Fe2$Location))[Fe2$Location],
time=as.numeric(Fe2$StartDate),
nobs =nrow(Fe2),
nloc=nlevels(Fe2$Location)
)
FeModel <- function() {
for (i in 1:nobs) {
BLZ[i] ~ dinterval( LAmount[i] , -0.4034029 )
ExpLAmount[i] <- intercept +
rate*time[i] +
rateL[Location[i]]*time[i]+
FMach[Machine[i]] +
FLoc[Location[i]]
LAmount[i] ~ dnorm(ExpLAmount[i],tau[Machine[i]])
}
for (i in 1:2) {
tau[i] <- pow(sd[i],-2)
sd[i] ~ dunif(0,1000)
}
intercept ~ dnorm(0,.0001)
FMach[1] ~dnorm(0,tauMachShift)
FMach[2] <- -FMach[1]
tauMachShift <- pow(sdMachShift,-2)
sdMachShift ~ dunif(0,10)
for (iL in 1:nloc) {
FLoc[iL] ~ dnorm(0,tauLoc)
rateL[iL] ~dnorm(0,tauRate)
yearlyL[iL] <- pow(pow(10,rateL[iL]+rate),365)
}
tauLoc <- pow(sdLoc,-2)
sdLoc ~ dunif(0,100)
tauRate <- pow(sdRate,-2)
sdRate ~ dunif(0,100)
rate ~ dnorm(0,.0001)
dailydec <- pow(10,rate)
yearlydec <- pow(dailydec,365)
}
parameters <- c(‘intercept’,’yearlydec’,’sd’,
‘sdMachShift’,’FMach’,
‘sdLoc’,’FLoc’,
‘yearlyL’ )
inits <- function() {
list(intercept = -1,#rnorm(1,.45,.04),
rate=rnorm(0,.01),
sd=runif(2,0,50),
sdMachShift=runif(1,0,1),
sdLoc=runif(1,0,1))
}
# estimate
jj <- jags(FeModel, data=datain,
parameters=parameters, init=inits,n.iter=5000,
progress.bar=”gui”,n.chains=4)
jj
To leave a comment for the author, please follow the link and comment on their blog: Wiekvoet.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.