Mixed models; Random Coefficients, part 1
[This article was first published on Wiekvoet, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Continuing with my exploration of mixed models I am now at the first part of random coefficients: example 59.5 for proc mixed (page 5034 of the SAS/STAT 12.3 Manual). This means I skipped examples 59.3 (plotting the likelihood) and 59.4 (known G and R). The latter I might want to do later, though I find this to be quite a strong prior.Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Data
To quote the SAS/STAT manual ‘The observed responses are replicate assay results, expressed in percent of label claim, at various shelf ages, expressed in months. The desired mixed model involves three batches of product that differ randomly in intercept (initial potency) and slope (degradation rate).‘
rc1 <- read.table(textConnection('
1 0 101.2 103.3 103.3 102.1 104.4 102.4
1 1 98.8 99.4 99.7 99.5 . .
1 3 98.4 99.0 97.3 99.8 . .
1 6 101.5 100.2 101.7 102.7 . .
1 9 96.3 97.2 97.2 96.3 . .
1 12 97.3 97.9 96.8 97.7 97.7 96.7
2 0 102.6 102.7 102.4 102.1 102.9 102.6
2 1 99.1 99.0 99.9 100.6 . .
2 3 105.7 103.3 103.4 104.0 . .
2 6 101.3 101.5 100.9 101.4 . .
2 9 94.1 96.5 97.2 95.6 . .
2 12 93.1 92.8 95.4 92.2 92.2 93.0
3 0 105.1 103.9 106.1 104.1 103.7 104.6
3 1 102.2 102.0 100.8 99.8 . .
3 3 101.2 101.8 100.8 102.6 . .
3 6 101.1 102.0 100.1 100.2 . .
3 9 100.9 99.5 102.2 100.8 . .
3 12 97.8 98.3 96.9 98.4 96.9 96.5
‘),
na.strings=’.’,
col.names=c(‘Batch’,’Month’,paste(‘Y’,1:6,sep=”)))
rc2 <- reshape(rc1,
direction=’long’,
varying=list(Y=paste(‘Y’,1:6,sep=”)),
v.names=’Y’,
timevar=’i’,
times=1:6)
rc2$Batch <- factor(rc2$Batch)
rc2 <- rc2[complete.cases(rc2),]
Analysis
In this post the first model will be attempted, for nlme, lme4 and MCMCglmm. MCMCglmm was clearly the most difficult to formulate, especially with regards to the prior. It appeared the models have different results for the significance of the fixed month effect. Model quote ‘The two random effects are Int and Month, modeling random intercepts and slopes, respectively. Note that Intercept and Month are used as both fixed and random effects.‘
lme4
The basic call is not difficult at all. The summary also shows residual variance and variances for intercept and month (within batch).
library(lme4)
lmer1 <- lmer(Y ~ Month + (Month|Batch),data=rc2)
summary(lmer1)
Linear mixed model fit by REML
Formula: Y ~ Month + (Month | Batch)
Data: rc2
AIC BIC logLik deviance REMLdev
362.3 376.9 -175.2 348.4 350.3
Random effects:
Groups Name Variance Std.Dev. Corr
Batch (Intercept) 0.976805 0.98833
Month 0.037166 0.19278 -0.548
Residual 3.293234 1.81473
Number of obs: 84, groups: Batch, 3
Fixed effects:
Estimate Std. Error t value
(Intercept) 102.7038 0.6456 159.08
Month -0.5259 0.1194 -4.41
Correlation of Fixed Effects:
(Intr)
Month -0.580
VarCorr(lmer1)
$Batch
(Intercept) Month
(Intercept) 0.9768046 -0.10448120
Month -0.1044812 0.03716553
attr(,”stddev”)
(Intercept) Month
0.9883343 0.1927836
attr(,”correlation”)
(Intercept) Month
(Intercept) 1.0000000 -0.5483579
Month -0.5483579 1.0000000
attr(,”sc”)
[1] 1.814727
lmer1b <- lmer(Y ~ 1 + (Month|Batch),data=rc2)
anova(lmer1,lmer1b)
Data: rc2
Models:
lmer1b: Y ~ 1 + (Month | Batch)
lmer1: Y ~ Month + (Month | Batch)
Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
lmer1b 5 365.36 377.51 -177.68
lmer1 6 360.44 375.03 -174.22 6.914 1 0.008552 **
—
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
pboot <- function(m0,m1) {
s <- simulate(m0)
L0 <- logLik(refit(m0,s))
L1 <- logLik(refit(m1,s))
2*(L1-L0)
}
obsdev <- 2*( logLik(lmer1)-logLik(lmer1b))
#
set.seed(1001)
reps <- replicate(1000,pboot(lmer1b,lmer1))
mean(reps>obsdev)
[1] 0.055
Random effects are same:ranef(lmer1)
$Batch
(Intercept) Month
1 -1.0010539 0.12865763
2 0.3934426 -0.20598652
3 0.6076112 0.07732889
nlme
library(nlme)
lme1 <- lme(Y ~ Month,
random= ~Month|Batch,
data=rc2)
summary(lme1)
Linear mixed-effects model fit by REML
Data: rc2
AIC BIC logLik
362.3281 376.7685 -175.1641
Random effects:
Formula: ~Month | Batch
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 0.9883331 (Intr)
Month 0.1927833 -0.548
Residual 1.8147270
Fixed effects: Y ~ Month
Value Std.Error DF t-value p-value
(Intercept) 102.70380 0.6456110 80 159.08001 0
Month -0.52594 0.1193732 80 -4.40589 0
Correlation:
(Intr)
Month -0.58
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-1.85442335 -0.68272916 -0.09994707 0.62635493 2.64425526
Number of Observations: 84
Number of Groups: 3
#variances
c(.9883331,.1927844,1.814727)^2
[1] 0.97680232 0.03716582 3.29323408
#covariance
.9883331*.1927844*-.548
[1] -0.1044133
Fixed effects are same as in PROC MIXED, but degrees of freedom are off, making the month effect significant. Random effects are exactly the same
ranef(lme1)
(Intercept) Month
1 -1.0010483 0.12868117
2 0.3934057 -0.20599206
3 0.6076426 0.07731089
MCMCglmm
MCMCglmm has me baffled for this one:. After reviewing the manual the obvious model was not so difficult to formulate. However,a prior is needed. Following MCMCglmm course notes example, page 78 I copied a prior. The standard errors are larger than PROC MIXED. In addition, month is not a significant fixed effect.
prior1 <- list(R=list(V=1e-7,nu=-2),
G=list(G1=list(V=diag(2),nu=2) )
)
MCMCglmm1 <- MCMCglmm(Y ~ Month,
random= ~ us(Month+1):Batch,
pr=TRUE,
data=rc2,family=’gaussian’,
nitt=500000,thin=20,burnin=50000,
prior=prior1,
verbose=FALSE)
summary(MCMCglmm1)
Iterations = 50001:499981
Thinning interval = 20
Sample size = 22500
DIC: 346.489
G-structure: ~us(Month + 1):Batch
post.mean l-95% CI u-95% CI eff.samp
(Intercept):(Intercept).Batch 4.3457 0.1418 11.519 22500
Month:(Intercept).Batch -0.3678 -4.8534 3.988 22500
(Intercept):Month.Batch -0.3678 -4.8534 3.988 22500
Month:Month.Batch 2.0747 0.1247 5.766 22500
R-structure: ~units
post.mean l-95% CI u-95% CI eff.samp
units 3.474 2.456 4.658 22500
Location effects: Y ~ Month
post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept) 102.7132 100.5880 105.0134 22257 <4e-05 ***
Month -0.5211 -2.0054 1.0162 22500 0.356
—
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
colMeans(MCMCglmm1$Sol)
(Intercept) Month Batch.(Intercept).Batch.1
102.71316233 -0.52109218 -1.00404017
Batch.(Intercept).Batch.2 Batch.(Intercept).Batch.3 Batch.Month.Batch.1
0.38305996 0.60774619 0.12522274
Batch.Month.Batch.2 Batch.Month.Batch.3
-0.22493139 0.08335665
apply(MCMCglmm1$Sol,2,sd)
(Intercept) Month Batch.(Intercept).Batch.1
1.1851105 0.8264831 1.2192814
Batch.(Intercept).Batch.2 Batch.(Intercept).Batch.3 Batch.Month.Batch.1
1.2049515 1.2168346 0.8271416
Batch.Month.Batch.2 Batch.Month.Batch.3
0.8274522 0.8288230
To return to this all important prior, where experimenting showed a different choice clearly had influence, we can plot it. This shows that the variance is suggested to be close to 1 and lower than 10.
nu.ast <- prior1$G$G1$nu-dim(prior1$G$G1$V)[1]+1
V.ast <- prior1$G$G1$V[1,1]*(prior1$G$G1$nu/nu.ast)
xv=seq(1e-16,10,length.out=100)
dv=MCMCpack::dinvgamma(xv,shape=nu.ast/2,
scale=(nu.ast*V.ast)/2)
plot(dv~xv,type=’l’)
To leave a comment for the author, please follow the link and comment on their blog: Wiekvoet.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.