Mixed models; Random Coefficients, part 2
[This article was first published on Wiekvoet, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Continuing from random coefficients part 1, it is time for the second part. To quote the SAS/STAT manual ‘a random coefficients model with error terms that follow a nested structure‘. The additional random variable is monthc, which is a factor containing the months and nested under batch. Hence there is one additional statement in generating the dataWant to share your content on R-bloggers? click here if you have a blog, or here if you don't.
rc2$Monthc <- factor(rc2$Month)
lme4
Running the analysis in lme4 is not difficult. However, it is noticed the random effect for month:batch provides an estimation problem. Somewhere between the month fixed effect and monthc:batch random effect lme4 has no room left for the month:batch effect. However, it is still needed later on. During simulation full against restricted model the restricted model lacks the month fixed effect and hence the random month:batch effect is not equal to 0. Hence this term has influence on significance of the fixed term.
lmer1 <- lmer(Y ~ Month +(Month-1|Batch) + (1|Batch:Monthc),data=rc2)
summary(lmer1)
Linear mixed model fit by REML
Formula: Y ~ Month + (Month – 1 | Batch) + (1 | Batch:Monthc)
Data: rc2
AIC BIC logLik deviance REMLdev
287.3 299.4 -138.6 275.2 277.3
Random effects:
Groups Name Variance Std.Dev.
Batch:Monthc (Intercept) 4.1669e+00 2.0413e+00
Batch Month 1.1006e-11 3.3175e-06
Residual 7.9670e-01 8.9258e-01
Number of obs: 84, groups: Batch:Monthc, 18; Batch, 3
Fixed effects:
Estimate Std. Error t value
(Intercept) 102.5558 0.7674 133.64
Month -0.5002 0.1140 -4.39
Correlation of Fixed Effects:
(Intr)
Month -0.768
Detail on random effects
It is almost the same solution as PROC MIXED when compensated for continuous month effect. For instance, the last number (0.885355535) is close to 12*0.07600+0.002293= 0.914293 from PROC MIXED.
ranef(lmer1)
$`Batch:Monthc`
(Intercept)
1:0 0.220538869
1:1 -2.582118495
1:3 -2.319238635
1:6 1.880673690
1:9 -1.244284881
1:12 0.772296027
2:0 -0.005588532
2:1 -2.295803984
2:3 2.905996019
2:6 1.642080584
2:9 -2.103224341
2:12 -3.330296971
3:0 1.964950249
3:1 -0.816514780
3:3 0.520042537
3:6 1.236464739
3:9 2.668672370
3:12 0.885355535
$Batch
Month
1 -2.779928e-11
2 -6.342200e-09
3 6.369999e-09
Fixed effects
Fixed effects are easy after all these exercises. Again we see a discrepancy between anova and simulation, with simulation (p=0.033) making the month effect just significant and anova making it highly significant.
lmer1b <- lmer(Y ~ 1 +(Month-1|Batch)+ (1|Batch:Monthc) ,data=rc2)
anova(lmer1,lmer1b)
Data: rc2
Models:
lmer1b: Y ~ 1 + (Month – 1 | Batch) + (1 | Batch:Monthc)
lmer1: Y ~ Month + (Month – 1 | Batch) + (1 | Batch:Monthc)
Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
lmer1b 4 290.91 300.63 -141.45
lmer1 5 285.18 297.33 -137.59 7.7251 1 0.005446 **
—
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
pboot <- function(m0,m1) {
s <- simulate(m0)
L0 <- logLik(refit(m0,s))
L1 <- logLik(refit(m1,s))
2*(L1-L0)
}
obsdev <- 2*( logLik(lmer1)-logLik(lmer1b))
#
set.seed(1001)
reps <- replicate(1000,pboot(lmer1b,lmer1))
mean(reps>obsdev)
[1] 0.033
nlme
It is not obvious how to run this model in nlme. It appears (r-help post) that one can make a groupedData object. The random effect in this object gets used in the model, even though not explicitly in the model call. Then there is some trickery in using pdBlocked. However, the output is very straightforward. The month effect is highly significant.
rc3 <- groupedData(formula=Y ~ Month | Batch,
data=rc2)
lme1 <- lme(Y ~ Month,
random= pdBlocked(list(pdIdent(~ Monthc – 1),pdIdent(~ Month – 1))),
data=rc3)
summary(lme1)
Linear mixed-effects model fit by REML
Data: rc3
AIC BIC logLik
286.901 298.9346 -138.4505
Random effects:
Composite Structure: Blocked
Block 1: Monthc0, Monthc1, Monthc3, Monthc6, Monthc9, Monthc12
Formula: ~Monthc – 1 | Batch
Structure: Multiple of an Identity
Monthc0 Monthc1 Monthc3 Monthc6 Monthc9 Monthc12
StdDev: 1.934191 1.934191 1.934191 1.934191 1.934191 1.934191
Block 2: Month
Formula: ~Month – 1 | Batch
Month Residual
StdDev: 0.111472 0.8926711
Fixed effects: Y ~ Month
Value Std.Error DF t-value p-value
(Intercept) 102.55643 0.7287180 80 140.73542 0e+00
Month -0.50034 0.1259348 80 -3.97302 2e-04
Correlation:
(Intr)
Month -0.66
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.0466072 -0.6347078 0.0557648 0.4304314 2.4608117
Number of Observations: 84
Number of Groups: 3
c(1.934191,0.111472, 0.8926711)^2
[1] 3.74109482 0.01242601 0.79686169
Details on Random effects
Exactly the same as in PROC MIXED.
ranef(lme1)
Monthc0 Monthc1 Monthc3 Monthc6 Monthc9 Monthc12
1 0.219126119 -2.5690021 -2.3067185 1.8726057 -1.235035 0.773610734
2 -0.006207775 -2.2125547 3.1063194 2.0649346 -1.444999 -2.440480024
3 1.957416153 -0.8849589 0.3005989 0.7971893 2.005861 0.002293605
Month
1 -0.0002840204
2 -0.0757124467
3 0.0759964671
MCMCglmm
For once this seems most easy. Notice that monthc is not used in the model. Absence of us() around month is sufficient for MCMCglm to interpret it as levels rather than continuous. By now I am not surprised by a mismatch between random effects between PROC MIXED and MCMCglmm. The month effect is significant.
library(MCMCglmm)
MCMCglmm1 <- MCMCglmm(Y ~ Month,
random= ~ us(Month):Batch + Month:Batch, # two terms
pr=TRUE,
data=rc2,family=’gaussian’,
nitt=1000000,thin=20,burnin=100000,
verbose=FALSE)
summary(MCMCglmm1)
Iterations = 100001:999981
Thinning interval = 20
Sample size = 45000
DIC: 238.8215
G-structure: ~us(Month):Batch
post.mean l-95% CI u-95% CI eff.samp
Month:Month.Batch 0.02221 1.386e-17 0.03053 45000
~Month:Batch
post.mean l-95% CI u-95% CI eff.samp
Month:Batch 4.695 1.741 8.649 45000
R-structure: ~units
post.mean l-95% CI u-95% CI eff.samp
units 0.8223 0.5518 1.114 45000
Location effects: Y ~ Month
post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept) 102.5577 100.9234 104.1365 45961 < 2e-05 ***
Month -0.4994 -0.7563 -0.2401 45000 0.00467 **
—
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
Details on random effects.
Again a small calculation for the last item shows 0.737419049+12*0.011875820= 0.8799289 against 0.91 and 0.88 for PROX MIXED and lme4 respectively. The random effects are quire close.
colMeans(MCMCglmm1$Sol)
(Intercept) Month Batch.Month.Batch.1 Batch.Month.Batch.2
102.557650677 -0.499353562 -0.000757751 -0.013238892
Batch.Month.Batch.3 Month:Batch.0.1 Month:Batch.1.1 Month:Batch.3.1
0.011875820 0.218504027 -2.577571780 -2.317749883
Month:Batch.6.1 Month:Batch.9.1 Month:Batch.12.1 Month:Batch.0.2
1.873068384 -1.245499371 0.768104528 -0.008015210
Month:Batch.1.2 Month:Batch.3.2 Month:Batch.6.2 Month:Batch.9.2
-2.278469327 2.929929878 1.705485501 -1.995548530
Month:Batch.12.2 Month:Batch.0.3 Month:Batch.1.3 Month:Batch.3.3
-3.183017853 1.960380450 -0.828147383 0.483852028
Month:Batch.6.3 Month:Batch.9.3 Month:Batch.12.3
1.158358351 2.551876100 0.737419049
Difference in fixed effects between all functions
All functions now make the fixed month effect at approximately -0.5. All have it significant at the conventional 5% level. There are still quite some differences. PROC MIXED and lme4 simulations have p-values close to 5%. Lme4 anova and MCMCglmm have the p-value in the 0.005 region. nlme has 0.0002. This is worrying; the moment I am performing such calculations with real data I need to be more certain on my p-values than implied here.
To leave a comment for the author, please follow the link and comment on their blog: Wiekvoet.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.