I like statistics and I struggle with statistics. Often I get frustrated when I don't understand something, and I really struggled to make sense of Kruschke's Bayesian analysis of a split-plot, particularly because it 'didn't look like' a split-plot to me.
Additionally, I have written a few posts discussing linear mixed models, using several different packages to fit them. At no point have I shown the calculations happening behind the scenes. So I decided to combine my frustration and an explanation to myself in a couple of posts. This is number one and, eventually, there will be a follow-up.
What do linear mixed models look like?
Linear mixed models, models that combine so-called fixed and random effects, are often represented using matrix notation as:

$$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{Z}\mathbf{a} + \mathbf{e}$$

where $\mathbf{y}$ is the vector of observations, $\mathbf{X}$ and $\mathbf{Z}$ are incidence (design) matrices relating the observations to the fixed effects $\mathbf{b}$ and the random effects $\mathbf{a}$, and $\mathbf{e}$ is the vector of residuals. The random effects are assumed to have variance matrix $\mathbf{G}$ and the residuals variance matrix $\mathbf{R}$.
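As a tiny illustration of what $\mathbf{X}$ and $\mathbf{Z}$ contain, here is a sketch using a hypothetical toy dataset of my own (two blocks, two treatments; not part of the oats example used below), building the incidence matrices with model.matrix():

# Hypothetical toy data: two blocks, two fertilization treatments
toy = data.frame(block = factor(rep(c('B1', 'B2'), each = 2)),
                 trt = factor(rep(c('control', 'fert'), times = 2)),
                 y = c(10, 14, 11, 16))

X = model.matrix(~ trt, data = toy)       # fixed effects: intercept + treatment
Z = model.matrix(~ block - 1, data = toy) # random effects: one column per block
X
Z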
By the way, the history of linear mixed models is strongly related to applications of quantitative genetics for the prediction of breeding values, particularly in dairy cattle. Charles Henderson developed what is now called Henderson's Mixed Model Equations† to simultaneously estimate fixed effects and predict random genetic effects:

$$\begin{bmatrix} \mathbf{X}'\mathbf{R}^{-1}\mathbf{X} & \mathbf{X}'\mathbf{R}^{-1}\mathbf{Z} \\ \mathbf{Z}'\mathbf{R}^{-1}\mathbf{X} & \mathbf{Z}'\mathbf{R}^{-1}\mathbf{Z} + \mathbf{G}^{-1} \end{bmatrix} \begin{bmatrix} \hat{\mathbf{b}} \\ \hat{\mathbf{a}} \end{bmatrix} = \begin{bmatrix} \mathbf{X}'\mathbf{R}^{-1}\mathbf{y} \\ \mathbf{Z}'\mathbf{R}^{-1}\mathbf{y} \end{bmatrix}$$

The big matrix on the left-hand side of this equation is often called the coefficient matrix $\mathbf{C}$, which is exactly what we will build in code further down.
Old school: physical split-plots
Given that I'm an unsophisticated forester and that I'd like to use data available to anyone, I'll rely on an agricultural example (so plots are actually physical plots in the field) that goes back to Frank Yates. There are two factors: oats variety, with three levels, and nitrogen fertilization, with four levels. Varieties are assigned to main plots within blocks, and fertilization levels to subplots within each main plot. Yates, F. (1935). Complex experiments. Journal of the Royal Statistical Society, Suppl. 2, 181–247 (behind a paywall here).
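Just to fix ideas about the layout, a quick check of my own (not part of the original analysis) cross-tabulating the data we are about to use: six blocks, each containing three variety main plots, each split into four nitrogen subplots, for a total of 72 observations.

require(MASS) # the oats data set, also used below
data(oats)

# Four nitrogen subplots for every block x variety (main plot) combination
with(oats, table(B, V))

# 6 blocks x 3 main plots x 4 subplots = 72 observations
nrow(oats)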
Now it is time to roll up our sleeves and use some code, getting the data and fitting the same model using nlme (m1) and asreml (m2), just for the fun of it. Anyway, nlme and asreml produce exactly the same results.

We will use the oats data set that comes with MASS, although there is also an Oats data set in nlme and another version in the asreml package. (By the way, you can see a very good explanation by Bill Venables of a 'traditional' ANOVA analysis for a split-plot here):
require(MASS)   # we get the oats data set from here
require(nlme)   # for the lme() function
require(asreml) # for the asreml() function. This package's own data set uses
                # different variable names, which may require renaming to use
                # the code below

# Get the oats data set and check its structure
data(oats)
head(oats)
str(oats)

# Create a main plot effect for clarity's sake
oats$MP = oats$V

# Split-plot using nlme
m1 = lme(Y ~ V*N, random = ~ 1|B/MP, data = oats)
summary(m1)
fixef(m1)
ranef(m1)

# Split-plot using ASReml
m2 = asreml(Y ~ V*N, random = ~ B/MP, data = oats)
summary(m2)$varcomp
coef(m2)$fixed
coef(m2)$random

fixef(m1)
#         (Intercept)         VMarvellous            VVictory             N0.2cwt
#          80.0000000           6.6666667          -8.5000000          18.5000000
#             N0.4cwt             N0.6cwt VMarvellous:N0.2cwt    VVictory:N0.2cwt
#          34.6666667          44.8333333           3.3333333          -0.3333333
# VMarvellous:N0.4cwt    VVictory:N0.4cwt VMarvellous:N0.6cwt    VVictory:N0.6cwt
#          -4.1666667           4.6666667          -4.6666667           2.1666667

ranef(m1)
# Level: B
#     (Intercept)
# I     25.421511
# II     2.656987
# III   -6.529883
# IV    -4.706019
# V    -10.582914
# VI    -6.259681
#
# Level: MP %in% B
#                  (Intercept)
# I/Golden.rain       2.348296
# I/Marvellous       -3.854348
# I/Victory          14.077467
# II/Golden.rain      4.298706
# II/Marvellous       6.209473
# II/Victory         -9.194250
# III/Golden.rain    -7.915950
# III/Marvellous     10.750776
# III/Victory        -6.063976
# IV/Golden.rain      5.789462
# IV/Marvellous      -7.115566
# IV/Victory         -1.001111
# V/Golden.rain       1.116768
# V/Marvellous       -9.848096
# V/Victory           3.497878
# VI/Golden.rain     -5.637282
# VI/Marvellous       3.857761
# VI/Victory         -1.316009
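For completeness, here is a minimal sketch of the same model in lme4 (not part of the original comparison, and assuming the lme4 package is installed); the variance components and fixed effects should closely match the lme and asreml fits:

require(lme4) # for the lmer() function

# Same split-plot model: random intercepts for blocks and main plots within blocks
m3 = lmer(Y ~ V*N + (1 | B/MP), data = oats)
summary(m3) # compare variance components with the lme and asreml fits
fixef(m3)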
Now we can move on to implementing the Mixed Model Equations, where probably the only gotcha is the definition of the Z matrix: nlme and asreml use the full 'number of levels of the factor' coding for both the main plot and interaction effects, which involves using the contrasts.arg argument in model.matrix().
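To see what that argument changes, here is a small illustration of my own using the oats data loaded above, comparing the default treatment contrasts with the full set of indicator columns we need for Z:

# Default treatment contrasts drop one level (5 columns for 6 blocks)
contrasts(oats$B)

# Full indicator coding keeps one column per level, which is what we need
# for the random effects design matrix Z below
contrasts(oats$B, contrasts = FALSE)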
# Variance components
varB  = 214.477
varMP = 106.062
varR  = 177.083

# Basic vector and matrices: y, X, Z, G & R
y = matrix(oats$Y, nrow = dim(oats)[1], ncol = 1)
X = model.matrix(~ V*N, data = oats)
Z = model.matrix(~ B/MP - 1, data = oats,
                 contrasts.arg = list(B = contrasts(oats$B, contrasts = FALSE),
                                      MP = contrasts(oats$MP, contrasts = FALSE)))

G = diag(c(rep(varB, 6), rep(varMP, 18)))
R = diag(varR, 72, 72)
Rinv = solve(R)

# Components of C
XpX = t(X) %*% Rinv %*% X
ZpZ = t(Z) %*% Rinv %*% Z

XpZ = t(X) %*% Rinv %*% Z
ZpX = t(Z) %*% Rinv %*% X

Xpy = t(X) %*% Rinv %*% y
Zpy = t(Z) %*% Rinv %*% y

# Building C * [b a] = RHS
C = rbind(cbind(XpX, XpZ),
          cbind(ZpX, ZpZ + solve(G)))
RHS = rbind(Xpy, Zpy)

blup = solve(C, RHS)
blup
#                            [,1]
# (Intercept)          80.0000000
# VMarvellous           6.6666667
# VVictory             -8.5000000
# N0.2cwt              18.5000000
# N0.4cwt              34.6666667
# N0.6cwt              44.8333333
# VMarvellous:N0.2cwt   3.3333333
# VVictory:N0.2cwt     -0.3333333
# VMarvellous:N0.4cwt  -4.1666667
# VVictory:N0.4cwt      4.6666667
# VMarvellous:N0.6cwt  -4.6666667
# VVictory:N0.6cwt      2.1666667
# BI                   25.4215578
# BII                   2.6569919
# BIII                 -6.5298953
# BIV                  -4.7060280
# BV                  -10.5829337
# BVI                  -6.2596927
# BI:MPGolden.rain      2.3482656
# BII:MPGolden.rain     4.2987082
# BIII:MPGolden.rain   -7.9159514
# BIV:MPGolden.rain     5.7894753
# BV:MPGolden.rain      1.1167834
# BVI:MPGolden.rain    -5.6372811
# BI:MPMarvellous      -3.8543865
# BII:MPMarvellous      6.2094778
# BIII:MPMarvellous    10.7507978
# BIV:MPMarvellous     -7.1155687
# BV:MPMarvellous      -9.8480945
# BVI:MPMarvellous      3.8577740
# BI:MPVictory         14.0774514
# BII:MPVictory        -9.1942649
# BIII:MPVictory       -6.0639747
# BIV:MPVictory        -1.0011059
# BV:MPVictory          3.4978963
# BVI:MPVictory        -1.3160021
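As a quick sanity check (my own addition, not in the original post), the first twelve solutions should reproduce the fixed effects estimated by lme:

# Fixed-effect solutions from the Mixed Model Equations vs. lme's estimates
round(cbind(MME = blup[1:12], lme = fixef(m1)), 4)

# They agree to numerical precision
all.equal(as.vector(blup[1:12]), as.vector(fixef(m1)), tolerance = 1e-6)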
Not surprisingly, we get the same results, except that we start by assuming the variance components from the previous analyses, so we can avoid implementing the code for restricted maximum likelihood estimation as well. By the way, given that here $\mathbf{R} = \sigma^2_e \mathbf{I}$, the $\mathbf{R}^{-1}$ appearing in the cross-products is only a scalar rescaling; leaving it out, $\mathbf{X}'\mathbf{X}$ is simply a table of counts for the experiment, because $\mathbf{X}$ contains only zeros and ones:
XpXnoR = t(X) %*% X
XpXnoR
#                     (Intercept) VMarvellous VVictory N0.2cwt N0.4cwt N0.6cwt
# (Intercept)                  72          24       24      18      18      18
# VMarvellous                  24          24        0       6       6       6
# VVictory                     24           0       24       6       6       6
# N0.2cwt                      18           6        6      18       0       0
# N0.4cwt                      18           6        6       0      18       0
# N0.6cwt                      18           6        6       0       0      18
# VMarvellous:N0.2cwt           6           6        0       6       0       0
# VVictory:N0.2cwt              6           0        6       6       0       0
# VMarvellous:N0.4cwt           6           6        0       0       6       0
# VVictory:N0.4cwt              6           0        6       0       6       0
# VMarvellous:N0.6cwt           6           6        0       0       0       6
# VVictory:N0.6cwt              6           0        6       0       0       6
#                     VMarvellous:N0.2cwt VVictory:N0.2cwt VMarvellous:N0.4cwt
# (Intercept)                           6                6                   6
# VMarvellous                           6                0                   6
# VVictory                              0                6                   0
# N0.2cwt                               6                6                   0
# N0.4cwt                               0                0                   6
# N0.6cwt                               0                0                   0
# VMarvellous:N0.2cwt                   6                0                   0
# VVictory:N0.2cwt                      0                6                   0
# VMarvellous:N0.4cwt                   0                0                   6
# VVictory:N0.4cwt                      0                0                   0
# VMarvellous:N0.6cwt                   0                0                   0
# VVictory:N0.6cwt                      0                0                   0
#                     VVictory:N0.4cwt VMarvellous:N0.6cwt VVictory:N0.6cwt
# (Intercept)                        6                   6                6
# VMarvellous                        0                   6                0
# VVictory                           6                   0                6
# N0.2cwt                            0                   0                0
# N0.4cwt                            6                   0                0
# N0.6cwt                            0                   6                6
# VMarvellous:N0.2cwt                0                   0                0
# VVictory:N0.2cwt                   0                   0                0
# VMarvellous:N0.4cwt                0                   0                0
# VVictory:N0.4cwt                   6                   0                0
# VMarvellous:N0.6cwt                0                   6                0
# VVictory:N0.6cwt                   0                   0                6
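Following up on that point with another check of my own (not in the original post): multiplying both sides of the Mixed Model Equations by $\sigma^2_e$ removes $\mathbf{R}^{-1}$ altogether, at the cost of scaling $\mathbf{G}^{-1}$ by the residual variance, and the solutions are unchanged:

# Equivalent system after multiplying the MME through by the residual variance:
# R^{-1} disappears and G^{-1} is scaled by varR
C2 = rbind(cbind(t(X) %*% X, t(X) %*% Z),
           cbind(t(Z) %*% X, t(Z) %*% Z + varR * solve(G)))
RHS2 = rbind(t(X) %*% y, t(Z) %*% y)

blup2 = solve(C2, RHS2)
all.equal(blup, blup2) # same fixed-effect estimates and random-effect predictions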
I will leave it here and come back to this problem as soon as I can.
† Incidentally, a lot of the theoretical development was supported by Shayle Searle (a Kiwi statistician and Henderson's colleague at Cornell University).