Stella Copeland’s Intro to Mixed Models in R

[This article was first published on Noam Ross - R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In D-RUG today, Stella Copeland gave a quick introduction to mixed models in R. Here’s the script that she presented:

setwd() #if needed
#a collection of approaches to mixed models in R
#this is intended to be a short demo on mixed model approaches
#BUT!!!! (Disclaimer)
#The examples below are very likely NOT to be the best representations of a mixed model, even for this dataset!
data<-read.csv("MixedModelExData.csv",header=T)
head(data) #What does this all mean?
#ElevCat = Elevation Category,
length(levels(data$ElevCat)) #5 elevations
#Site Number within Elevation Band
length(levels(data$SiteNumber)) #9 sites
#Microclimate.Category = Warm Cool Right #3 microclimates
levels(data$Microclimate.Category)
#PlotNumber
length(levels(data$PlotNumber)) #225 plots - 9 plots/micro/site
#Subplots = all unique, nested within elevation, site, microclimate.category,PlotNumber
length(unique(data$Subplot)) #450 plots - 2 / plot
#main point: lots of nesting!! Now - what about data?
#Neighbor removal experiment - Neighbor Removal = one subplot / plot
#CoverBeforeRemoval & Removed Biomass & Control Cover - potential covariates for the neighbor removal treatment
#Response variables - Heightcm (height of individuals), TotalLeaves (# of leaves/plant)
#Em2011Bin & Em2012Bin - Emergence in 2011 and 2012 - binary response 1 for emergence, 0 if did not emerge
#Babies - # of offspring from 2011 individuals in 2012
#Our predictors of interest are:
#Categorical: Elevation, microclimate, neighbor removal
#Continuous: Biomass or Cover
#Our responses are:
#Continuous: Height, Leaves
#Binomial: Emergence
#Count: Babies
#How do we account for random effects of site and plotnumber within site and subplot?
#One - if we have a normally distributed response variable this would be a linear mixed model that means you could use:
#aov or lmer(lme4 package)
#Say we want to know how biomass is affected by microclimate?
#Maybe it would be best to average the data by plot beforehand, but we'll ignore that for demonstration purposes:
#Also, CoverBeforeRemoval is NOT normally distributed...but for the sake of demonstration...
CoverModel<-aov(CoverBeforeRemoval~Microclimate.Category+Error(SiteNumber),data=data) #since we have microclimate occurs within site
summary(CoverModel)
#CoverModel.withPlot<-aov(CoverBeforeRemoval~Microclimate.Category*ElevCat+Error(SiteNumber/PlotNumber),data=data) #this doesn't work!
#What about the generalized linear model case?!!
#Binomial, poisson, etc...generalized linear mixed models...lme4
library(lme4)
#the structure here is different - random effects are specified in paratheses
#glmer and lmer are almost the same - lmer with a family other than guassian will fit a generalized linear model
#parantheses in lmer mean that the effect is a random effect, 1/means that the intercept varies, effect/random means the slope varies
#some examples at vignette("Implementation","lme4")
CoverModel.lmer1<-lmer(CoverBeforeRemoval~Microclimate.Category+(1|SiteNumber/PlotNumber),data=data) #This should fit a linear mixed model by default.
summary(CoverModel.lmer1)
CoverModel.lmer2<-lmer(CoverBeforeRemoval~Microclimate.Category+(1|SiteNumber),data=data)
anova(CoverModel.lmer1,CoverModel.lmer2) #we can compare models to decide on use of random effects
#Now let's work with a count variable - number of offspring - that happens to be more or less from a poisson distribution
BabiesMod1<-lmer(Babies~NeighborRemoval*Microclimate.Category+(1|SiteNumber/PlotNumber),data=data,family=poisson)#
summary(BabiesMod1)
BabiesMod2<-lmer(Babies~NeighborRemoval*Microclimate.Category+(1|SiteNumber/Microclimate.Category/PlotNumber),data=data,family=poisson)#Nest microclimate within Site
summary(BabiesMod2)
#binomial distribution works similarly...
EmergMod1<-lmer(Em2011Bin~NeighborRemoval*Microclimate.Category+(1|SiteNumber/PlotNumber),data=data,family=binomial)#
summary(EmergMod1)
#maybe we wanted to know about the effect of a continuous variable - cover - on the emergence of the control plots
EmergControl1<-lmer(Em2011Bin~Microclimate.Category+ElevCat+ControlCover+(1|SiteNumber),data=data[which(data$NeighborRemoval=="No"),],
family=binomial)#
summary(EmergControl1)
#maybe we should worry about overdispersion? there are ways to handle that (see Bolker et. al.)
rdev <- sum(residuals(BabiesMod1)^2)
mdf <- length(fixef(BabiesMod1))
rdf <- nrow(data)-mdf #residual degrees of freedom
rdev/rdf #0.313 which is less than 1...there are other ways of looking at overdispersion, but this is one simple example
#so how do we extract confidence intervals and all that good stuff?!
#Unfortunately, mcmcsamp in the lmer4 package does not work for poisson distributions (and maybe non-normal distributions in general, for now)
library(MCMCglmm)
#note that the model structure here is a bit different...random and fixed are separated out
sampBabiesMod1 = MCMCglmm(fixed=Babies~NeighborRemoval*Microclimate.Category, random=~SiteNumber,
data = data,family="poisson") #generate markov chain monte carlo (mcmc) sample for the parameters of the model, and then provides confidence intervals
library(MASS) #according to Bolker et. al. (TREE article), penalized loglikelihood may not be a great approach, but here's an example
modPQL<-glmmPQL(Babies~NeighborRemoval*Microclimate.Category,random = ~1|SiteNumber,data=data,family=poisson)
#also....
#package glmmML
library(glmmML) #maximimum likelihood approaches, fewer options for distributions (family)
?glmmML
#package glmmADMB - download here...http://glmmadmb.r-forge.r-project.org/ slow to download, and I am not sure how this compared to other packages...but promising!
library(glmmADMB) #
#Join the r mixed models listserve!
#Note that there is a LOT of discussion (in general, and on the R list) about how to calculate p-values/degrees of freedom for mixed models! Buyer beware...
#This is a useful consolatidated version of the listserve discussion topics: http://glmm.wikidot.com/faq

Get the data file for this script here

Stella also recommends this paper by Ben Bolker as a quick introduction to the topic.

To leave a comment for the author, please follow the link and comment on their blog: Noam Ross - R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)