Three ways to get parameter-specific p-values from lmer
[This article was first published on Minding the Brain, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
How to get parameter-specific p-values is one of the most commonly asked questions about multilevel regression. The key issue is that the degrees of freedom are not trivial to compute for multilevel regression. Various detailed discussions can be found on the R-wiki and R-help mailing list post by Doug Bates. I have experimented with three methods that I think are reasonable.
1. Use the normal approximation. Since the t distribution converges to the z distribution as degrees of freedom increase, this is like assuming infinite degrees of freedom. This is unambiguously anti-conservative, but for reasonable sample sizes, it appears not to be very anti-conservative (Barr et al., 2013). That is, if we take the p-value to measure the probability of a false positive, this approximation produces a somewhat (but perhaps not alarmingly) higher false positive rate than the nominal 5% at p = 0.05.
Here is an example using proportions of semantic errors in picture naming by different aphasia subtypes (the data file can be found here):
2. Use the Satterthwaite approximation, which is implemented in the lmerTest package. According to the documentation, this is based on SAS proc mixed theory. The lmerTest package overloads the lmer function, so you can just re-fit the model using exactly the same code, but the summary() will now include approximate degrees of freedom and p-values. This implementation is extremely easy to use, but can be a little maddening if you forget whether your model is a an object of type lmerMod or merModLmerTest.
3. Use the Kenward-Roger approximation to get approximate degrees of freedom and the t-distribution to get p-values, which is implemented in the pbkrtest package.
Not surprisingly, the Satterthwaite and Kenward-Roger approximations produce slightly more conservative p-values than the normal approximation does. However, even in this not-very-large data set (24 participants, 6-9 in each of 3 groups, 5 observations for each participant) the normal approximation is only slightly less conservative than these two options. The approximate degrees of freedom are slightly higher when using the Satterthwaite approximation (23) than the Kenward-Roger approximation (20.1), which makes the latter a bit more conservative. Of course, these are not systematic tests, but it is fairly representative of a few models that I've compared.
1. Use the normal approximation. Since the t distribution converges to the z distribution as degrees of freedom increase, this is like assuming infinite degrees of freedom. This is unambiguously anti-conservative, but for reasonable sample sizes, it appears not to be very anti-conservative (Barr et al., 2013). That is, if we take the p-value to measure the probability of a false positive, this approximation produces a somewhat (but perhaps not alarmingly) higher false positive rate than the nominal 5% at p = 0.05.
Here is an example using proportions of semantic errors in picture naming by different aphasia subtypes (the data file can be found here):
load("Examples.RData") require(lme4) # fit the model m.sem <- lmer(Semantic.error ~ TestTime * Diagnosis + (TestTime | SubjectID), data = NamingRecovery, REML = FALSE) # extract coefficients coefs <- data.frame(coef(summary(m.sem))) # use normal distribution to approximate p-value coefs$p.z <- 2 * (1 - pnorm(abs(coefs$t.value))) coefs ## Estimate Std..Error t.value p.z ## (Intercept) 0.045767 0.007776 5.8855 3.970e-09 ## TestTime -0.008685 0.003524 -2.4644 1.372e-02 ## DiagnosisConduction -0.015149 0.010039 -1.5090 1.313e-01 ## DiagnosisWernicke -0.004899 0.010287 -0.4762 6.339e-01 ## TestTime:DiagnosisConduction 0.007308 0.004550 1.6063 1.082e-01 ## TestTime:DiagnosisWernicke 0.012854 0.004662 2.7571 5.832e-03
2. Use the Satterthwaite approximation, which is implemented in the lmerTest package. According to the documentation, this is based on SAS proc mixed theory. The lmerTest package overloads the lmer function, so you can just re-fit the model using exactly the same code, but the summary() will now include approximate degrees of freedom and p-values. This implementation is extremely easy to use, but can be a little maddening if you forget whether your model is a an object of type lmerMod or merModLmerTest.
require(lmerTest) # re-fit model m.semTest <- lmer(Semantic.error ~ TestTime * Diagnosis + (TestTime | SubjectID), data = NamingRecovery, REML = FALSE) # get Satterthwaite-approximated degrees of freedom coefs$df.Satt <- coef(summary(m.semTest))[, 3] # get approximate p-values coefs$p.Satt <- coef(summary(m.semTest))[, 5] coefs ## Estimate Std..Error t.value p.z ## (Intercept) 0.045767 0.007776 5.8855 3.970e-09 ## TestTime -0.008685 0.003524 -2.4644 1.372e-02 ## DiagnosisConduction -0.015149 0.010039 -1.5090 1.313e-01 ## DiagnosisWernicke -0.004899 0.010287 -0.4762 6.339e-01 ## TestTime:DiagnosisConduction 0.007308 0.004550 1.6063 1.082e-01 ## TestTime:DiagnosisWernicke 0.012854 0.004662 2.7571 5.832e-03 ## df.Satt p.Satt ## (Intercept) 23.00 5.344e-06 ## TestTime 22.99 2.163e-02 ## DiagnosisConduction 23.00 1.449e-01 ## DiagnosisWernicke 23.00 6.384e-01 ## TestTime:DiagnosisConduction 22.99 1.219e-01 ## TestTime:DiagnosisWernicke 22.99 1.122e-02
3. Use the Kenward-Roger approximation to get approximate degrees of freedom and the t-distribution to get p-values, which is implemented in the pbkrtest package.
require(pbkrtest) # get the KR-approximated degrees of freedom df.KR <- get_ddf_Lb(m.sem, fixef(m.sem)) # get p-values from the t-distribution using the t-values and approximated # degrees of freedom coefs$p.KR <- 2 * (1 - pt(abs(coefs$t.value), df.KR)) coefs ## Estimate Std..Error t.value p.z ## (Intercept) 0.045767 0.007776 5.8855 3.970e-09 ## TestTime -0.008685 0.003524 -2.4644 1.372e-02 ## DiagnosisConduction -0.015149 0.010039 -1.5090 1.313e-01 ## DiagnosisWernicke -0.004899 0.010287 -0.4762 6.339e-01 ## TestTime:DiagnosisConduction 0.007308 0.004550 1.6063 1.082e-01 ## TestTime:DiagnosisWernicke 0.012854 0.004662 2.7571 5.832e-03 ## df.Satt p.Satt p.KR ## (Intercept) 23.00 5.344e-06 9.049e-06 ## TestTime 22.99 2.163e-02 2.283e-02 ## DiagnosisConduction 23.00 1.449e-01 1.468e-01 ## DiagnosisWernicke 23.00 6.384e-01 6.390e-01 ## TestTime:DiagnosisConduction 22.99 1.219e-01 1.238e-01 ## TestTime:DiagnosisWernicke 22.99 1.122e-02 1.210e-02
Not surprisingly, the Satterthwaite and Kenward-Roger approximations produce slightly more conservative p-values than the normal approximation does. However, even in this not-very-large data set (24 participants, 6-9 in each of 3 groups, 5 observations for each participant) the normal approximation is only slightly less conservative than these two options. The approximate degrees of freedom are slightly higher when using the Satterthwaite approximation (23) than the Kenward-Roger approximation (20.1), which makes the latter a bit more conservative. Of course, these are not systematic tests, but it is fairly representative of a few models that I've compared.
To leave a comment for the author, please follow the link and comment on their blog: Minding the Brain.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.