Phylogenetic community structure: PGLMMs
[This article was first published on Recology, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
So, I’ve blogged about this topic before, way back on 5 Jan this year.Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Matt Helmus, a postdoc in the Wootton lab at the University of Chicago, published a paper with Anthony Ives in Ecological Monographs this year (abstract here). The paper addressed a new statistical approach to phylogenetic community structure.
As I said in the original post, part of the power of the PGLMM (phylogenetic generalized linear mixed models) approach is that you don’t have to conduct quite so many separate statistical tests as with the previous null model/randomization approach.
Their original code was written in Matlab. Here I provide the R code that Matt has so graciously shared with me. There are four functions and a fifth file has an example use case. The example and output are shown below.
Look for the inclusion of Matt’s PGLMM to the picante R package in the future.
Here are links to the files as GitHub gists:
PGLMM.data.R: https://gist.github.com/1278205
PGLMM.fit.R: https://gist.github.com/1284284
PGLMM.reml.R: https://gist.github.com/1284287
PGLMM.sim.R: https://gist.github.com/1284288
PGLMM_example.R: https://gist.github.com/1284442
Enjoy!
The example
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
> # Simulate a dataset for the example | |
> modelflag <- 1 | |
> sim.dat <- PGLMM.sim(stree(16, "balanced"), nsites = 30, modelflag = modelflag, | |
+ second.env = TRUE, compscale = 1) | |
> sim.dat$Vphylo # I think this is the variance-covariance matrix from the phylogeny | |
t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 | |
t1 1.00 0.75 0.50 0.50 0.25 0.25 0.25 0.25 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 | |
t2 0.75 1.00 0.50 0.50 0.25 0.25 0.25 0.25 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 | |
t3 0.50 0.50 1.00 0.75 0.25 0.25 0.25 0.25 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 | |
t4 0.50 0.50 0.75 1.00 0.25 0.25 0.25 0.25 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 | |
t5 0.25 0.25 0.25 0.25 1.00 0.75 0.50 0.50 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 | |
t6 0.25 0.25 0.25 0.25 0.75 1.00 0.50 0.50 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 | |
t7 0.25 0.25 0.25 0.25 0.50 0.50 1.00 0.75 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 | |
t8 0.25 0.25 0.25 0.25 0.50 0.50 0.75 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 | |
t9 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.75 0.50 0.50 0.25 0.25 0.25 0.25 | |
t10 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.75 1.00 0.50 0.50 0.25 0.25 0.25 0.25 | |
t11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.50 0.50 1.00 0.75 0.25 0.25 0.25 0.25 | |
t12 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.50 0.50 0.75 1.00 0.25 0.25 0.25 0.25 | |
t13 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.25 0.25 0.25 0.25 1.00 0.75 0.50 0.50 | |
t14 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.25 0.25 0.25 0.25 0.75 1.00 0.50 0.50 | |
t15 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.25 0.25 0.25 0.25 0.50 0.50 1.00 0.75 | |
t16 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.25 0.25 0.25 0.25 0.50 0.50 0.75 1.00 | |
> # PGLMM.sim also automatically outputs three figures, shown below | |
> # Organizes the data so that PGLMM can be fit | |
> dat <- PGLMM.data(modelflag=modelflag,sim.dat=sim.dat) | |
> str(dat) # Structure of the data | |
List of 3 | |
$ YY: num [1:304, 1] 0 0 0 0 1 1 1 0 0 0 ... | |
$ VV:List of 2 | |
..$ Vfullspp : num [1:304, 1:304] 1 0.75 0.5 0.5 0.25 0.25 0.25 0.25 0 0 ... | |
..$ Vfullsite: num [1:304, 1:304] 1 1 1 1 1 1 1 1 1 1 ... | |
$ XX: num [1:304, 1:16] 1 0 0 0 0 0 0 0 0 0 ... | |
> # Simulate a dataset for the example | |
> # low number of iterations, maxit = 25 is probably good, but may need | |
> # to increase exitcountermax | |
> out <- PGLMM.fit(dat = dat, maxit = 25, exitcountermax = 30) | |
Loading required package: corpcor | |
exitcounter sigma 1 sigma 2 B 1 B 2 B 3 B 4 | |
0.000000 1.935156 0.315625 -1.321756 -2.140066 -1.029619 -1.029619 | |
[1] 1.000000000 2.531105003 0.003184433 -1.416240951 -2.293742824 -1.102306778 -1.101311848 | |
[1] 2.000000000 1.441218975 0.001114315 -1.497821485 -2.407445316 -1.174431967 -1.172422085 | |
[1] 3.0000000000 1.2060004973 0.0009031993 -1.4099134066 -2.2651017713 -1.1023243928 -1.1012119516 | |
[1] 4.0000000000 1.0707744528 0.0004762704 -1.4010924746 -2.2545781854 -1.0954740784 -1.0945435442 | |
[1] 5.0000000000 1.0624400709 0.0002050476 -1.3901393670 -2.2381045469 -1.0864806439 -1.0856860763 | |
[1] 6.000000e+00 1.141377e+00 3.644753e-05 -1.389817e+00 -2.237711e+00 -1.086224e+00 -1.085433e+00 | |
[1] 7.0000000000 1.2064541057 0.0000712796 -1.3962534327 -2.2474380739 -1.0915088447 -1.0906400661 | |
[1] 8.0000000000 1.2064541057 0.0000712796 -1.4017207377 -2.2557858161 -1.0959973521 -1.0950623973 | |
[1] 9.000000e+00 1.190569e+00 3.077974e-05 -1.401810e+00 -2.255942e+00 -1.096072e+00 -1.095136e+00 | |
[1] 10.0000000000 1.1694629518 0.0000782201 -1.4004829979 -2.2539137657 -1.0949830703 -1.0940630885 | |
[1] 11.0000000000 1.1694629518 0.0000782201 -1.3987246019 -2.2512324218 -1.0935389047 -1.0926402314 | |
> str(out) # Structure of the data | |
List of 5 | |
$ B : num [1:16, 1] -1.399 -2.251 -1.094 -1.093 -0.822 ... | |
$ B0 : num [1:16, 1] -1.322 -2.14 -1.03 -1.03 -0.773 ... | |
$ s : num [1:2, 1:3] 1.17 7.82e-05 1.05 0.00 1.29 ... | |
..- attr(*, "dimnames")=List of 2 | |
.. ..$ : NULL | |
.. ..$ : chr [1:3] "" "0.05" "0.95" | |
$ LL : num [1, 1] 454 | |
$ flag: chr "converged" | |
> out$B # coefficients from PGLMM | |
[,1] | |
[1,] -1.3987338 | |
[2,] -2.2512486 | |
[3,] -1.0935466 | |
[4,] -1.0926478 | |
[5,] -0.8215762 | |
[6,] -1.0993153 | |
[7,] -2.2693828 | |
[8,] -2.2693828 | |
[9,] -2.2007681 | |
[10,] -1.7226929 | |
[11,] -2.9483086 | |
[12,] -2.1973124 | |
[13,] -2.2204310 | |
[14,] -1.7485642 | |
[15,] -2.2259044 | |
[16,] -2.2230874 | |
> out$B0 # coefficients from standard logistic regression | |
[,1] | |
[1,] -1.3217558 | |
[2,] -2.1400662 | |
[3,] -1.0296194 | |
[4,] -1.0296194 | |
[5,] -0.7731899 | |
[6,] -1.0296194 | |
[7,] -2.1400662 | |
[8,] -2.1400662 | |
[9,] -2.1400662 | |
[10,] -1.6739764 | |
[11,] -2.8903718 | |
[12,] -2.1400662 | |
[13,] -2.1400662 | |
[14,] -1.6739764 | |
[15,] -2.1400662 | |
[16,] -2.1400662 | |
> out$s # parameter(s) of the covariance matrix | |
0.05 0.95 | |
[1,] 1.1694629518 1.05248 1.286471 | |
[2,] 0.0000782201 0.00000 1.153910 | |
> out$LL # log-likelihood | |
[,1] | |
[1,] 453.7648 | |
> out$flag # did model converge? | |
[1] "converged" |
..and the figures…
To leave a comment for the author, please follow the link and comment on their blog: Recology.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.