Phylometa from R: Randomization via Tip Shuffle
[This article was first published on Recology, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I wrote earlier about some code I wrote for running Phylometa (software to do phylogenetic meta-analysis) from R.Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I have been concerned about what exactly is the right penalty for including phylogeny in a meta-analysis. E.g.: AIC is calculated from Q in Phylometa, and Q increases with tree size.
So, I wrote some code to shuffle the tips of your tree N number of times, run Phylometa, and extract just the “Phylogenetic MA” part of the output. So, we compare the observed output (without tip shuffling) to the distribution of the tip shuffled output, and we can calculate a P-value from that. The code I wrote simply extracts the pooled effect size for fixed and also random-effects models. But you could change the code to extract whatever you like for the randomization.
I think the point of this code is not to estimate your pooled effects, etc., but may be an alternative way to compare traditional to phylogenetic MA where hopefully simply incorporating a tree is not penalizing the meta-analysis so much that you will always accept the traditional MA as better.
Get the code here, and also below. Get the example tree file and data file, named “phylogeny.txt” and “metadata_2g.txt”, respectively below (or use your own data!). You need the file “phylometa_fxn.r” from my website, get here, but just call it using source as seen below.
##################################################################### # Created by Scott Chamberlain # Ecology and Evolutionary Biology Dept., Rice University # Houston, TX 77005, USA # [email protected] # http://schamber.wordpress.com/ ##################################################################### ##### Directions #Put your phylogeny, in format required by phylometa (see Phylometa manual), in your working directory #Put your meta-analysis dataset, in format required by phylometa (see Phylometa manual), in your working directory #Set working directory #Use below functions #Beware: only use a moderator variable with up to 6 groups # Install packages install.packages(c("plyr","ggplot2","ape","stringr")) # if not already installed # Load packages require(ggplot2); require(ape); require(stringr) # Call and run functions (used below) in the working directory [NOTE:CHANGE TO YOUR WORKING DIRECTORY] source("Z:/Mac/R_stuff/Blog_etc/Phylometa_randomization/phylometa_fxn.r") ####### Fxn to do randomization equivalent of Phylometa analysis # phy = name of phylogeny file # dat = name of data file in Phylometa format # reps = number of randomized trees per real tree # groups = number of levels in grouping/moderator variable # Note that trees are written to directory so that Phylometa can call them # Output is a jpeg of the histograms of randomized results, and vertical line for observed value # Also, p-values for fixed and random-effects are output in the console PhylogMetaRand <- function(phy, dat, reps, groups) { # Observed data obsout <- phylometa.process(system(paste("\"new_phyloMeta_1.2b.exe\"", phy, dat), intern=T), groups) subobsout <- subset(obsout[4][[1]], Group == "All studies") obsfixedeffect <- subobsout[1,3] obsrandeffect <- subobsout[2,3] # Randomization of data by shuffling tips on tree phylo <- read.tree(phy) rtree_list <- list() for(i in 1:reps) { temp <- phylo randtips <- sample(temp$tip.label, length(temp$tip.label)) temp$tip.label <- randtips rtree_list[[i]] <- temp } treenames <- list() for(i in 1:reps){ treenames[[i]] <- paste("tree",i,".txt","",sep="") } WriteTrees2(rtree_list) fixedrandeffects <- data.frame(matrix(NA, reps, 2)) for(i in 1:length(treenames)) { cat(i, "\n") out <- system(paste("\"new_phyloMeta_1.2b.exe\"", treenames[[i]], dat), intern=T) fixrandout <- subset(phylometa.process(out, groups)[4][[1]], Group == "All studies") fixedrandeffects[i,] <- c(fixrandout[1,3], fixrandout[2,3]) } names(fixedrandeffects) <- c("fixed", "random") # Plot results fixed_ <- ggplot(fixedrandeffects, aes(x = fixed)) + geom_histogram() + geom_vline(xintercept = obsfixedeffect, colour="red", size=1) random_ <- ggplot(fixedrandeffects, aes(x = random)) + geom_histogram() + geom_vline(xintercept = obsrandeffect, colour="red", size=1) jpeg(paste(dat, ".jpeg", sep="")) arrange_(fixed_, random_) dev.off() # Calculate randomization P-value num_greater_than_fixed <- length(which(fixedrandeffects$fixed >= obsfixedeffect)) pval_fixed <- num_greater_than_fixed / reps num_greater_than_random <- length(which(fixedrandeffects$random >= obsrandeffect)) pval_random <- num_greater_than_random / reps pvals <- c(pval_fixed, pval_random) names(pvals) <- c("pval_fixed","pval_random") pvals }
An example:
setwd("Z:/Mac/R_stuff/Blog_etc/Phylometa_randomization/") PhylogMetaRand(phy = "phylogeny.txt", dat = "metadata_2g.txt", reps = 100, groups = 2)
As you can see, the observed values fall well within the distribution of values obtained from shuffling tips. P-values were 0.64 and 0.68 for fixed- and random-effects MA's, respectively.
To leave a comment for the author, please follow the link and comment on their blog: Recology.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.