Site icon R-bloggers

Phylometa from R: Randomization via Tip Shuffle

[This article was first published on Recology, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I wrote earlier about some code I wrote for running Phylo(software to do phylogenetic meta-analysis) from R.

I have been concerned about what exactly is the right penalty for including phylogeny in a meta-analysis. E.g.: AIC is calculated from Q in Phylometa, and Q increases with tree size.

So, I wrote some code to shuffle the tips of your tree N number of times, run Phylometa, and extract just the “Phylogenetic MA” part of the output. So, we compare the observed output (without tip shuffling) to the distribution of the tip shuffled output, and we can calculate a P-value from that. The code I wrote simply extracts the pooled effect size for fixed and also random-effects models. But you could change the code to extract whatever you like for the randomization.

I think the point of this code is not to estimate your pooled effects, etc., but may be an alternative way to compare traditional to phylogenetic MA where hopefully simply incorporating a tree is not penalizing the meta-analysis so much that you will always accept the traditional MA as better.

Get the code here, and also below. Get the example tree file and data file, named “phylogeny.txt” and “metadata_2g.txt”, respectively below (or use your own data!). You need the file “phylometa_fxn.r” from my website, get here, but just call it using source as seen below.
#####################################################################
# Created by Scott Chamberlain       
# Ecology and Evolutionary Biology Dept., Rice University    
# Houston, TX 77005, USA       
# myrmecocystus@gmail.com  
# http://schamber.wordpress.com/
#####################################################################
 
##### Directions
#Put your phylogeny, in format required by phylo(see Phylomanual), in your working directory
#Put your meta-analysis dataset, in format required by phylo(see Phylomanual), in your working directory
#Set working directory
#Use below functions
#Beware: only use a moderator variable with up to 6 groups 
 
# Install packages
install.packages(c("plyr","ggplot2","ape","stringr")) # if not already installed
 
# Load packages
require(ggplot2); require(ape); require(stringr)
 
# Call and run functions (used below) in the working directory [NOTE:CHANGE TO YOUR WORKING DIRECTORY]
source("Z:/Mac/R_stuff/Blog_etc/Phylometa_randomization/phylometa_fxn.r")
 
####### Fxn to do randomization equivalent of Phyloanalysis
# phy = name of phylogeny file
# dat = name of data file in Phyloformat
# reps = number of randomized trees per real tree
# groups = number of levels in grouping/moderator variable
# Note that trees are written to directory so that Phylocan call them
# Output is a jpeg of the histograms of randomized results, and vertical line for observed value
#  Also, p-values for fixed and random-effects are output in the console
PhylogMetaRand <- function(phy, dat, reps, groups) {
 
 # Observed data
 obsout <- phylometa.process(system(paste("\"new_phyloMeta_1.2b.exe\"", phy, dat), intern=T), groups)
 subobsout <- subset(obsout[4][[1]], Group == "All studies")
 obsfixedeffect <- subobsout[1,3]
 obsrandeffect <- subobsout[2,3]
 
 # Randomization of data by shuffling tips on tree
      phylo <- read.tree(phy)
      rtree_list <- list()
     for(i in 1:reps) {
         temp <- phylo
         randtips <- sample(temp$tip.label, length(temp$tip.label))
         temp$tip.label <- randtips
         rtree_list[[i]] <- temp
     }
 
  treenames <- list()
     for(i in 1:reps){
         treenames[[i]] <- paste("tree",i,".txt","",sep="")
     }
     WriteTrees2(rtree_list)
 
     fixedrandeffects <- data.frame(matrix(NA, reps, 2))
     for(i in 1:length(treenames)) {
         cat(i, "\n")
         out <- system(paste("\"new_phyloMeta_1.2b.exe\"", treenames[[i]], dat), intern=T)
         fixrandout <- subset(phylometa.process(out, groups)[4][[1]], Group == "All studies")
         fixedrandeffects[i,] <- c(fixrandout[1,3], fixrandout[2,3])
     }
     names(fixedrandeffects) <- c("fixed", "random")
 
 # Plot results
 fixed_ <- ggplot(fixedrandeffects, aes(x = fixed)) + 
     geom_histogram() +
     geom_vline(xintercept = obsfixedeffect, colour="red", size=1)
 random_ <- ggplot(fixedrandeffects, aes(x = random)) + 
     geom_histogram() +
     geom_vline(xintercept = obsrandeffect, colour="red", size=1)
 jpeg(paste(dat, ".jpeg", sep=""))
 arrange_(fixed_, random_)
 dev.off()
 
 # Calculate randomization P-value
 num_greater_than_fixed <- length(which(fixedrandeffects$fixed >= obsfixedeffect))
 pval_fixed <- num_greater_than_fixed / reps
 num_greater_than_random <- length(which(fixedrandeffects$random >= obsrandeffect))
 pval_random <- num_greater_than_random / reps
 pvals <- c(pval_fixed, pval_random)
 names(pvals) <- c("pval_fixed","pval_random")
 pvals
}


An example:
setwd("Z:/Mac/R_stuff/Blog_etc/Phylometa_randomization/")
PhylogMetaRand(phy = "phylogeny.txt", dat = "metadata_2g.txt", reps = 100, groups = 2)

As you can see, the observed values fall well within the distribution of values obtained from shuffling tips.  P-values were 0.64 and 0.68 for fixed- and random-effects MA’s, respectively.

To leave a comment for the author, please follow the link and comment on their blog: Recology.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.