Site icon R-bloggers

Plotting Factor Analysis Results

[This article was first published on Minding the Brain, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
A recent factor analysis project (as discussed previously here, here, and here) gave me an opportunity to experiment with some different ways of visualizing highly multidimensional data sets. Factor analysis results are often presented in tables of factor loadings, which are good when you want the numerical details, but bad when you want to convey larger-scale patterns – loadings of 0.91 and 0.19 look similar in a table but very different in a graph. The detailed code is posted on RPubs because embedding the code, output, and figures in a webpage is much, much easier using RStudio’s markdown functions. That version shows how to get these example data and how to format them correctly for these plots. Here I will just post the key plot commands and figures those commands produce. 


First, a bar graph showing each measure’s factor loadings with each factor in a separate facet (subplot):


#For each test, plot the loading as length and fill color of a bar
# note that the length will be the absolute value of the loading but the 
# fill color will be the signed value, more on this below
ggplot(loadings.m, aes(Test, abs(Loading), fill=Loading)) + 
  facet_wrap(~ Factor, nrow=1) + #place the factors in separate facets
  geom_bar(stat="identity") + #make the bars
  coord_flip() + #flip the axes so the test names can be horizontal  
  #define the fill color gradient: blue=positive, red=negative
  scale_fill_gradient2(name = "Loading", 
                       high = "blue", mid = "white", low = "red", 
                       midpoint=0, guide=F) +
  ylab("Loading Strength") + #improve y-axis label
  theme_bw(base_size=10) #use a black-and-white theme with set  size


Fig. 1 from Mirman et al., 2015, Nature Communications
Second, the full pairwise correlation matrix with a stacked bar graph showing each measure’s (absolute) loading on each factor:


library(grid) #for adjusting plot margins
#place the tests on the x- and y-axes, 
#fill the elements with the strength of the correlation
p1 <- ggplot(corrs.m, aes(Test2, Test, fill=abs(Correlation))) + 
  geom_tile() + #rectangles for each correlation
  #add actual correlation value in the rectangle
  geom_text(aes(label = round(Correlation, 2)), size=2.5) + 
  theme_bw(base_size=10) + #black and white theme with set  size
  #rotate x-axis labels so they don't overlap, 
  #get rid of unnecessary axis titles
  #adjust plot margins
  theme(axis.text.x = element_text(angle = 90), 
        axis.title.x=element_blank(), 
        axis.title.y=element_blank(), 
        plot.margin = unit(c(3, 1, 0, 0), "mm")) +
  #set correlation fill gradient
  scale_fill_gradient(low="white", high="red") + 
  guides(fill=F) #omit unnecessary gradient legend
 
p2 <- ggplot(loadings.m, aes(Test, abs(Loading), fill=Factor)) + 
  geom_bar(stat="identity") + coord_flip() + 
  ylab("Loading Strength") + theme_bw(base_size=10) + 
  #remove labels and tweak margins for combining with the correlation matrix plot
  theme(axis.text.y = element_blank(), 
        axis.title.y = element_blank(), 
        plot.margin = unit(c(3,1,39,-3), "mm"))
library(gridExtra) #for combining the two plots
grid.arrange(p1, p2, ncol=2, widths=c(2, 1)) #side-by-side, matrix gets more space



Fig. 2 from Mirman et al., in press, Neuropsychologia

To leave a comment for the author, please follow the link and comment on their blog: Minding the Brain.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.