What we’re reading — and how it ties us together
tl;dr: Network of an interdisciplinary environmental social science lab as tied together by the journals we read. A few key journals, especially Social Networks, hold us together. R code follows.
The Center for Environmental Policy and Behavior, my grad lab, is remarkably interdisciplinary. For some sense of our breadth, consider that our nine core graduate students represent five different graduate programs: Ecology, Geography, Hydrology, Political Science, and Transportation Technology and Policy. That’s great for many reasons, not least that it’s an intellectually exciting environment in which to live, but it sometimes leaves me wondering what ties us together. So I thought I’d see if the journals we read could answer that question.
I asked everyone in the lab what journals they routinely read and used the responses to construct a bipartite network (with people and journals being the two modes). This was my first time working with bipartite network data, so I played around with different projections, visualizations, and centrality measures to get a better understanding of how we fit together via what we read, as well as how different approaches to examining bipartite networks succeed and fail.
Here’s the raw bipartite network, with people in red and journals in blue. What immediately stands out is that most of the lab is reading a whole bunch of stuff that no one else in the lab is reading. That’s potentially really useful – we go off and gather information and methods from different fields and bring back the best of it (hopefully) to share. The risk is that people might feel isolated and/or lack opportunities for conversations about their interests, and there’s often no one to fall back on in case you missed or misunderstood something (i.e., there’s a lack of redundancy). To some extent those risks are ameliorated by other contacts – many of us have secondary labs we interact with, and there are also our graduate groups, various IGERTs, classes, campus initiatives, etc. – but since we share office space we naturally spend most of our workdays interacting within the lab, so within-lab contacts (I suspect) represent the bulk of our academic contact.
Having noted how much of each person’s reading is unique to them, let’s clean things up by removing those journals. We are left with only journals that two or more people read. Now it’s easy to see communities and the key journals that connect them. In very rough terms, we’ve got an ecology cluster in the upper left around Ecology and Society; a small political science cluster below it around APSR; perhaps a climate change community in the lower left; a policy/governance community on the right; and network analysis up top.
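As a rough illustration of that filtering step, here is a minimal base-R sketch. It assumes the network is held as a people-by-journals incidence matrix; the matrix `inc` and its person and journal names are made up for the example and are not the lab's data or the original statnet code.

```r
# Toy people-by-journals incidence matrix (1 = person reads journal).
# All names and values are invented for illustration.
inc <- matrix(
  c(1, 1, 0, 1,
    1, 0, 1, 0,
    0, 1, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("personA", "personB", "personC"),
                  c("journal1", "journal2", "journal3", "journal4"))
)

# Number of readers per journal is the column sum.
readers <- colSums(inc)

# Keep only journals read by two or more people.
# drop = FALSE keeps the result a matrix even if one column survives.
shared <- inc[, readers >= 2, drop = FALSE]
print(shared)
```

Using `drop = FALSE` matters in small labs, where the filter can easily leave a single journal and R would otherwise collapse the matrix to a vector.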
To get a better sense of how the people in the lab are connected through journals, I plotted a one-mode projection of the bipartite network. I don’t love statnet for visualization. It is adequate for basic stuff like the above plots, but it’s not what it’s built for. I don’t particularly like igraph’s visualizations either, and I wanted to keep the workflow in R, which as far as I know rules out Gephi, so I did a bit of searching and stumbled upon ggnet – a ggplot implementation (via the GGally package) of network plotting. It doesn’t nearly harness all the power of ggplot, and it only took a couple of minutes to run into its limits, but it’s a nice start and so far, for me at least, it beats statnet’s native plotting functions.
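For readers new to bipartite data, a one-mode projection boils down to a matrix product. The sketch below uses plain base R on a made-up incidence matrix (the names are hypothetical); it is not the statnet or ggnet code used for the plot, just the underlying arithmetic.

```r
# Toy people-by-journals incidence matrix (invented data).
inc <- matrix(
  c(1, 1, 0, 1,
    1, 0, 1, 0,
    0, 1, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("personA", "personB", "personC"),
                  c("journal1", "journal2", "journal3", "journal4"))
)

# People projection: entry [i, j] counts journals that i and j both read.
people <- inc %*% t(inc)
diag(people) <- 0   # drop self-ties (the diagonal is each person's journal count)

# Journal projection: entry [i, j] counts people who read both journals.
journals <- t(inc) %*% inc
diag(journals) <- 0

print(people)
print(journals)
```

The off-diagonal counts can be used as tie weights, or dichotomized with `people > 0` if only the presence of a shared journal matters.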
One thing that stands out here is (reassuringly) that the two professors who run the lab are at the center of the network. No surprise there. There also seems to be a major community formed around the Ecology Graduate Group… All of the graduate students and alumni in the lower-right part of the graph are/were Ecology students. I suspect my (Michael) centrality in the network is largely a function of my having seen everyone else’s lists of journals before creating my own, making the most common titles easily accessible to me when I was writing my list.
Rather than just speculating, let’s bring some network analysis tools to bear on the question of what the key journals are. I calculated three commonly used centrality scores for all the journals mentioned by more than one person, first on a one-mode projection of journals, then for journals in the two-mode network.
By all measures, Social Networks is the most central journal in our lab. This is interesting. A majority of lab members use social network analysis (SNA) as a primary tool, and on one hand, it makes sense that for a group whose topics differ, a methodological approach unites. On the other hand, there are quite a few people in the lab who don’t use SNA at all, and given that we are the Center for Environmental Policy and Behavior, one might expect a policy- and/or environment-focused journal to be more central.
PNAS, being a fully interdisciplinary journal, ranks unsurprisingly high. I find it interesting though that its eigenvector (EV) centrality is lower than the other measures. EV centrality gets at how central the nodes you connect to are, so perhaps it makes sense that a journal that many people look at but that is of key importance to no one ranks lower on EV centrality than degree centrality. Conversely, PSJ has high EV centrality but scores somewhat lower by the other measures, I suspect on account of its connecting Gwen at the network core to the non-Ecology part of the network. Finally, Ecology and Society ranks highly across measures, due to its being nominated by many people in the “ecology community” of the lab.
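To make the difference between degree and eigenvector centrality concrete, here is a minimal base-R sketch on a tiny one-mode network. The adjacency matrix and node names (`j1` to `j4`) are invented, and the calculation uses `eigen()` directly rather than the statnet functions used for the results above.

```r
# Toy undirected one-mode network: j1 is tied to everyone, j2 and j3 are
# also tied to each other, j4 is tied only to j1. Invented data.
nodes <- c("j1", "j2", "j3", "j4")
adj <- matrix(0, 4, 4, dimnames = list(nodes, nodes))
ties <- rbind(c("j1", "j2"), c("j1", "j3"), c("j1", "j4"), c("j2", "j3"))
for (k in seq_len(nrow(ties))) {
  adj[ties[k, 1], ties[k, 2]] <- 1
  adj[ties[k, 2], ties[k, 1]] <- 1
}

# Degree centrality: number of ties per node.
deg <- rowSums(adj)

# Eigenvector centrality: leading eigenvector of the adjacency matrix.
# For a symmetric matrix, eigen() returns eigenvalues in decreasing order,
# so the first column is the one we want. abs() fixes the arbitrary sign.
ev <- abs(eigen(adj, symmetric = TRUE)$vectors[, 1])
names(ev) <- nodes
ev <- ev / max(ev)   # rescale so the most central node scores 1

print(cbind(degree = deg, eigenvector = round(ev, 3)))
```

Here `j4` has only one tie, but that tie is to the hub, so its eigenvector score depends on how central `j1` is rather than only on its own degree.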
I’m not sure why eigenvector centrality scores couldn’t be calculated for the bipartite network, but statnet threw a warning message about matrix pathology, and there’s clearly something wrong since AJPS is structurally equivalent to JPART but their scores are different.
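The sanity check used here rests on structural equivalence: two journals read by exactly the same set of people have identical columns in the incidence matrix, so any centrality score computed from the network should be the same for both. A minimal base-R sketch of that check, on invented data with hypothetical journal names:

```r
# Toy people-by-journals incidence matrix (invented data).
inc <- cbind(
  journalX = c(1, 1, 0),
  journalY = c(1, 1, 0),
  journalZ = c(0, 1, 1)
)
rownames(inc) <- c("personA", "personB", "personC")

# Two journals are structurally equivalent in the two-mode network
# when they have exactly the same readers, i.e. identical columns.
same_readers <- function(m, a, b) identical(m[, a], m[, b])

same_readers(inc, "journalX", "journalY")
same_readers(inc, "journalX", "journalZ")
```

If two journals pass this check but receive different scores, the problem lies in the computation rather than in the data.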
Here are the R scripts I wrote for these analyses. They should be plug-and-play if you want to do a similar analysis for your lab, or any bipartite network for that matter. The input file, journalsByPerson.csv, is structured with names in the first row and the journals each person reads listed under their names.
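For anyone preparing their own input, here is a minimal base-R sketch of reading a file in that layout and turning it into a people-by-journals incidence matrix. It writes a tiny made-up file to a temporary path so that it runs on its own; the names and journals are invented, and this is an illustration of the layout, not the original script.

```r
# Write a small made-up file in the layout described above:
# names in the first row, each person's journals listed below their name.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "personA,personB,personC",
  "journal1,journal1,journal2",
  "journal2,journal3,",
  "journal4,,"
), csv_path)

# Lists differ in length, so short columns are padded with empty cells;
# na.strings = "" turns those into NA. check.names = FALSE keeps names
# with spaces intact.
raw <- read.csv(csv_path, stringsAsFactors = FALSE,
                na.strings = "", check.names = FALSE)

# Flatten to one (person, journal) pair per row, dropping the padding.
person  <- rep(names(raw), each = nrow(raw))
journal <- unlist(raw, use.names = FALSE)
keep    <- !is.na(journal)

# Cross-tabulate into an incidence matrix (1 = person reads journal).
inc <- unclass(table(person[keep], journal[keep]))
print(inc)
```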
I try to code by the “don’t repeat yourself” maxim, but at the end of the first script, I manually call a plotting function repeatedly with different arguments. I know there’s a way to do it with do.call(), but I got too fed up trying to structure the arguments list and had to move on. Suggestions are welcome.
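One possible way to structure the `do.call()` approach is to keep each call's arguments in its own named list and loop over a list of those lists. The sketch below uses a hypothetical stand-in, `plot_net()`, which returns a label instead of drawing so that the example runs anywhere; the argument names are assumptions, not those of the original plotting function.

```r
# Hypothetical stand-in for a plotting function with some default arguments.
plot_net <- function(net, title, vertex.cex = 1, label = FALSE) {
  sprintf("%s | %s | cex=%s | label=%s", net, title, vertex.cex, label)
}

# One named list per call; only non-default arguments need to be listed.
arg_sets <- list(
  list(net = "bipartite",  title = "All journals"),
  list(net = "bipartite",  title = "Shared journals", label = TRUE),
  list(net = "projection", title = "People", vertex.cex = 2)
)

# do.call() matches each list's names to the function's arguments.
results <- lapply(arg_sets, function(args) do.call(plot_net, args))
print(results)
```

Adding another plot then means adding one more list to `arg_sets` rather than another hand-written call.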
Main script
Functions