R: calculations involving months

July 7, 2011 | nsaunders

Ask anyone how much time has elapsed since September last year and they’ll probably start counting on their fingers: “October, November…” and tell you “just over 9 months.” So, when faced as I was today with a data frame (named dates) like this: How to add a 7th column, with ... [Read more...]

Syntax highlighting of R code at

May 20, 2011 | nsaunders

If your WordPress blog is hosted at (like this one), you may know that source code in posts is formatted and highlighted using a shortcode, as explained here. Until recently, R was not on the list of supported languages (neither was Perl), but I noticed today that both ... [Read more...]

Friday fun with: Google Trends

May 19, 2011 | nsaunders

Some years ago, Google discovered that when people are concerned about influenza, they search for flu-related information and that to some extent, search traffic is an indicator of flu activity. Google Flu Trends was born. Illness is sweeping through our department this week and I have succumbed. It’s not ... [Read more...]

Friday fun projects

May 14, 2011 | nsaunders

What’s a “Friday fun project”? It’s a small computing project, perfect for a Friday afternoon, which serves the dual purpose of (1) keeping your programming/data analysis skills sharp and (2) providing a mental break from the grind of your day job. Ideally, the skills learned on the project are ... [Read more...]

R 2.12 to 2.13 package upgrade

April 14, 2011 | nsaunders

If you: (1) use Linux; (2) have just upgraded your R installation from 2.12 to 2.13; (3) installed some/all of your packages in your home area (e.g. ~/R/i486-pc-linux-gnu-library/2.12); and are wondering why R can’t see them any more, just do this: # at a shell prompt cp ~/R/i486-pc-linux-gnu-library/2.12 ~/R/... [Read more...]

The RStudio IDE: first impressions are positive

February 28, 2011 | nsaunders

Integrated development environments (IDEs) are software development tools, providing an interface that enables you to write, debug, run and view the output of your code. Whether you need an IDE or find them useful depends very much on your own preferences and style of working. In my own case for ... [Read more...]

Analysis of retractions in PubMed

November 30, 2010 | nsaunders

As so often happens these days, a brief post at FriendFeed got me thinking about data analysis. Entitled “So how many retractions are there every year, anyway?”, the post links to this article at Retraction Watch. It discusses ways to estimate the number of retractions and in particular, a recent ... [Read more...]

Findings increasingly novel, scientists say…

October 29, 2010 | nsaunders

…was the tongue-in-cheek title of an image that I posted to Twitpic this week. It shows the usage of the word “novel” in PubMed article titles over time. As someone correctly pointed out at FriendFeed, it needs to be corrected for total publications per year. It was inspired by a ...

BioStar users (of the world, unite)

October 9, 2010 | nsaunders

Egon writes: Can someone please plot the BioStar users on a Google Map? Sounds like a challenge. Let’s go. 1. Harvesting user IP addresses BioStar user profiles (here’s mine) include a location field. It’s free text and optional, which means that location is missing or inaccurate for many ...

GEO database: curation lagging behind submission?

August 30, 2010 | nsaunders

I was reading an old post that describes GEOmetadb, a downloadable database containing metadata from the GEO database. We had a brief discussion in the comments about the growth in GSE records (user-submitted) versus GDS records (curated datasets) over time. Below, some quick and dirty R code to examine the ... [Read more...]

Abstract word clouds using R

August 23, 2010 | nsaunders

A recent question over at BioStar asked whether abstracts returned from a PubMed search could easily be visualised as “word clouds”, using Wordle. This got me thinking about ways to solve the problem using R. Here’s my first attempt, which demonstrates some functions from the RCurl and XML packages. ... [Read more...]

A brief introduction to “apply” in R

August 19, 2010 | nsaunders

At any R Q&A site, you’ll frequently see an exchange like this one: Q: How can I use a loop to [...insert task here...] ? A: Don’t. Use one of the apply functions. So, what are these wondrous apply functions and how do they work? I think the ... [Read more...]

Analysing the ISMB 2010 meeting using R

July 20, 2010 | nsaunders

The colossus of bioinformatics meetings, ISMB, convened in Boston this year from July 9 – 13. As in recent years, the meeting was covered online at its website, FriendFeed and Twitter. I thought it would be fun to run a quick analysis of activity at the FriendFeed room using R. 1. Fetch the data ...

biomaRt and GenomeGraphs: a worked example

June 6, 2010 | nsaunders

As promised a few posts ago, another demonstration of the excellent biomaRt package, this time in conjunction with GenomeGraphs. Here’s what we’re going to do: (1) grab some public microarray data; (2) normalise and get a list of the most differentially-expressed probesets; (3) use biomaRt to fetch the genes associated with ... [Read more...]

Beware of rogue header files (Bioconductor installation)

May 11, 2010 | nsaunders

Just a short note concerning a “gotcha”. As I have many times before, I opened an R console on my newly-upgraded (to lucid 10.04) Ubuntu machine, typed source(“”) and began a Bioconductor install with biocLite(). Only this time, I saw this: Error in dyn.load(file, ... [Read more...]

Experiments with igraph

April 21, 2010 | nsaunders

Networks – social and biological – are all the rage just now. Indeed, a recent entry at Duncan’s QOTD described the “hairball” network representation as the dominant cultural icon in molecular biology. I’ve not had occasion to explore networks “professionally”, but have always been fascinated by both networks and the ...
