Reading data from Google Sheets into R
Reading data from Google Sheets into R is something you imagine should be really simple, but it is often anything but. However, the googlesheets package goes a long way towards solving this problem.
Let’s crack on with an example.
First, install the software:
install.packages("googlesheets")
We then need an example sheet to work with, and I’m going to use one from Britain Elects:
"For the latest council by election results and polls, be they Westminster or #IndyRef, see: https://t.co/dPyiwIqJ3O"
Britain Elects (@britainelects), August 19, 2016
So go here and add this to your own collection (yes, you’ll need a Google account!) using the little “Add to my drive” icon to the right of the title, just next to the star.
Right, let’s play around a bit!
# load package
library(googlesheets)

# which google sheets do you have access to?
# may ask you to authenticate in a browser!
gs_ls()

# get the Britain Elects google sheet
be <- gs_title("Britain Elects / Public Opinion")
This should show you all of the sheets you have access to, and also select the one we want.
Now we can see which worksheets exist within the sheet:
# list worksheets
gs_ws_ls(be)
We can "download" one of the worksheets using gs_read():
# get Westminster voting
west <- gs_read(ss = be, ws = "Westminster voting intentions", skip = 1)

# convert to data.frame
wdf <- as.data.frame(west)
And hey presto, we have the data. Simples!
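If you're wondering what skip=1 is doing there: presumably the worksheet has an extra row sitting above the real column headers, and we want to throw it away. Here is a minimal sketch of the same idea using base R's read.csv() on a made-up stand-in (the text in raw and all the numbers are invented, not taken from the Britain Elects sheet), so you can try it without a Google account:

```r
# Stand-in for the worksheet: a title row, then the header, then data.
# Every value here is invented for illustration only.
raw <- "Illustrative title row,,,
Fieldwork end date,Y,Con,Lab
19 Aug,2016,40,29
12 Aug,2016,38,31
"

# skip = 1 drops the title row, so the second line becomes the header.
# check.names = FALSE keeps "Fieldwork end date" as-is, spaces and all.
demo <- read.csv(text = raw, skip = 1, check.names = FALSE,
                 stringsAsFactors = FALSE)

names(demo)

# Assuming the newest poll comes first, reverse the rows so that
# time runs forwards, without hard-coding the number of rows.
demo_fwd <- demo[rev(seq_len(nrow(demo))), ]
demo_fwd$Con
```

Column names with spaces in them need backticks or quotes when you use $, for example demo$`Fieldwork end date`.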
Now let's plot it:
# reverse so that data are forward-sorted by time
# (the sheet had 2769 rows when this was written; nrow() avoids hard-coding that)
n <- nrow(wdf)
wdf <- wdf[n:1, ]

# treat all dates as a single point on x-axis
dates <- 1:n

# smooth parameter
fs <- 0.04

# plot conservative
plot(lowess(dates, wdf$Con, f = fs), type = "l", col = "blue", lwd = 5,
     ylim = c(0, 65), xaxt = "n", xlab = "", ylab = "%",
     main = "Polls: Westminster voting intention")

# add labels (every 40th poll)
ticks <- seq(1, n, by = 40)
axis(side = 1, at = dates[ticks],
     labels = paste(wdf$`Fieldwork end date`, wdf$Y)[ticks],
     las = 2, cex.axis = 0.8)

# plot labour and libdem
lines(lowess(dates, wdf$Lab, f = fs), col = "red", lwd = 5)
lines(lowess(dates, wdf$LDem, f = fs), col = "orange", lwd = 5)

# add UKIP, and we treat absent values as -50 for plotting purposes
ukip <- wdf$UKIP
ukip[is.na(ukip)] <- -50
lines(lowess(dates, ukip, f = fs), col = "purple", lwd = 5)

# add legend
legend(1, 65, legend = c("Con", "Lab", "LibDem", "UKIP"),
       fill = c("blue", "red", "orange", "purple"))
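The -50 trick for UKIP works because those points fall off the bottom of the plot, but the smoothed line can get dragged down where the missing values meet the real ones. One alternative, sketched here on simulated data (x, y and the f value are all made up for the example, not the real polls), is to smooth only the rows where a value actually exists:

```r
set.seed(1)

# Simulated series: 100 "polls", with the first 30 missing, as if the
# party was not being reported yet. Values are invented.
x <- 1:100
y <- 12 + sin(x / 10) + rnorm(100, sd = 0.5)
y[1:30] <- NA

# Keep only the polls that have a value, then smooth those.
ok <- !is.na(y)
sm <- lowess(x[ok], y[ok], f = 0.3)

# lowess() returns a list with components x and y, so the result
# only covers the x positions where data were present.
range(sm$x)
```

The result can be passed straight to lines(), just like in the plot above, e.g. lines(sm, col = "purple", lwd = 5), and the line simply starts at the first poll that has a value.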