[This article was first published on mages' blog, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The Cologne R user group met last Friday for two talks on split apply combine in R and XLConnect by Bernd Weiß and Günter Faes respectively, before the usual Schnitzel and Kölsch at the Lux.
Split apply combine in R
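As a small illustration of the base R functions covered in this section, the following sketch runs aggregate, ave, tapply and by on a made-up data frame. The data and the variable names are assumptions for illustration only and are not taken from Bernd's slides.

```r
# Toy data: three groups with two observations each (values are made up).
df <- data.frame(
  group = c("a", "a", "b", "b", "c", "c"),
  value = c(1, 3, 10, 20, 5, 7)
)

# aggregate() reduces the output to one row per group.
agg <- aggregate(value ~ group, data = df, FUN = mean)

# ave() returns a vector as long as the input and repeats each group mean.
df$group_mean <- ave(df$value, df$group, FUN = mean)

# tapply() returns one value per group, named by group.
tap <- tapply(df$value, df$group, mean)

# The same split-apply-combine steps spelled out by hand.
spl <- sapply(split(df$value, df$group), mean)

# by() applies a function to each sub-data-frame.
by_range <- by(df, df$group, function(d) max(d$value) - min(d$value))

print(agg)
print(df)
```

Here agg has one row per group, whereas df$group_mean has one entry per input row, which is the difference between aggregate and ave described in this section.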
The apply family of functions in R is incredibly powerful, yet often somewhat mysterious to newcomers. Bernd therefore gave an overview of the different apply functions and their cousins. The various functions differ in the objects they take as input, e.g. vectors, arrays, data frames or lists, and in what they return. Other related functions are by, aggregate and ave. While functions like aggregate reduce the output size, others like ave return as many rows as the input object and repeat the results where necessary. As an alternative to the base R functions, Bernd also touched on the **ply functions of the plyr package. Their names are certainly easier to remember, but their syntax can be a little awkward (.()). Bernd's slides, in German, are already available from our Meetup site.
XLConnect
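For readers who have not used XLConnect, here is a minimal sketch of a round trip between R and an Excel file. It assumes the XLConnect package and a Java runtime are installed. The file name, sheet name and named range are invented for this example and do not come from Günter's talk.

```r
library(XLConnect)

# Hypothetical workbook in a temporary directory.
path <- file.path(tempdir(), "koelnr_example.xlsx")
if (file.exists(path)) file.remove(path)

# Create the workbook and write a data frame to a sheet.
wb <- loadWorkbook(path, create = TRUE)
createSheet(wb, name = "cars")
writeWorksheet(wb, mtcars, sheet = "cars")

# Define a named range on the same sheet and write a small table into it.
small <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))
createName(wb, name = "small_table", formula = "cars!$N$1")
writeNamedRegion(wb, small, name = "small_table")
saveWorkbook(wb)

# Read the data back: the whole sheet region, a defined block of rows
# and columns, and the named range.
wb2 <- loadWorkbook(path)
cars_part <- readWorksheet(wb2, sheet = "cars",
                           startRow = 1, endRow = 6,
                           startCol = 1, endCol = 3)
region <- readNamedRegion(wb2, name = "small_table")

print(cars_part)
print(region)
```

No Excel installation is involved at any point; the workbook is created and read entirely from R.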
When dealing with data stored in spreadsheets, most members of the group rely on read.csv and write.csv in R. However, if you have a spreadsheet with multiple tabs and formatted numbers, read.csv becomes clumsy, as you would have to save each tab without any formatting in a separate file. Günter presented the XLConnect package as an alternative to read.csv, or indeed RODBC, for reading spreadsheet data. It uses the Apache POI API as the underlying interface. XLConnect requires a Java runtime environment on your computer, but no installation of Excel. That makes it a truly platform-independent solution for exchanging data between spreadsheets and R. Not only can you read defined rows and columns, or indeed named ranges, from Excel into R, but you can also write data back to Excel files in the same way and, to top it all, store graphic output from R there as well.
Next Kölner R meeting
The next meeting is scheduled for 13 December 2013. A discussion of the data.table package is already on the agenda. Please get in touch if you would like to present and share your experience, or indeed if you have a request for a topic you would like to hear more about. For more details see also our Meetup page.
Thanks again to Bernd Weiß for hosting the event and Revolution Analytics for their sponsorship.