Import/Export data to and from xlsx files
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
As I've already written, getting data into R straight from your precious xlsx files is really handy. No need to clutter up your computer with txt or csv files. My previous post covered the gdata package for importing data from xlsx files, and I was pointed to, among others, the xlsx package. xlsx seems to be a good package: easy to use and, importantly, fast. It's based on Java, but it comes with all the relevant jar files in an accompanying package, which installs automatically if you have the install dependencies setting set to TRUE.
Reading a file with xlsx works like any other read function; you just need to tell it which sheet to read, either by name (the sheetName argument) or by number (sheetIndex):
library(xlsx)
dat <- read.xlsx("testfile.xlsx", sheetName="")  # put the name of the sheet to read between the quotes
There are various options that other packages for importing Excel files don't seem to have, such as rowIndex and colIndex for specifying which rows or columns you want to import. There is also a second function, read.xlsx2, which is apparently an order of magnitude faster on particularly big files. Once you've selected the data and run the code, you can happily work with the data.
Writing to xlsx files can be useful too, for storage or for sharing data with people who don't use R, for instance. This is dead easy with xlsx!
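As a rough sketch of the rowIndex and colIndex options and of read.xlsx2, the snippet below first writes a small example file so that it can be run on its own. The temporary file and the choice of the mtcars dataset are assumptions made for illustration, not part of the original workflow.

```r
library(xlsx)

# Write a small example file to a temporary location (illustrative only)
tmp <- tempfile(fileext = ".xlsx")
write.xlsx(mtcars, tmp, sheetName = "mtcars", row.names = FALSE)

# Read spreadsheet rows 1 to 6 (the header row plus five data rows)
# and only the first three columns
part <- read.xlsx(tmp, sheetIndex = 1, rowIndex = 1:6, colIndex = 1:3)

# Read the whole sheet with the faster read.xlsx2; with its default
# colClasses the columns come back as text, so convert them if needed
all_fast <- read.xlsx2(tmp, sheetIndex = 1)
```

If you need numeric columns from read.xlsx2, its colClasses argument is the place to ask for them.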
If you want just a single dataframe in the workbook you simply do something like the following:
data(cars)
write.xlsx(cars, "cars_dataframe.xlsx")
To create a new file containing multiple dataframes from R, you first create the workbook, add sheets to that workbook and then add the dataframes to the sheets and save the workbook to whatever file you want.
# the workbook is called wb so that data(cars) does not overwrite it
wb <- createWorkbook()
cars1 <- createSheet(wb=wb, sheetName="Cars")
cars2 <- createSheet(wb=wb, sheetName="MTCars")
data(cars); data(mtcars)
addDataFrame(x=cars, sheet=cars1)
addDataFrame(x=mtcars, sheet=cars2)
saveWorkbook(wb, "Cars_datasets.xlsx")
By default this writes both column and row names, but you can override that with the row.names or col.names arguments of the addDataFrame function. You can also place a dataframe at a particular starting position in the sheet using the startRow and startColumn arguments of addDataFrame.
There's also some funky styling stuff you can do using the CellStyle, Fill, Alignment, Font and setCellStyle functions, of the following sort (from ?CellStyle):
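For instance, the following sketch drops the row names and places the data a little way into the sheet. The starting position (row 3, column 2) and the temporary output file are arbitrary choices made for illustration.

```r
library(xlsx)

wb <- createWorkbook()
sheet <- createSheet(wb, sheetName = "Cars")

# Write the cars dataset without row names, with its top-left
# corner at row 3, column 2 of the sheet
addDataFrame(cars, sheet, row.names = FALSE, startRow = 3, startColumn = 2)

# Save to a temporary file (illustrative path)
out <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, out)
```

Leaving a few empty rows above the data like this can be handy if you want to add a title or notes later.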
wb <- createWorkbook()
sheet <- createSheet(wb, "Sheet1")
rows <- createRow(sheet, rowIndex=1)
cell.1 <- createCell(rows, colIndex=1)[[1,1]]
setCellValue(cell.1, "Hello R!")
cs <- CellStyle(wb) +
  Font(wb, heightInPoints=20, isBold=TRUE, isItalic=TRUE, name="Courier New", color="orange") +
  Fill(backgroundColor="lavender", foregroundColor="lavender", pattern="SOLID_FOREGROUND") +
  Alignment(h="ALIGN_RIGHT")
setCellStyle(cell.1, cs)
# you need to save the workbook now if you want to see this art
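Styles can also be combined with addDataFrame. The sketch below, which is my own illustration rather than something taken from the help pages, makes the header row bold and centred through the colnamesStyle argument and then saves the workbook so the result can be opened. The style choices and the temporary file are assumptions.

```r
library(xlsx)

wb <- createWorkbook()
sheet <- createSheet(wb, sheetName = "Cars")

# A simple header style: bold text, centred horizontally
header_style <- CellStyle(wb) +
  Font(wb, isBold = TRUE) +
  Alignment(horizontal = "ALIGN_CENTER")

# Apply the style to the column names only
addDataFrame(cars, sheet, row.names = FALSE, colnamesStyle = header_style)

# Save the workbook so the styling can be inspected (illustrative path)
styled_file <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, styled_file)
```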
Enjoy!