Site icon R-bloggers

R – Analyze any data frame in Saiku

[This article was first published on R (en) - Analytik dat, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In my previous article I have shown how R can be used to analyze PostgreSQL tables in Saiku using dynamically generated OLAP cubes. Today I will show you how you can analyze any R data frame in Saiku. WIth Saiku you can easily create excel-like pivot tables, charts, export PDF/pictures, zoom through tables and do other cool things.

 Pre-requisites

This is what you need to have installed to make it work:

Note: I have Ubuntu 13.04 and unfortunately I cannot test this on any other system right now.

Installation

Download the code and create function GenerateCubeSQLite function in R. Remember that you need packages sqldf and RSQLite installed.

Examples

First example uses already available data frame mtcars. Run the following code (you need to set valid values for CubeDestination and DataSourceDestination parameters):

mtcars2 = mtcars
mtcars2$id=seq(1:nrow(mtcars2))
mtcars2$vs=as.integer(mtcars2$vs)
mtcars2$am=as.integer(mtcars2$am)
mtcars2$gear=as.integer(mtcars2$gear)
mtcars2$carb=as.integer(mtcars2$carb)
# Do not forget to change CubeDestination and DataSourceDestination to valid values
GenerateCubeSQLite(mtcars2,PrimaryKey='id',CubeDestination='/home/tgr/Applications/Saiku/saiku-server/tomcat/webapps/saiku/WEB-INF/classes/foodmart/test.xml',DataSourceDestination='/home/tgr/Applications/Saiku/saiku-server/tomcat/webapps/saiku/WEB-INF/classes/saiku-datasources/test')
If everything goes well, you just have to reload cubes in Saiku, choose newly generated Cube named mtcars2:
 
Second example is based on my own sample data frame named big_portfolio. You can download the dataset here. Store the dataset in your working directory, import the dataset and call the function GenerateCubeSQLite again:
# Example 2
big_portfolio=read.csv(file='big_portfolio.csv',header=TRUE,sep=';',stringsAsFactors=FALSE)
big_portfolio$origination_date=as.Date(big_portfolio$origination_date)
big_portfolio$repayment_date=as.Date(big_portfolio$repayment_date,format='%Y-%m-%d')
str(big_portfolio)
GenerateCubeSQLite(big_portfolio,PrimaryKey='id',CubeDestination='/home/tgr/Applications/Saiku/saiku-server/tomcat/webapps/saiku/WEB-INF/classes/foodmart/test.xml',DataSourceDestination='/home/tgr/Applications/Saiku/saiku-server/tomcat/webapps/saiku/WEB-INF/classes/saiku-datasources/test')
If everything goes well, refresh cubes in Saiku and play with data. You can get something like this for example:
 
Note that this time we also have some date columns. GenerateCubeSQLite generated time dimension automatically for us, so we can easily group data by months/quarters/years

How it works

Saiku can work with data stored in SQLite database. The function GenerateCubeSQLite stores R data frame in temporary SQLite databases together with on-the-fly generated time dimension.

Conclusion

From now on it is easy to analyze any data frame you have in Saiku. Let me know if this works for you.

 

To leave a comment for the author, please follow the link and comment on their blog: R (en) - Analytik dat.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.