R meets HANA
[This article was first published on Blag's bag of rants, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
If you read my last blog called HANA meets R you will remember that we read data from HANA into R directly, without having to download an .csv file, but using ODBC. This time, we’re going to read data from HANA as well, but after do some nice tricks on R, we’re going to post back the information into HANA.
Keep in mind, that is not an standard SAP solution. This only relies on a custom R package that can work with ODBC enabled tables, and like any custom packages, there are many limitations…anyway…this should be fixed when SAP released the official R into HANA integration.
In my previous blog Prediction model with HANA and R we create a stored procedure in HANA to populate a table called TICKETS_BY_YEAR, then on R we calculate the prediction for the next year and generate a nice graphic showing both the real data and the prediction. So…of course I’m not going to repeat all that.
This is the R code that we need to use…
library("RODBC") ch<-odbcConnect("HANA",uid="P075400",pwd="HrCOpPk4") Flight_Tickets<-sqlFetch(ch,"P075400.TICKETS_BY_YEAR") period=Flight_Tickets$PERIOD tickets=Flight_Tickets$TICKETS var_year=substr(period[1],1,4) var_year=as.integer(var_year) var_year=var_year+1 var_year=as.character(var_year) new_period=gsub("^\\d{4}",var_year,period) next_year=data.frame(year=new_period,stringsAsFactors=FALSE) prt.lm=lm(tickets ~ period) pred=predict(prt.lm,next_year,interval="none") period=next_year tickets=pred PREDICTION_TICKETS<-data.frame(period,tickets) sqlDrop(ch,"PREDICTION_TICKETS",errors=FALSE) sqlSave(ch,PREDICTION_TICKETS,rownames="id") odbcClose(ch)
After we execute this code, we can check on HANA that our new table called PREDICTION_TICKETS was created...
And the data was populated as expected...
You may wonder...which are the limitations? Everything seems to work like a charm? Easy...not a lot, but important limitations...
* We don't have a way to validate if the table exists or not.
* We must delete the table before doing the insert, otherwise is not going to work.
* Even when the date field was called PERIOD, R named it "year" and pass it into HANA.
* We can't specify the type of the fields, nor the lenght
* We are forced to have an additional column with a numeric index, that we can nicely call "Id"...
As I said early...this is just a custom package that allows us to play...this shouldn't be used as a final solution, but as a playground. Enjoy!
Greetings,
Blag.
To leave a comment for the author, please follow the link and comment on their blog: Blag's bag of rants.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.