Recap:
RAthena is an R package that interfaces with Amazon Athena. However, it doesn't use the standard ODBC and JDBC drivers like AWR.Athena and metis. Instead, RAthena utilises Boto3, Python's SDK (software development kit) for Amazon Web Services. It does this by using the reticulate package, which provides an interface into Python. What this means is that RAthena doesn't require any driver installation or setup. Driver setup can be particularly difficult when you are not familiar with how ODBC works on your current operating system. If you wish to use ODBC, RStudio has provided a good user guide, "Setting up ODBC Drivers", to help set up ODBC drivers on your system. However, if you do not wish to go down that route, RAthena might be a good option for you.
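To make the driver-free setup concrete, here is a minimal sketch of what a connection through the standard DBI interface could look like. The helper name query_athena and the bucket path in the usage note are hypothetical, and the sketch assumes AWS credentials are already configured where Boto3 can find them.

```r
# Hypothetical helper (not part of RAthena): run one query and return the result.
# Assumes AWS credentials are discoverable by Boto3 (environment variables,
# ~/.aws/credentials, or an attached role).
query_athena <- function(staging_dir, sql) {
  # RAthena::athena() is the DBI driver object; no ODBC/JDBC driver is involved
  con <- DBI::dbConnect(RAthena::athena(), s3_staging_dir = staging_dir)
  # Always release the connection, even if the query fails
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  DBI::dbGetQuery(con, sql)
}
```

A call might look like query_athena("s3://my-hypothetical-bucket/athena-results/", "SELECT 1"), where the S3 path is a placeholder for a staging location you own.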
New Features in RAthena:
Getting back to RAthena and what the new update provides: one of the key changes is the method of transferring data to and from AWS Athena. RAthena now utilises data.table for this process. The reason for this change is the raw speed of data.table. When transferring data to and from AWS Athena, the last thing you want is a bottleneck in R just preparing the data before it is even transferred. Without data.table, this preparation step can easily take 50 to 100 times longer.
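To get a feel for the kind of local bottleneck described here, the sketch below times writing and reading a delimited file with base R and with data.table. It assumes the preparation step amounts to file input/output on made-up data; it does not reproduce RAthena's internals, and the timings will vary by machine and data shape.

```r
library(data.table)

# Hypothetical example data, only for illustration
n <- 1e5
df <- data.frame(
  id    = seq_len(n),
  value = rnorm(n),
  label = sample(letters, n, replace = TRUE)
)

tmp_base <- tempfile(fileext = ".csv")
tmp_dt   <- tempfile(fileext = ".csv")

# Base R: write the file, then read it back
base_time <- system.time({
  write.csv(df, tmp_base, row.names = FALSE)
  df_base <- read.csv(tmp_base)
})

# data.table: the same round trip with fwrite() and fread()
dt_time <- system.time({
  fwrite(df, tmp_dt)
  df_dt <- fread(tmp_dt)
})

# Elapsed seconds for each approach
print(c(base = base_time[["elapsed"]], data.table = dt_time[["elapsed"]]))
```

Increasing n makes the gap easier to see on most machines.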
The next change is bigint and how it is converted from AWS Athena to R. In the past, RAthena would convert integer64 to bigint when writing to AWS Athena, but it would then read bigint back into R as a normal integer, which is constrained to 32 bits. This has now been fixed. When reading bigint from AWS Athena, RAthena will now convert it into integer64.
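The 32-bit constraint is easy to see in plain R. The sketch below uses the bit64 package, which provides the integer64 class, and a made-up value; it illustrates the type limits only and does not call RAthena.

```r
library(bit64)

# Base R integers are 32-bit: the largest representable value is 2^31 - 1
print(.Machine$integer.max)

# One past that limit cannot be stored as a base integer;
# the conversion yields NA (the warning is suppressed here)
too_big <- suppressWarnings(as.integer(2^31))
print(is.na(too_big))

# integer64 holds 64-bit signed values, the same width as Athena's bigint
big <- as.integer64("3000000000")
print(class(big))
print(big)
```

Any bigint value above 2^31 - 1 would hit the same NA problem if read back as a base integer, which is why returning integer64 matters.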
Sum Up:
RAthena now provides a faster method for reading and writing data from AWS Athena (thanks to data.table), along with correct handling of AWS Athena bigint. So please give RAthena a try and let me know what you think of the package. Suggestions, bug reports and enhancement requests are always welcome, and they will help the package to improve: https://github.com/DyfanJones/RAthena/issues.
Installation methods:
Just in case you are not aware, RAthena is available on CRAN and GitHub.
CRAN:
install.packages("RAthena")
GitHub development version:
remotes::install_github("dyfanjones/RAthena")