Update on coordinatized or fluid data
We have just released a major update of the cdata R package to CRAN.
If you work with R and data, now is the time to check out the cdata package.
Changes in the 0.5.* version of the cdata package:
- All coordinatized data or fluid data operations are now in the cdata package (no longer split between the cdata and replyr packages).
- The transforms are now centered on the more general table-driven moveValuesToRowsN() and moveValuesToColumnsN() operators (though pivot and un-pivot are now made available as convenient special cases).
- All the transforms are now implemented in SQL through DBI (no longer using tidyr or dplyr, though we do include examples of using cdata with dplyr).
- This is (unfortunately) a user-visible API change; however, adapting to the changed API is deliberately straightforward.
cdata now supplies very general data transforms on both in-memory data.frames and remote or large data systems (PostgreSQL, Spark/Hive, and so on). These transforms include operators such as pivot/un-pivot that were previously not conveniently available for these data sources (for example tidyr does not operate on such data, despite dplyr doing so).
To help transition we have updated the existing documentation:
The fluid data document is a bit long, as it covers a lot of concepts quickly. We hope to develop more targeted training material going forward.
In summary: cdata theory and package now allow very concise and powerful transformations of big data using R.