[This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
As of version 1.0.8, cdata implements an operator notation for data transforms.
The idea is simple, yet powerful.
First let’s start with some data.
d <- wrapr::build_frame(
"id", "measure", "value" |
1 , "AUC" , 0.7 |
1 , "R2" , 0.4 |
2 , "AUC" , 0.8 |
2 , "R2" , 0.5 )
knitr::kable(d)
| id | measure | value |
|---|---|---|
| 1 | AUC | 0.7 |
| 1 | R2 | 0.4 |
| 2 | AUC | 0.8 |
| 2 | R2 | 0.5 |
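In the above data we have two measurements each for two individuals (individuals are identified by the "id" column). Using cdata's new_record_spec() function we can capture a description of this record structure: the control table says which values of "measure" become which columns, and recordKeys says which column identifies a record.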
library("cdata")
record_spec <- new_record_spec(
wrapr::build_frame(
"measure", "value" |
"AUC" , "AUC" |
"R2" , "R2" ),
recordKeys = "id")
print(record_spec)
## $controlTable
##   measure value
## 1     AUC   AUC
## 2      R2    R2
##
## $recordKeys
## [1] "id"
##
## $controlTableKeys
## [1] "measure"
##
## attr(,"class")
## [1] "cdata_record_spec"
Once we have this specification we can transform the data using operator notation.
We can collect the record blocks into rows by a "division" (or aggregation/projection) step.
knitr::kable(d)
| id | measure | value |
|---|---|---|
| 1 | AUC | 0.7 |
| 1 | R2 | 0.4 |
| 2 | AUC | 0.8 |
| 2 | R2 | 0.5 |
d2 <- d %//% record_spec
knitr::kable(d2)
| id | AUC | R2 |
|---|---|---|
| 1 | 0.7 | 0.4 |
| 2 | 0.8 | 0.5 |
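For comparison, the same block-to-row step can be sketched with cdata's ordinary function interface, blocks_to_rowrecs(). This is a minimal sketch that assumes the operator form is a thin wrapper passing the specification's control table and record keys along; the names control_table and d2_fn are introduced here only for illustration.

```r
library("cdata")

# Same example data as above: two measurements for each of two ids.
d <- wrapr::build_frame(
  "id", "measure", "value" |
  1   , "AUC"    , 0.7     |
  1   , "R2"     , 0.4     |
  2   , "AUC"    , 0.8     |
  2   , "R2"     , 0.5     )

# Control table (illustrative name): the first column holds the key values
# found in the "measure" column, the second names the output columns that
# receive the matching "value" entries.
control_table <- wrapr::build_frame(
  "measure", "value" |
  "AUC"    , "AUC"   |
  "R2"     , "R2"    )

# Function-call form of the "division" step: one row per id.
d2_fn <- blocks_to_rowrecs(
  d,
  keyColumns = "id",
  controlTable = control_table)

print(d2_fn)
```

The result should have one row per "id" with "AUC" and "R2" columns, matching the table above.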
We can expand record rows into blocks by a "multiplication" (or join) step.
knitr::kable(d2)
| id | AUC | R2 |
|---|---|---|
| 1 | 0.7 | 0.4 |
| 2 | 0.8 | 0.5 |
d3 <- d2 %**% record_spec
knitr::kable(d3)
| id | measure | value |
|---|---|---|
| 1 | AUC | 0.7 |
| 1 | R2 | 0.4 |
| 2 | AUC | 0.8 |
| 2 | R2 | 0.5 |
And that is truly fluid data manipulation.
This article can be found in a vignette here.
To leave a comment for the author, please follow the link and comment on their blog: R – Win-Vector Blog.
