A participant in the R course I’m teaching showed me a case where a tbl_df (the new flavour of data frame provided by the tibble package; standard in new RStudio versions) interacts badly with the t.test function. I had not seen this happen before. The reason is this:
Interacting with legacy code
A handful of functions don’t work with tibbles because they expect df[, 1] to return a vector, not a data frame. If you encounter one of these functions, use as.data.frame() to turn a tibble back to a data frame (tibble announcement on RStudio blog).
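To make the difference in the quote concrete, here is a minimal sketch comparing single-column indexing on a plain data frame and on a tibble. The variable names (`chick_df`, `df_col`, `tbl_col`, `df_col_kept`) are only illustrative, and it assumes the tibble package is installed.

```r
library(tibble)

data(chickwts)
chick_df <- chickwts                  # plain data frame
chick_tibble <- as_tibble(chickwts)   # same data as a tibble

# Single-column indexing on a data frame drops to a vector by default
df_col <- chick_df[, 1]
is.numeric(df_col)         # TRUE
is.data.frame(df_col)      # FALSE

# The same indexing on a tibble keeps the data frame structure
tbl_col <- chick_tibble[, 1]
is.data.frame(tbl_col)     # TRUE

# drop = FALSE makes a plain data frame behave the way a tibble does
df_col_kept <- chick_df[, 1, drop = FALSE]
is.data.frame(df_col_kept) # TRUE
```

Code written with the first behaviour in mind is the "legacy code" the announcement refers to.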
Here is code that reproduces the situation (tibble version 1.2):
library(tibble)
data(chickwts)
chick_tibble <- as_tibble(chickwts)
casein <- subset(chickwts, feed == "casein")
sunflower <- subset(chick_tibble, feed == "sunflower")
t.test(sunflower$weight, casein$weight) ## this works
t.test(as.data.frame(sunflower[, 1]), as.data.frame(casein[, 1])) ## this works too
t.test(sunflower[, 1], casein[, 1]) ## this doesn't
Error: Unsupported use of matrix or array for column indexing
I did not know that. The solution, which they found themselves, is to use as.data.frame.
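Besides as.data.frame, another way around the problem is to hand t.test a plain vector in the first place. The sketch below is one possible approach rather than the fix the participant used: it extracts the column with `[[`, which returns the column itself for both data frames and tibbles. The names `sunflower_weight`, `casein_weight` and `result` are made up for the example.

```r
library(tibble)

data(chickwts)
chick_tibble <- as_tibble(chickwts)

# Row subsets are still tibbles
casein <- chick_tibble[chick_tibble$feed == "casein", ]
sunflower <- chick_tibble[chick_tibble$feed == "sunflower", ]

# [[ extracts the column as a plain numeric vector
sunflower_weight <- sunflower[["weight"]]
casein_weight <- casein[["weight"]]

# t.test gets two numeric vectors, so the tibble never reaches it
result <- t.test(sunflower_weight, casein_weight)
print(result)
```

This is equivalent to the `$weight` version in the reproduction code, but `[[` also accepts a column name stored in a variable.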
I can see why not dropping to a vector makes sense. I’m sure you’ve at some point expected a data frame and got an “$ operator is invalid for atomic vectors”. But it’s an unfortunate fact that the number of weird little thingamajigs to remember is always strictly increasing as the language evolves. And it’s a bit annoying that the standard RStudio setup breaks an old stats function, even if it’s in a somewhat non-obvious way.
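For reference, this is roughly how the opposite surprise comes about with a plain data frame. It is a small base R sketch with hypothetical variable names (`cols`, `selected`, `msg`): selecting columns by a vector of names silently returns a vector when only one name is left, and later code that assumes a data frame fails.

```r
data(chickwts)

# Column names chosen elsewhere in a script; here only one remains
cols <- "weight"

# With a single column, a plain data frame drops to a numeric vector
selected <- chickwts[, cols]
is.data.frame(selected)  # FALSE

# Later code that assumes a data frame then fails;
# tryCatch captures the error message instead of stopping the script
msg <- tryCatch(selected$weight, error = function(e) conditionMessage(e))
print(msg)
```

A tibble would have returned a one-column tibble from the same indexing, so `$weight` would still work.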
Posted in: computer stuff, data analysis. Tagged: R, tibble