When the “reorder” function just isn’t good enough…
The reorder function, in R 3.0.0, is behaving strangely (or I’m really not understanding something). Take the following simple data frame:
df = data.frame(a1 = c(4,1,1,3,2,4,2), a2 = c("h","j","j","e","c","h","c"))
I expect that if I call the reorder function on the a2 vector, with the a1 vector as the one to order by, then any summary stats that I run on a2 will come out ordered according to the numbers in a1. However, look what happens:
table(reorder(df$a2, df$a1))
c e h j
2 1 2 2
I found out that to get the levels in the order specified by the numbers in the first vector, the following code seems to work (it relies on each a2 value being paired with exactly one a1 value, as is the case here):
df$a2 = factor(df$a2, levels=unique(df$a2)[order(unique(df$a1))], ordered=TRUE)
Now look at the result:
table(df$a2)
j c e h
2 2 1 2
One thing I notice here is that R seems to keep the factor levels in alphabetical order by default. When I specify the levels explicitly through the “levels” argument, here built with the “unique” function, R uses the order I give it instead of the alphabetical one.
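A minimal sketch of that default, using the same letters as a2. The vector name x is only for illustration; the point is that factor() sorts the unique values when no levels are given, and keeps whatever order is passed in levels.

```r
# Same letters as the a2 column in the data frame above.
x <- c("h", "j", "j", "e", "c", "h", "c")

# No levels given: factor() sorts the unique values.
default_levels <- levels(factor(x))

# Explicit levels: the order supplied is kept (here, order of first appearance).
explicit_levels <- levels(factor(x, levels = unique(x)))

print(default_levels)   # "c" "e" "h" "j"
print(explicit_levels)  # "h" "j" "e" "c"
```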
Why won’t the reorder function work in this case?
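One way to narrow this down is to reproduce by hand what the documentation for reorder describes: score each level of the factor with a summary of the other vector (the mean, by default) and sort the levels by ascending score. The sketch below does that for the data frame above and compares it with a direct call to reorder. It assumes a fresh session in which no attached package registers its own reorder method for factors; the post does not say what was loaded, so that is only a guess at a possible cause. The names scores, by_hand and via_reorder are illustrative.

```r
# Same data as in the post. stringsAsFactors = TRUE matches the R 3.0.0 default.
df <- data.frame(a1 = c(4, 1, 1, 3, 2, 4, 2),
                 a2 = c("h", "j", "j", "e", "c", "h", "c"),
                 stringsAsFactors = TRUE)

# Score each level of a2 by the mean of its a1 values (reorder's default FUN).
scores <- tapply(df$a1, df$a2, mean)

# Rebuild the factor with its levels sorted by ascending score.
by_hand <- factor(df$a2, levels = names(sort(scores, na.last = TRUE)))

# The same thing through reorder(); assumes no other package's method is dispatched.
via_reorder <- reorder(df$a2, df$a1)

print(scores)
print(table(by_hand))
print(identical(levels(by_hand), levels(via_reorder)))
```

In a clean session with only the base packages attached, both versions should give the levels j, c, e, h, the same order as the manual workaround. If the last line prints FALSE in your session, that points to something other than the stats version of reorder being called.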