Row names in data frames: beware of 1:nrow
[This article was first published on The stupidest thing... » R, and kindly contributed to R-bloggers.]
I spent some time puzzling over row names in data frames in R this morning. It seems that if you set the row names of a data frame, x, to 1:nrow(x), R will act as if you’d not assigned row names, and the names might get changed when you do rbind.
Here’s an illustration:
> x <- data.frame(id=1:3)
> y <- data.frame(id=4:6)
> rownames(x) <- 1:3
> rownames(y) <- LETTERS[4:6]
> rbind(x,y)
  id
1  1
2  2
3  3
D  4
E  5
F  6
> rbind(y,x)
  id
D  4
E  5
F  6
4  1
5  2
6  3
As you can see, if you give x the row names 1:3, these are treated as generic row numbers and could get changed following rbind if they end up in different rows. This doesn’t happen if x and y are matrices.
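Here is a small sketch of both points, using names I made up for illustration (fresh, assigned, xm, ym). It checks that attr() reports the same row names whether or not 1:3 was assigned explicitly, and that rbind on matrices keeps the row names as given. The exact rbind output for data frames may depend on your R version, so this sketch does not assert it.

```r
# Default row names versus explicitly assigned 1:3
fresh <- data.frame(id = 1:3)
assigned <- data.frame(id = 1:3)
rownames(assigned) <- 1:3

# attr() reports the same integer vector in both cases, so the
# explicit assignment leaves no visible trace
same_rownames <- identical(attr(fresh, "row.names"),
                           attr(assigned, "row.names"))
print(same_rownames)

# Matrix versions of x and y (xm and ym are illustrative names)
xm <- matrix(1:3, ncol = 1, dimnames = list(as.character(1:3), "id"))
ym <- matrix(4:6, ncol = 1, dimnames = list(LETTERS[4:6], "id"))

# For matrices, rbind carries the row names over unchanged
combined <- rbind(ym, xm)
print(rownames(combined))
```

With matrices, the row names after rbind(ym, xm) are D, E, F, 1, 2, 3, so the numeric labels stay attached to their original rows.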
I often use row names as identifiers, so it seems I must be cautious to use something other than row numbers.
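One possible way to do that, as a sketch: add a prefix so the row names are no longer plain row numbers. The "r" prefix is just a hypothetical naming scheme; any identifier that isn't 1:nrow(x) should do.

```r
x <- data.frame(id = 1:3)
y <- data.frame(id = 4:6)

# Prefixed identifiers instead of bare row numbers
# (the "r" prefix is an arbitrary choice for illustration)
rownames(x) <- paste0("r", 1:3)
rownames(y) <- LETTERS[4:6]

# The identifiers now stay with their rows, whatever the order
combined <- rbind(y, x)
print(rownames(combined))
```

Here the row names come out as D, E, F, r1, r2, r3. Another option would be to keep the identifier in an ordinary column and not rely on row names at all.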