Converting strsplit() output to a data.frame
[This article was first published on Gregor Gorjanc (gg), and kindly contributed to R-bloggers].
R has a nice set of utilities for working with strings, and the function paste is surely among them. It can be used to “glue” several strings together with an optional separator. The following example shows how paste can be used to create a new variable in a dataset:
dat <- data.frame(x=1:5, y=letters[1:5])
(dat$z <- with(dat, paste(x, y, sep="-")))
Today I was in a situation where I only had column z and wanted to reverse the action of paste. Is there a way to do it? Not directly (AFAIK), but strsplit seems to be quite useful for this:
(tmp <- strsplit(x=dat$z, split="-"))
However, the output of strsplit is a list with one character vector per element of my column z, not one vector per split component. Since as.data.frame treats each list element as a column, the output of strsplit cannot easily be converted back to a data.frame with the original layout, as you can test yourself with:
as.data.frame(tmp)
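To make the mismatch concrete, here is a small sketch that rebuilds the example and inspects the shapes involved. The name `wrong` is introduced only for illustration and does not appear in the original post.

```r
# Rebuild the example from the post
dat <- data.frame(x = 1:5, y = letters[1:5])
dat$z <- with(dat, paste(x, y, sep = "-"))
tmp <- strsplit(dat$z, split = "-")

length(tmp)    # one list element per row of dat
lengths(tmp)   # number of split components in each element

# as.data.frame() turns each list element into a column, so the
# result is transposed relative to the layout we are after
wrong <- as.data.frame(tmp)
dim(wrong)     # rows = split components, columns = rows of dat
```

The column names of `wrong` are generated automatically from the list elements, which is part of why the result looks so unhelpful.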
Argh. I understand that strsplit is meant to be very general (say, we could have an unequal number of components across elements, e.g., c("1-a-0", "1-a")), but its output is really inconvenient to transform into a data.frame. I came up with the following solution, which assumes that every element splits into the same number of components. It seems to work nicely and is quite fast.
tmp <- unlist(strsplit(dat$z, split="-"))
cols <- c("x2", "y2")
nC <- length(cols)
# positions of the first component of each row in the flattened vector
ind <- seq(from=1, by=nC, length.out=nrow(dat))
for(i in seq_len(nC)) {
  dat[, cols[i]] <- tmp[ind + i - 1]
}
Does anyone have a better (more obvious) solution?
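One commonly used alternative, offered here as a sketch rather than as the author's method, is to stack the list elements row by row with do.call and rbind. The names `parts` and `mat` are hypothetical, and `fixed = TRUE` is an optional addition that makes strsplit treat the separator literally instead of as a regular expression. As with the loop, the approach assumes every element splits into the same number of components, so the sketch checks this first.

```r
dat <- data.frame(x = 1:5, y = letters[1:5])
dat$z <- with(dat, paste(x, y, sep = "-"))

# fixed = TRUE: match "-" literally (an assumption, not in the original post)
parts <- strsplit(dat$z, split = "-", fixed = TRUE)

# rbind only lines up cleanly if every element has the same length
stopifnot(all(lengths(parts) == 2L))

# Stack the per-row vectors into a character matrix with one row per
# element of z and one column per split component
mat <- do.call(rbind, parts)

# strsplit returns character data, so convert types explicitly
dat$x2 <- as.integer(mat[, 1])
dat$y2 <- mat[, 2]
```

Because the matrix already has the split components in its columns, no index arithmetic is needed. The trade-off is that every column comes back as character and has to be converted by hand.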