2011-01-28

Converting strsplit() output to a data.frame

R has a nice set of utilities for working with strings, and the paste function is surely one of them. It can be used to "glue" several strings together with an optional separator. The following example shows how paste can be used to create a new variable in a dataset:
dat <- data.frame(x=1:5, y=letters[1:5])
(dat$z <- with(dat, paste(x, y, sep="-")))
Today I was in a situation where I only had column z and wanted to reverse the action of paste. Is there a way to do it? Not directly (AFAIK), but strsplit seems to be quite useful for this:
(tmp <- strsplit(x=dat$z, split="-"))
However, the output of strsplit is a list with one character vector per element of my column z, rather than one vector per split component. Consequently, the strsplit output cannot easily be converted back to a data.frame, as you can test yourself with:
as.data.frame(tmp)
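To make the problem concrete, here is a quick shape check. It is only a sketch: it rebuilds dat so that it runs on its own, and tmp_df is a name introduced here for illustration. Each element of z ends up as a column instead of a row, and the column names are auto-generated.

```r
# Rebuild the example data so this block runs on its own
dat <- data.frame(x = 1:5, y = letters[1:5])
dat$z <- with(dat, paste(x, y, sep = "-"))
tmp <- strsplit(dat$z, split = "-")

# tmp_df is a hypothetical name used only for this check
tmp_df <- as.data.frame(tmp)
dim(tmp_df)  # 2 rows and 5 columns: one column per element of z
```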
Argh. I understand that strsplit is meant to be very general (say, elements could have an unequal number of components, e.g., c("1-a-0", "1-a")), but its output is really inconvenient to transform into a data.frame. I came up with the following solution, which seems to work nicely and is quite fast, provided every element splits into the same number of components.
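# Flatten the split pieces into one vector: x1, y1, x2, y2, ...
tmp <- unlist(strsplit(dat$z, split="-"))
cols <- c("x2", "y2")
nC <- length(cols)
# Positions of the first component of each element of z
ind <- seq(from=1, by=nC, length.out=nrow(dat))
# Column i takes every nC-th value, starting at position i
for(i in seq_len(nC)) {
  dat[, cols[i]] <- tmp[ind + i - 1]
}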
Does anyone have a better (more obvious) solution?
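One possible alternative, offered as a sketch and not as the definitive answer, is to bind the split pieces row-wise with do.call and rbind. It assumes that every element of z splits into the same number of components; otherwise rbind would recycle the shorter vectors. The name parts is introduced here for illustration, and the column names x2 and y2 are the ones used above.

```r
# Rebuild the example data so this block runs on its own
dat <- data.frame(x = 1:5, y = letters[1:5], stringsAsFactors = FALSE)
dat$z <- with(dat, paste(x, y, sep = "-"))

# Bind the split pieces row-wise into a character matrix
# (one row per element of z, one column per component)
parts <- do.call(rbind, strsplit(dat$z, split = "-", fixed = TRUE))

# Column names as used above
colnames(parts) <- c("x2", "y2")

# Attach the new columns; they are all character at this point
dat <- cbind(dat, as.data.frame(parts, stringsAsFactors = FALSE))

# Convert types by hand where needed
dat$x2 <- as.integer(dat$x2)
```

Because strsplit always returns character vectors, both approaches leave the recovered columns as character, so the explicit conversion on the last line is needed to get the numbers back.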