Using paste( ) to read and write multiple files in R
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
This post is a quick tip on how to use the paste( ) function to read and write multiple files. First, let’s create some data.
dataset = data.frame(expand.grid(Trt=c(rep("Low",10),rep("High",10)), Sex=c(rep("Male",10),rep("Female",10))),Response=rnorm(400))
The next step is not necessary, but makes the subsequent code more readable.
trt = levels(dataset$Trt) sex = levels(dataset$Sex)
The following example is silly because you would rarely want to split your data as shown in this example, but (hopefully) it clearly illustrates the general idea of using paste( ) to create dynamic file names when writing files.
for (i in 1:length(trt)){ for (j in 1:length(sex)){ write.csv(subset(dataset, Trt==trt[i] & Sex==sex[j]), paste(trt[i],sex[j],".csv",sep=""), row.names=F) } }
The result of this loop is four CSV files: HighFemale.csv, HighMale.csv, LowFemale.csv, and LowMale.csv.
We can use the same basic idea to read those same four files into a single data frame. The key is to initialize an empty data frame and then append, via rbind( ), the data from each of the four files.
dataset2 = data.frame() for (i in 1:length(trt)){ for (j in 1:length(sex)){ dataset2 = rbind(dataset2,read.csv(paste(trt[i],sex[j],".csv",sep=""))) } }
I found this approach useful when I used a supercomputer to conduct many, many runs of an agent-based model. My jobs were queued more quickly on the supercomputer if they were small, so I broke my simulation experiments into many small jobs. This produced many files that I needed to combine into one data frame for analysis in R.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.