In this post, I want to address the following issue: several data files with a common structure have to be processed by an R function. The function should export files (such as images, data files or any other file type). I explain how to build filenames so that the function automatically exports its files to the same directory as the input file chosen by the user, and how to customize the names of the exported files.
I thank Soraya, with whom I looked at this problem (during her work placement) and who helped me find the answer (especially by pointing out the function file.choose).
Suppose that the file ex-data.txt (it is the famous iris data set) is in a directory named /home/tuxette/data1/ (for instance) and that you want to create a function extractNum that takes no input, makes the user choose a data set (this one, for instance) and exports two files (Rdata and csv formats) containing only the numerical variables of the original data set. The exported files must be saved in the same directory as the original file (whatever this directory is) and must be named after the original file by appending the suffixes -num.Rdata and -num.csv (respectively).
The following function can be used to make the user choose a data set (this one or any other):

    selectFile = function(){
      file = file.choose()
      file
    }
Then, start the function by making the user select the original data set. The function then loads the data set, and gregexpr, substr and paste are used to create the new filenames as described above:

    extractNum = function(){
      # Make the user choose a file
      filename = selectFile()
      # Load the file
      d = read.table(filename,header=TRUE)
      # Select numerical variables (tested column by column)
      index.num = sapply(d,is.numeric)
      # Create new data set with only the numerical variables
      new.d = d[,index.num,drop=FALSE]
      # Extract from "filename" the pattern to export the new data set
      # (that is, everything before the final dot)
      pat = gregexpr("[.]",filename)
      # (in our example, pat is 28 because the only dot in filename is at position 28)
      pat = substr(filename,1,max(pat[[1]])-1)
      # (in our example, pat is then /home/tuxette/data1/ex-data)
      # Save the data in Rdata and csv formats at /home/tuxette/data1/ex-data-num.Rdata
      # and /home/tuxette/data1/ex-data-num.csv
      save(new.d,file=paste(pat,"-num.Rdata",sep=""))
      write.csv(new.d,file=paste(pat,"-num.csv",sep=""),row.names=FALSE)
    }
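The filename step can be tried on its own, without choosing or reading any file. The lines below are a minimal sketch applied to the example path of this post. The variable names (dots, any.char, rdata.name, csv.name, pat.alt) are only illustrative, and the last line uses tools::file_path_sans_ext, from the tools package shipped with R, as a possible alternative that is not the approach used in extractNum.

```r
# Example path from the post (the file does not need to exist for this step)
filename <- "/home/tuxette/data1/ex-data.txt"

# Positions of the literal dots: "[.]" matches only the dot character
dots <- gregexpr("[.]", filename)[[1]]

# For comparison: "." alone is a regular expression matching any character
any.char <- gregexpr(".", filename)[[1]]

# Keep everything before the last dot
pat <- substr(filename, 1, max(dots) - 1)

# Build the names of the two exported files
rdata.name <- paste(pat, "-num.Rdata", sep = "")
csv.name <- paste(pat, "-num.csv", sep = "")

# Possible alternative (not used in extractNum): strip the extension directly
pat.alt <- tools::file_path_sans_ext(filename)

print(as.integer(dots))   # 28
print(length(any.char))   # 31, one match per character of filename
print(rdata.name)
print(csv.name)
print(identical(pat, pat.alt))
```

With "." as the pattern, every character of the path is matched, so taking the maximum position would keep the whole filename except its last character instead of cutting at the extension.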
In extractNum, note that the dot (the pattern argument of the function gregexpr) is a special character in regular expressions, so it has to be specified as "[.]" and not only ".". Then just use:

    extractNum()
Enter the path to the data set, /home/tuxette/data1/ex-data.txt (or pick it in the file dialog if one opens), and you should obtain two files with the numerical variables from the iris data set in the original directory of ex-data.txt. Does it work?
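One way to check the result without any interactive step is sketched below. It assumes a hypothetical variant, extractNumFrom, that takes the path as an argument instead of calling file.choose, and it writes R's built-in iris data to a temporary directory as a stand-in for ex-data.txt. Neither the function name nor the temporary directory comes from the original setup.

```r
# Hypothetical variant of extractNum() that receives the path as an argument
extractNumFrom <- function(filename) {
  d <- read.table(filename, header = TRUE)
  # Keep only the numerical variables
  new.d <- d[, sapply(d, is.numeric), drop = FALSE]
  # Everything before the final dot of the path
  dots <- gregexpr("[.]", filename)[[1]]
  pat <- substr(filename, 1, max(dots) - 1)
  out <- c(rdata = paste(pat, "-num.Rdata", sep = ""),
           csv = paste(pat, "-num.csv", sep = ""))
  save(new.d, file = out[["rdata"]])
  write.csv(new.d, file = out[["csv"]], row.names = FALSE)
  # Return the two paths so that they can be inspected
  invisible(out)
}

# Stand-in for /home/tuxette/data1/ex-data.txt in a temporary directory
tmp.dir <- tempfile("data1")
dir.create(tmp.dir)
input <- file.path(tmp.dir, "ex-data.txt")
write.table(iris, file = input, row.names = FALSE)

out <- extractNumFrom(input)

# The exported files should sit next to the input file
print(basename(out))
print(dirname(out[["csv"]]) == dirname(input))

# The csv file should contain the four numerical variables of iris
exported <- read.csv(out[["csv"]])
print(names(exported))
```

Returning the two paths invisibly keeps the function quiet in normal use while still making it easy to verify where the files were written.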