[This article was first published on Recipes, scripts and genomics, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
BEDtools suite provides command-line functionality when dealing with genomic coordinate based operations, such as overlapping bed files or getting coverage of a bed file over a genome (similar, not exactly same, functionality in R is provided by IRanges package in bioconductor). If you have the BEDtools suite installed and it is in your path, you can easily call BEDtools executables from R using the “system()” command. See BEDtools website to learn about BEDtools installation.Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The function pasted below calls BEDtools executables on two data-frames. Both data-frames are structured as bed files. The function first writes out the data-frames as temporary files and calls the BEDtools programs on those temporary files and writes the output to another temporary file. In the end the function reads in the resulting file as a data frame and deletes the temporary files. I guess you can do this more elegantly with pipe() command in R, but that is left as an exercise to you: )
bedTools.2in<-function(functionstring="bedIntersect",bed1,bed2,opt.string="") { #create temp files a.file=tempfile() b.file=tempfile() out =tempfile() options(scipen =99) # not to use scientific notation when writing out #write bed formatted dataframes to tempfile write.table(bed1,file=a.file,quote=F,sep="\t",col.names=F,row.names=F) write.table(bed2,file=b.file,quote=F,sep="\t",col.names=F,row.names=F) # create the command string and call the command using system() command=paste(functionstring,"-a",a.file,"-b",b.file,opt.string,">",out,sep=" ") cat(command,"\n") try(system(command)) res=read.table(out,header=F) unlink(a.file);unlink(b.file);unlink(out) return(res) } >bed1 V1 V2 V3 V4 V5 V6 1 chr1 67051161 67052451 ENST00000371026 1 - 2 chr1 67060631 67060788 ENST00000371026 2 - 3 chr1 67065090 67065317 ENST00000371026 3 - 4 chr1 67066082 67066181 ENST00000371026 4 - 5 chr1 67071855 67071977 ENST00000371026 5 - 6 chr1 67072261 67072419 ENST00000371026 6 - >bedTools.2in("bedIntersect",bed1,bed2) # equivalent to "bedIntersect -a bed1 -b bed2" on terminal
To leave a comment for the author, please follow the link and comment on their blog: Recipes, scripts and genomics.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.