R function to reverse and complement a DNA sequence
Warning!!
This post is intended for documentation only. I would like to remind everyone (myself first!) that the comp() function of the seqinr package can complement a DNA sequence, and the rev() function of base R can reverse a character vector. Combining the two, you can reverse, complement, and reverse complement sequences.
Complements (and optionally reverses) a DNA sequence, which must be supplied as a character vector, in either lower or upper case.
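As a reminder of how that combination looks, here is a minimal sketch. It assumes the seqinr package is installed, and it uses seqinr's s2c() and c2s() helpers to go from a string to single characters and back. The example sequence is made up.

```r
# Sketch: reverse, complement and reverse complement with seqinr and base R.
# Assumes seqinr is installed: install.packages("seqinr")
if (requireNamespace("seqinr", quietly = TRUE)) {
  dna <- seqinr::s2c("atgcca")                    # string -> vector of single characters
  complement <- seqinr::c2s(seqinr::comp(dna))    # complement each base
  reversed <- seqinr::c2s(rev(dna))               # reverse only
  revcomp <- seqinr::c2s(rev(seqinr::comp(dna)))  # reverse complement
  print(c(complement, reversed, revcomp))
}
```

Note that comp() works on a vector of single characters, not on one long string, which is why the conversion step is needed.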
Limitations:
1) Cannot work with RNA, only DNA
2) Cannot reverse without complementing. You can complement and reverse complement, but not just reverse.
Author: Fabio Marroni (http://www.fabiomarroni.altervista.org/)
Arguments:
x: character vector, the DNA sequence.
rev: logical. If TRUE, the function returns the reverse complement; if FALSE, it returns the complementary sequence. The default value is TRUE.
Value:
The complemented (and optionally reversed) sequence, as a character vector.
There are several web sites which can easily complement and reverse a DNA sequence (and RNA as well).
The advantage of using this piece of code is that it is possible to automatically reverse complement a series of sequences: I had several primers to reverse complement and I didn’t want to copy and paste them every time. Only later did I find a web site in which you can paste the primers on separate lines and get the reverse complement of each primer on its own line. You may want to try it: http://arep.med.harvard.edu/cgi-bin/adnan/revcomp.pl.
However, the versatility of R allows you to automatically retrieve the reverse complement and (for example) save each primer in a different text file.
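To make the "one file per primer" idea concrete, here is a minimal sketch in base R. The helper revcomp_simple(), the primer names and sequences, and the output file names are all made up for illustration. The helper uses chartr() rather than the function in this post, so characters other than A, C, G and T are left unchanged instead of becoming "N".

```r
# Hypothetical helper: reverse complement of a single DNA string, base R only.
# chartr() swaps A<->T and C<->G; any other character is left as it is.
revcomp_simple <- function(x) {
  comp <- chartr("ACGT", "TGCA", toupper(x))
  paste(rev(strsplit(comp, NULL)[[1]]), collapse = "")
}

# Made-up primers, for illustration only
primers <- c(fwd1 = "ATGCGT", rev1 = "ggatcc", fwd2 = "TTAACG")

# Reverse complement every primer in one call; names are preserved
rc <- vapply(primers, revcomp_simple, character(1))

# Save each result in its own text file (here in a temporary directory)
out_dir <- tempdir()
for (name in names(rc)) {
  writeLines(rc[[name]], file.path(out_dir, paste0(name, "_revcomp.txt")))
}
```

Replacing tempdir() with a folder of your choice keeps the files after the R session ends.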
Also, there is a nice R package (seqinr) which can reverse complement and perform several other tasks (http://cran.r-project.org/web/packages/seqinr/index.html).
Since my R programming skills are “limited”, comments and suggestions are welcome!
rev.comp <- function(x, rev = TRUE) {
  x <- toupper(x)
  # Split the sequence into single bases
  xx <- unlist(strsplit(x, NULL))
  # Start from "N" so that anything other than A, C, G or T stays "N"
  y <- rep("N", length(xx))
  # seq_along() also copes with an empty sequence, unlike 1:nchar(x)
  for (bbb in seq_along(xx)) {
    if (xx[bbb] == "A") y[bbb] <- "T"
    if (xx[bbb] == "C") y[bbb] <- "G"
    if (xx[bbb] == "G") y[bbb] <- "C"
    if (xx[bbb] == "T") y[bbb] <- "A"
  }
  # Reverse the complemented bases if requested;
  # base::rev() is spelled out because the argument is also called rev
  if (rev) y <- base::rev(y)
  # Join the bases back into a single string
  yy <- paste(y, collapse = "")
  return(yy)
}
Thanks to rhi for providing code for complementing without reversing. I paste it below.
convertToComplement <- function(x) {
  bases <- c("A", "C", "G", "T")
  # Split the upper-cased sequence into single bases
  xx <- unlist(strsplit(toupper(x), NULL))
  paste(unlist(lapply(xx, function(bbb) {
    if (bbb == "A") compString <- "T"
    if (bbb == "C") compString <- "G"
    if (bbb == "G") compString <- "C"
    if (bbb == "T") compString <- "A"
    # Anything that is not A, C, G or T becomes "N"
    if (!bbb %in% bases) compString <- "N"
    return(compString)
  })), collapse = "")
}
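The same complement can also be written without the chain of if statements, by looking each base up in a named vector. This is only a sketch of an alternative, not rhi's code; the name complement_lookup() is made up.

```r
# Hypothetical alternative: complement via a named-vector lookup table.
complement_lookup <- function(x) {
  comp_map <- c(A = "T", C = "G", G = "C", T = "A")
  # Split the upper-cased sequence into single bases
  bases <- strsplit(toupper(x), NULL)[[1]]
  # Indexing by name gives NA for any base that is not in the table
  out <- unname(comp_map[bases])
  out[is.na(out)] <- "N"
  paste(out, collapse = "")
}

complement_lookup("acgtnx")
```

As in the functions above, any character that is not A, C, G or T comes back as "N".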