R function to reverse and complement a DNA sequence

[This article was first published on Fabio Marroni's Blog » R, and kindly contributed to R-bloggers].

Warning!!
This post is intended for documentation only. I would like to remind everyone (myself first of all!) that the comp() function of the seqinr package can complement a DNA sequence, and the rev() function of base R can reverse a character vector. By combining the two you can reverse, complement, and reverse complement sequences.
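As a rough illustration of that combination, the sketch below splits a sequence into single characters, complements it with comp(), and reverses it with rev(). It assumes the seqinr package is installed; s2c() and c2s() are the seqinr helpers that convert between a string and a vector of single characters, and the example sequence and variable names are made up for illustration. Note that comp() returns lower-case letters by default.

```r
# Sketch only: runs the seqinr part only if the package is available.
if (requireNamespace("seqinr", quietly = TRUE)) {
  dna <- seqinr::s2c("ATGCCA")          # split the string into single characters

  complemented <- seqinr::comp(dna)     # complement only
  reversed     <- rev(dna)              # reverse only (base R)
  revcomp      <- rev(seqinr::comp(dna))  # reverse complement

  # Collapse each character vector back into a single string and show it
  print(seqinr::c2s(complemented))
  print(seqinr::c2s(reversed))
  print(seqinr::c2s(revcomp))
}
```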

Complements (and optionally reverses) a DNA sequence, which has to be supplied as a character string, in either lower or upper case.
Limitations:
1) Cannot work with RNA, only DNA
2) Cannot reverse without complementing. You can complement and reverse complement, but not just reverse.
Author: Fabio Marroni (http://www.fabiomarroni.altervista.org/)
Arguments:
x: character vector of length one, the DNA sequence.
rev: logical. If TRUE, the function returns the reverse complement; if FALSE, it returns the complementary sequence. The default value is TRUE.

Value:
The complemented (and optionally reversed) sequence, as a single upper-case character string. Any character other than A, C, G or T is returned as N.

There are several web sites which can easily complement and reverse a DNA sequence (and RNA as well).
The advantage of using this piece of code is that it is possible to automatically reverse complement a series of sequences: I had several primers to reverse/complement and I didn’t want to copy and paste them every time. Only recently did I find a web site where you can paste the primers on separate lines and get the reverse complement of each primer on its own line. You may want to try it: http://arep.med.harvard.edu/cgi-bin/adnan/revcomp.pl.
However, the versatility of R allows you to automatically retrieve the reverse complement and (for example) save each primer in a different text file.
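A minimal sketch of that workflow is shown here. To keep it self-contained it uses a small hypothetical helper, revcomp_one(), built on base R's chartr() rather than the function from this post; the primer names and sequences are made up, and the files are written to a temporary directory only for the sake of the example.

```r
# Hypothetical helper: reverse complement one DNA string using base R only.
# chartr() swaps A<->T and C<->G; any other character is left unchanged.
revcomp_one <- function(s) {
  comp <- chartr("ACGT", "TGCA", toupper(s))
  paste(rev(strsplit(comp, NULL)[[1]]), collapse = "")
}

# Illustrative primers (made-up sequences, not from the original post)
primers <- c(primer1 = "ATGCCA", primer2 = "ggatcc", primer3 = "TTGACA")

# Reverse complement every primer in one call; names are carried over
rc <- vapply(primers, revcomp_one, character(1))

# Save each result in its own text file, named after the primer
out_dir <- tempdir()
for (nm in names(rc)) {
  writeLines(rc[[nm]], file.path(out_dir, paste0(nm, "_revcomp.txt")))
}
```

Replacing tempdir() with a real folder keeps the files after the R session ends.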
Also, there is a nice R package (seqinr) which can reverse complement and perform several other tasks (http://cran.r-project.org/web/packages/seqinr/index.html).

Since my R programming skills are “limited”, comments and suggestions are welcome!
 


rev.comp<-function(x,rev=TRUE)
{
	# Split the upper-cased sequence into single bases
	xx<-unlist(strsplit(toupper(x),NULL))
	# Start from "N" so that anything other than A, C, G or T stays "N"
	y<-rep("N",length(xx))
	y[xx=="A"]<-"T"
	y[xx=="C"]<-"G"
	y[xx=="G"]<-"C"
	y[xx=="T"]<-"A"
	# Reverse the complemented bases if requested
	# (base::rev is spelled out because the argument is also called rev)
	if(rev) y<-base::rev(y)
	# Collapse the single bases back into one string
	paste(y,collapse="")
}

Thanks to rhi for providing code for complementing without reversing. I paste it below.


convertToComplement<-function(x){
	bases<-c("A","C","G","T")
	# Split the upper-cased sequence into single bases
	xx<-unlist(strsplit(toupper(x),NULL))
	# Complement each base; anything other than A, C, G or T becomes "N"
	paste(unlist(lapply(xx,function(bbb){
		if(bbb=="A") compString<-"T"
		if(bbb=="C") compString<-"G"
		if(bbb=="G") compString<-"C"
		if(bbb=="T") compString<-"A"
		if(!bbb %in% bases) compString<-"N"
		return(compString)
	})),collapse="")
}
