The easiest way to get UTR sequence
[This article was first published on YGC » R, and kindly contributed to R-bloggers].
I just figured out how to query UTR sequences from Ensembl with the biomaRt package.
It is much simpler than using BioPerl to parse a GenBank (gbk) file and extract the UTR sequences.
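library(biomaRt)
library(org.Hs.eg.db)

# Connect to the Ensembl human gene dataset.
ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

# Entrez Gene IDs that are mapped to at least one GO term.
eg <- mappedkeys(org.Hs.egGO)

# Fetch the 3' UTR sequences for those genes.
# Note: newer Ensembl releases may name this filter "entrezgene_id";
# check listFilters(ensembl) if "entrezgene" is rejected.
utr <- getSequence(id = eg, type = "entrezgene", seqType = "3utr", mart = ensembl)

# Write FASTA: column 2 holds the gene ID, column 1 the sequence.
outfile <- file("human-3utr.fa", "w")
for (i in seq_len(nrow(utr))) {
  writeLines(paste0(">", utr[i, 2]), outfile)
  writeLines(utr[i, 1], outfile)
}
close(outfile)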
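The writing loop above can also be done in one vectorised step. The following is only a sketch on a toy data frame, so it runs without querying Ensembl. The helper name `to_fasta`, the column layout (sequence first, ID second) and the "Sequence unavailable" placeholder for genes without a UTR are assumptions here, so check them against what `getSequence` actually returns in your session.

```r
# Toy stand-in for the data frame returned by getSequence().
# Column names and the "Sequence unavailable" placeholder are assumptions.
utr <- data.frame(
  `3utr` = c("ACGTACGTAA", "Sequence unavailable", "TTGACC"),
  entrezgene = c(1L, 2L, 9L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn (sequence, id) rows into FASTA lines.
to_fasta <- function(df, seq_col = 1, id_col = 2) {
  # Drop rows that carry the placeholder instead of a real sequence.
  keep <- df[[seq_col]] != "Sequence unavailable"
  df <- df[keep, , drop = FALSE]
  # rbind() stacks headers over sequences; as.vector() reads the matrix
  # column by column, which interleaves them: >id, seq, >id, seq, ...
  as.vector(rbind(paste0(">", df[[id_col]]), df[[seq_col]]))
}

fasta <- to_fasta(utr)

# Write to a temporary file; swap in "human-3utr.fa" for real output.
out <- tempfile(fileext = ".fa")
writeLines(fasta, out)
```

With real data, `to_fasta(utr)` would take the result of the `getSequence` call directly. biomaRt also provides `exportFASTA()`, which may cover the same need without a custom helper.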