CpG Island shelves
[This article was first published on Chitka, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Nowadays, regions with relatively lower CpG density are gaining importance in DNA methylation studies. This is based on the fact that a majority of CpG rich regions (CpG islands) are non-dynamic and less variant in terms of methylation status probed across a variety of tissues and cell populations (Irizarry 2009, Ziller 2013). It is now proven that methylation is more dynamic along the CpG shores (< 2kb flanking CpG Islands) and CpG shelves (<2kb flanking outwards from a CpG shore). While it is easy to retreive the genomic co-ordinates of CpG Islands from UCSC browser, public resources for retrieving the genomic co-ordinates for shores and shelfs were missing or scarce. Here, I am displaying the code (in R using Bioconductor package GenomicRanges) I use for generating objects for CpG islands, CpG island shores and CpG island shelves. The resulting objects are GRanges objects which could be used in multiple downstream applications in Bioconductor related packages.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# This script will generate the cpg_islands, shores and shelves as a granges list. | |
# updated on 6th june 2014 | |
# eqalizes seqinfo of the cpg objects to hg19 objects (removes minor chromosomes, X, Y) | |
# Load library, hg19 object, read cpg_island bed file | |
library(rtracklayer) | |
load("/home/kalyankpy/hg19/hg19_gr.rda") | |
source("/home/kalyankpy/Dropbox/myscripts/r_scripts/df2gr_with_seqinfo.r") | |
cpg_islands=read.table("/home/kalyankpy/reg_elements/cpg_island.bed") | |
# name the columns | |
names(cpg_islands)<-c("seqnames","start","end","name") | |
#replace the column labelled name | |
cpg_islands$name<-"island" | |
cpg_islands$start=cpg_islands$start+1 # convert 0 based start in bedfile into 1 based start | |
# equalize the seqinfo | |
cpg_islands<-df2gr(cpg_islands,hg19_gr) | |
############################################################### | |
# Extract CpG island shores | |
############################################################### | |
# extract the shore defined by 2000 bp upstream of cpg islands | |
shore1=flank(cpg_islands, 2000) | |
# extract the shore defined by 2000 bp downstream of cpg islands | |
shore2=flank(cpg_islands,2000,FALSE) | |
# perform intersection and combine the shores where they overlap | |
shore1_2=reduce(c(shore1,shore2)) | |
# extract the features (ranges) that are present in shores only and not in cpg_islands (ie., shores not overlapping islands) | |
cpgi_shores=setdiff(shore1_2, cpg_islands) | |
cpgi_shores$name="shore" | |
############################################################### | |
# Extract CpG island shelves | |
############################################################### | |
# extract the shore defined by 4000 bp upstream of cpg islands | |
shelves1=flank(cpg_islands, 4000) | |
# extract the shore defined by 2000 bp downstream of cpg islands | |
shelves2=flank(cpg_islands,4000,FALSE) | |
# perform intersection and combine the shelves where they overlap | |
shelves1_2=reduce(c(shelves1,shelves2)) | |
# create a set of ranges consisting CpG Islands, Shores | |
island_shores=c(cpg_islands, cpgi_shores) | |
# extract the features (ranges) that are present in shelves only and not in cpg_islands or shores(ie., shelves not overlapping islands or shores) | |
cpgi_shelves=setdiff(shelves1_2, island_shores) | |
cpgi_shelves$name="shelf" | |
################################################################# | |
# Export the GRanges files into bed for visualization | |
################################################################# | |
cpg_elements_gr=GRangesList("islands"=cpg_islands, "shores"=cpgi_shores, "shelves"=cpgi_shelves) | |
save(cpg_elements_gr, file="/home/kalyankpy/reg_elements/cpg_element_gr.rda") |
To leave a comment for the author, please follow the link and comment on their blog: Chitka.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.