Plotting molecular properties for (sub)sets
[This article was first published on chem-bla-ics, and kindly contributed to R-bloggers].
For a toxicology paper we are writing up, I need to create a few plots showing how the toxic and non-toxic molecules differ (or not) with respect to a few molecular properties, such as logP or the molecular weight. The rcdk package provides everything needed, of course, except (or does it?) a nice convenience method to make such a plot. That is, I just want to do something like:
plot.propdist(
  mols,
  selections=list(all, actives, inactives),
  descriptor="org.openscience.cdk.qsar.descriptors.molecular.WeightDescriptor",
  main="", xlab="Weight"
)

And now I can. The result is a single plot with one density curve per selection, each in its own color.
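Each entry in `selections` is used to index the vector of descriptor values, so plain index vectors in the same order as the molecule list will do. A minimal sketch of how the three selections might be built, assuming a logical `toxic` vector with one label per molecule (the labels below are made up for illustration, not data from the paper):

```r
# Hypothetical labels, one per molecule: TRUE marks a toxic (active) molecule.
toxic = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE)

# Index vectors in the same order as the molecule list.
# Note: naming a variable `all` does not break base::all(), because R
# skips non-function objects when it looks up a function call.
all = seq_along(toxic)
actives = which(toxic)
inactives = which(!toxic)

selections = list(all, actives, inactives)
```

Keep in mind that `density()` needs at least two values per selection, so each group should contain two or more molecules.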
The source code of my method (licensed MIT):
# Copyright: 2011 Egon Willighagen <egonw@github>
# License: MIT
library(rcdk)  # provides eval.desc()

plot.propdist = function(
  molecules=NULL,
  selections=NULL,
  plotColors=NULL,
  descriptorValue=1,
  descriptor="org.openscience.cdk.qsar.descriptors.molecular.WeightDescriptor",
  ...
) {
  # Validate the input.
  if (is.null(molecules)) stop("Data was not provided")
  if (!is.list(selections)) stop("No selections have been provided")
  if (!is.null(plotColors)) {
    if (!is.vector(plotColors)) stop("The colors must be a vector")
    if (length(plotColors) != length(selections))
      stop("The number of selections and colors do not match")
  }
  # Calculate the descriptor and pick the requested value column.
  dframe = eval.desc(molecules, descriptor)
  data = dframe[,descriptorValue]
  maxx = max(data)
  minx = min(data)
  # One density estimate per selection.
  densities = lapply(selections,
    function(x) { return(density(data[x])) }
  )
  maxy = max(unlist(lapply(densities, function(x) { return(max(x$y)) })))
  # Draw an empty frame that fits all curves, then add the curves.
  plot(densities[[1]], xlim=c(minx, maxx), ylim=c(0,maxy), type="n", ...)
  # Default to one color per selection (was hard-coded to three).
  if (is.null(plotColors)) plotColors = rainbow(length(densities))
  for (i in seq_along(densities)) {
    lines(densities[[i]], col=plotColors[i])
  }
}
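The plotting part of the method does not depend on rcdk, so it can be tried without any molecules at hand. The following is a rough sketch of the same overlay technique in base R only, using synthetic numbers as a stand-in for a descriptor column. The values, the group split, the colors, and the added legend are all assumptions for illustration and are not part of the method above.

```r
# Synthetic stand-in for one descriptor column (e.g. molecular weight).
set.seed(42)
values = c(rnorm(50, mean = 250, sd = 40), rnorm(50, mean = 350, sd = 40))

# Made-up selections: everything, first half, second half.
selections = list(1:100, 1:50, 51:100)
labels = c("all", "actives", "inactives")
colors = c("black", "red", "blue")

# One kernel density estimate per selection.
densities = lapply(selections, function(idx) density(values[idx]))

# Shared axes: x spans the full data range, y reaches the tallest curve.
xlim = range(values)
ylim = c(0, max(sapply(densities, function(d) max(d$y))))

# Empty frame first, then one line per selection.
plot(densities[[1]], xlim = xlim, ylim = ylim, type = "n",
     main = "", xlab = "Synthetic weight")
for (i in seq_along(densities)) {
  lines(densities[[i]], col = colors[i])
}
legend("topright", legend = labels, col = colors, lty = 1)
```

Computing the axis limits over all selections before drawing is what keeps the narrower, taller subset curves from being clipped by a frame sized for the first curve only.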