Image marginal histograms

[This article was first published on R – Irregularly Scheduled Programming, and kindly contributed to R-bloggers.]

Another day, another interesting challenge.

I follow Bob Rudis’ (a.k.a. hrbrmstr’s) blog, typically via R-bloggers, and this post caught my eye, partly because I thought I knew of an existing way to do this. As usual, actually getting that to work took a little longer than I had hoped, but I think the end result is pretty neat.

His post describes the process of writing an R function to take an image file, for example this one

file10a566a2b4dc3

and produce histograms along the sides showing the number of non-white pixels in each row and column. This is what he created (using a different image from the example, I believe)

Something funny is going on with the right-hand histogram; it doesn’t line up with the image.

Here’s my approach.

setwd("WORKINGDIRECTORY")
library(png) ## split .png into component matrices
library(RSAGA) ## grid.to.xyz, load before dplyr
library(dplyr) ## group_by(), summarise(), %>%
library(ggplot2) ## plotting
library(ggExtra) ## ggMarginal
## load the .png file
img <- png::readPNG("file10a566a2b4dc3.png")
## convert each channel into a data.frame
imgdf1 <- RSAGA::grid.to.xyz(img[,,1]) %>% mutate(col="red") # red channel
imgdf2 <- RSAGA::grid.to.xyz(img[,,2]) %>% mutate(col="green") # green channel
imgdf3 <- RSAGA::grid.to.xyz(img[,,3]) %>% mutate(col="blue") # blue channel
## collect these together to average
imgdf <- bind_rows(imgdf1, imgdf2, imgdf3)
## shift the coordinates to the correct values, and subtract the pixel values from 1 (white=0)
imgdf$x <- imgdf$x + 1
imgdf$y <- imgdf$y + 1
imgdf$z <- 1 - imgdf$z
## average the three channels (median) at each pixel
imgdf <- imgdf %>% group_by(x, y) %>% summarise(z=median(z)) # plain assignment; %<>% would need magrittr loaded
## plot the pixels, removing white pixels (z==0)
p <- ggplot(imgdf %>% filter(z!=0), aes(x=x, y=y)) + geom_point(size=0.1, pch=19) +
  theme(panel.background=element_blank(),
        line=element_blank(),
        axis.text=element_blank(),
        axis.title=element_blank(),
        plot.margin=unit(c(0,0,0,0), "lines"),
        complete=TRUE)
## add a marginal histogram of each pixel
ggExtra::ggMarginal(p, type="histogram", binwidth=1, col="darkturquoise")
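
The reshaping step in the script (channel matrices to an x, y, z data.frame) can also be sketched in base R, which makes it easier to see what the marginal histograms are counting. This is a minimal sketch, not part of the gist: the 4 x 5 array below is made up for illustration and stands in for the array that png::readPNG() returns, and the names z, col_counts and row_counts are assumptions.

```r
## Synthetic 4 x 5 RGB "image" (rows x columns x channels); white background = 1.
## Made-up data standing in for the output of png::readPNG().
img <- array(1, dim = c(4, 5, 3))
img[2, 3, ] <- 0        # one black pixel in row 2, column 3
img[4, 1:2, ] <- 0.5    # two grey pixels in the bottom row

## Median across the three channels at each pixel, then invert (white = 0)
z <- 1 - apply(img[, , 1:3], c(1, 2), median)

## Long format: x is the column index; y counts up from the bottom,
## so matrix row 1 ends up at the top of the plot
imgdf <- data.frame(
  x = as.vector(col(z)),
  y = nrow(z) + 1 - as.vector(row(z)),
  z = as.vector(z)
)

## Number of non-white pixels in each column and each row;
## these are the quantities the marginal histograms display
col_counts <- colSums(z != 0)
row_counts <- rowSums(z != 0)
```

With a bin width of 1, the marginal histograms of the filtered x and y columns should show the same counts as col_counts and row_counts (the latter read from the bottom of the image upwards).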

It uses the png package to read the image into an array, converts each colour channel to an x,y,z data.frame, takes the median of the three channels at each pixel, plots the non-white pixels with ggplot2, then uses ggExtra::ggMarginal to add the marginal histograms. Note that the ggExtra package has some bugs (it hasn’t been maintained in a while) when used with more recent versions of ggplot2 (possibly only the dev branch). I got it working on at least one of my machines. This is my result

I’ve had several uses for these types of marginal plots lately, so hopefully I can sort out the issues I’ve been getting in combination with ggplot2.
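
Until those issues are sorted out, one possible fallback is to draw the marginals with base graphics only. The following is a rough sketch under stated assumptions, not a replacement for ggMarginal: plot_with_marginals is a hypothetical helper (it is not part of ggExtra or any other package), and it expects a matrix z of inverted pixel values in which white is 0, such as the per-pixel median of the three channels.

```r
## Hypothetical helper (not from ggExtra or the gist): plots the non-white
## pixels of z with bar charts of the per-column and per-row counts.
plot_with_marginals <- function(z, col = "darkturquoise", cex = 0.5) {
  col_counts <- colSums(z != 0)
  row_counts <- rev(rowSums(z != 0))  # reversed: matrix row 1 is the top of the image
  idx <- which(z != 0, arr.ind = TRUE)
  x <- idx[, "col"]
  y <- nrow(z) + 1 - idx[, "row"]

  ## panel 1 = top bars, 0 = empty corner, 2 = image, 3 = right-hand bars
  layout(matrix(c(1, 0, 2, 3), nrow = 2, byrow = TRUE),
         widths = c(4, 1), heights = c(1, 4))
  ## "i" axes remove the default 4% padding so bars and pixels line up
  old <- par(xaxs = "i", yaxs = "i", mar = c(0, 1, 1, 0))
  on.exit({ layout(1); par(old) })

  barplot(col_counts, space = 0, border = NA, col = col, axes = FALSE)
  par(mar = c(1, 1, 0, 0))
  plot(x, y, pch = 19, cex = cex, axes = FALSE, xlab = "", ylab = "",
       xlim = c(0.5, ncol(z) + 0.5), ylim = c(0.5, nrow(z) + 0.5))
  par(mar = c(1, 0, 0, 1))
  barplot(row_counts, horiz = TRUE, space = 0, border = NA, col = col,
          axes = FALSE)

  invisible(list(col_counts = col_counts, row_counts = row_counts))
}

## Usage on a made-up 4 x 5 matrix (white = 0)
z <- matrix(0, nrow = 4, ncol = 5)
z[2, 3] <- 1
z[4, 1:2] <- 0.5
res <- plot_with_marginals(z)
```

The side panels share their margins with the image panel, so each bar should sit directly above or beside the row or column it counts. The counts are returned invisibly so they can be checked without looking at the plot.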
