Entropy Based Image Binarization with imager and FSelectorRcpp
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The image processing and the computer vision have gained a significant interest in last 2 decades. The image analysis can be used to detect items or people on images and videos. It is widely used in the medicine to detect cancer tissues and to improve brain, lungs and heart diseases diagnostic. The computer automation enabled analyzing terabytes of an image data, based on which we improve our life status and get insights for business decisions. In this post I present basic operations that can be applied to a simple image, all thanks to imager package by which I am truly impressed. I also present a quick entropy approach to the image binarization, which applied to images on a greyscale transforms them to the binarized black-and-white output.
Warning: you will probably not be able to bear my face anymore after this post.
imager
Image processing library based on CImg
– this is the title of imager
package on CRAN. From the package’s overview, it’s website and a GREAT vignette: getting started with imager we can learn that it
contains a large array of functions for working with image data, with most of these functions coming from the CImg library by David Tschumperlé.
CImg is a simple, modern C++ library for image processing – imager
is a glue between the R interface and this library that works under the hood.
I must admit I read a lot of R vignettes, but I haven’t earlier seen such an accurate and comprehensive one before! Maybe RSelenium vignettes (about which I wrote in my previous post – Controlling Expenses on Ali Express with RSelenium) are written with as high sacrifice as those from imager
.
Basic image operations
Package provides a basic functionality allowing to load an image to R and to plot it with
Photos are stored as cimg
objects and are a special representation of a 4D array with dimensions called: x,y,z,c where x,y corresponds to spatial dimensions, z usually correspond to depth or time, and c is a colour. The third dimension is typically used for videos. If you are operating on a photo from a grayscale you will not need to pay the attention to the last 2 dimensions.
The advantage of storing images as arrays is that we can simply extract values corresponding to pixels to perform operations like arythemtic transformations or plotting the histogram of pixels’ intensity.
The rescale
parameter stands for whether you would like to store image in 0-255
or [0,1]
range, where for 0-255
picture are always rescaled to the maximum range. So to visualize the linear operations you need to set rescale=FALSE
.
Note the y axis running downwards: the origin is at the top-left corner, which is the traditional coordinate system for images. imager uses this coordinate system consistently
Image histogram and it’s equalisation
From wikipedia
An image histogram is a type of histogram that acts as a graphical representation of the tonal distribution in a digital image. It plots the number of pixels for each tonal value. By looking at the histogram for a specific image a viewer will be able to judge the entire tonal distribution at a glance.
It can be simply plotted for a grayscaled image with:
Another approach is to turn the image into a data.frame, and use ggplot2 to view all colour channels at once:
Histogram equalisation makes histograms flat: each pixel’s value is replaced by its rank, which is equivalent to running the data through their empirical cdf, which mainly solves the problem of over-representation for some pixel intensities.
Check the below comparison of an original photo, a grayscaled photo and a grayscaled photo after the histogram equalisation.
Plotting with ggplot2
By converting cimg
object to a data.frame
with as.data.frame
(mdf object) we can use following format to plot the image with ggplot2
. See following code with comments:
Note: I am using as.cimg
since the result of f
function is a regular vector, that needs to be converted to cimg
object for further usage.
Treshold binarization
For such a transformed image (a grayscaled photo after the histogram equalisation) next natural step in the image analysis is performing the image binarization. This is helpful in detecting the image’s background and foreground. In this process we will substitute pixel values with 0 and 1 so that in the end the output image will be only black and white. This format is also convenient for detecting edges or blob extractions.
With imager
we can perform the threshold binarization with treshold()
function that can provide a threshold with quantiles or with numeric values (or somehow automatic). Below is only example which is an introduction for the next chapter: entropy based image binarization
.
thr - a threshold, either numeric, or “auto”, or a string for quantiles
Entropy based image binarization
FSelectorRcpp
Last year Zygmunt Zawadzki (zzawadz) developed FSelectorRcpp package (still only available at GitHub - as I am still testing it, mea culpa…) which is a Rcpp (free of Java/Weka) implementation of FSelector entropy-based feature selection algorithms with sparse matrix support. It has a functionality allowing to calculate the conditional entropy for a variable, knowing values of another feature. This is a good measure of how one variable explains another one and can be used in the feature selection to reduce the curse of a dimensionality
or can be used in selecting the threshold value in the image binarization (my contrived idea).
Entropy and information gain
For the binarization I would choose the threshold that maximizes the information gain (based on the entropy) of binarized images based on the information gathered from the image before the binarization (a grayscaled photo after the histogram equalisation).
The information gain is defined as
$H(binarized) + H(pre_binarized) - H(binarized, pre_binarized),$
where $H(X)$ states for the Shannon’s entropy:
$H(X) = - \sum_{i=1}^{n} P(x_i)\log_bP(x_i),$
where $b$ is the base of the logarithm used (common values of $b$ are 2), and $X(A,B)$ is the conditional entropy for $A$ with a condition to $B$.
See an example of an information gain extraction for the binarized image with the threshold = 0.1.
Final binarization - threshold selection
The final step is to check all the possible thresholds (let’s assume that those are values 0-255 divided by 255). Since every threshold can be checked independently, I am using parallel
library and mclapply
function to apply the procedure to multiple cores.
Then I present the plot of calculated information gain vs the specified threshold. It’s surprising for me that this is almost symmetric over 0.5. Maybe this is due to the histogram equalisation. In my opinion this is the great tool to detect the background and the foreground of the image.
To sum up, I present my photo in a grayscale after the histogram equalisation and after the binarization based on the information gain optimization.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.