
K-means in images and R

[This article was first published on jkunst.com: Entries for category R, and kindly contributed to R-bloggers.]

In an image each pixel is a color, and that color has an RGB representation: each color is represented by a triplet. For example, in this representation red is (255, 0, 0), black is (0, 0, 0) and white is (255, 255, 255). More generally, each color is a point in the 3D cube [(0, 0, 0); (255, 255, 255)]. We’ll transform an image into a data.frame with 3 columns where each observation is a pixel. To do this we need the ReadImages and rgl libraries. Let’s take a look at the first part of the script:

rm(list=ls())
library(ReadImages)
library(rgl)

# Read the image
image <- read.jpeg("images/nyiragongo-volcano-expedition-peter_52775_990x742.jpg")
str(image)
plot(image)

# Obtain the size of the image
H <- dim(image)[1]
W <- dim(image)[2]

# Creating the data frame
rgb_image <- data.frame(r = as.vector(image[1:H, 1:W, 1]),
                        g = as.vector(image[1:H, 1:W, 2]),
                        b = as.vector(image[1:H, 1:W, 3]))

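# I prefer to work on an integer RGB-like scale instead of [0, 1].
# (8-bit RGB runs from 0 to 255; this script multiplies by 250 and divides by 250 again later.)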
rgb_image <- round(rgb_image*250)
head(rgb_image)

And we obtain the plot of the original image.
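
To make the flattening step concrete, here is a minimal sketch on a tiny synthetic array instead of a JPEG. The array `img` and its values are made up for illustration. It assumes, as the script does, that the image is a height x width x 3 array with intensities in [0, 1].

```r
# Hypothetical 2 x 3 "image" with 3 channels, intensities in [0, 1]
H <- 2
W <- 3
img <- array(0, dim = c(H, W, 3))
img[, , 1] <- matrix(c(1, 0, 0.5, 0, 1, 0), nrow = H)  # red channel
img[, , 2] <- 0.2                                       # green channel
img[, , 3] <- 0.8                                       # blue channel

# as.vector() reads each channel column by column (column-major order),
# so the first H rows of the data frame are the first column of the image
px <- data.frame(r = as.vector(img[, , 1]),
                 g = as.vector(img[, , 2]),
                 b = as.vector(img[, , 3]))

nrow(px)  # one row per pixel: H * W
px$r      # red values in column-major order
```

Because every channel is flattened in the same order, row i of the data frame always holds the three values of the same pixel.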

Now for each observation we’ll obtain the hex representation of its color. Then we take a sample of 6000 pixels and plot those points with their respective colors. This is the result:

rgb_image$colors_hex <- rgb(rgb_image, max = 255)
rgb_image_sample <- rgb_image[sample(1:nrow(rgb_image), size = 6000),]

with(rgb_image_sample,{
  plot3d(r, g, b, col = colors_hex, type='s', size=1, main = "Original Colors",
         xlab = "Red", ylab = "Green", zlab = "Blue")
})
movie3d(spin3d(axis=c(0,0,1)), duration=7, fps=10, movie = "colors", type = "gif")
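
The `colors_hex` column comes from `rgb()`, which is part of base R (grDevices). A few standalone calls show what it does. This is only a sketch, and the small data frame `cols` is made up for the example.

```r
# rgb() turns channel values into "#RRGGBB" strings;
# maxColorValue = 255 says the inputs are on the 0-255 scale
red    <- rgb(255, 0, 0, maxColorValue = 255)    # "#FF0000"
black  <- rgb(0, 0, 0, maxColorValue = 255)      # "#000000"
yellow <- rgb(255, 255, 0, maxColorValue = 255)  # "#FFFF00", red + green

# A data frame (or matrix) with three columns is read as r, g, b
cols <- data.frame(r = c(255, 0), g = c(255, 0), b = c(255, 255))
hex <- rgb(cols, maxColorValue = 255)
hex  # white and blue
```

In the script the argument is written `max = 255`, which R matches to `maxColorValue` by partial argument matching.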

There are black, blue and red points, and there are some yellow (red + green) points. Now we apply the k-means algorithm to these points and plot the result.

kms <- kmeans(rgb_image_sample[,1:3], centers=3)
rgb_image_sample$color_kmeans <- rgb(kms$centers[kms$cluster,], max = 255)


with(rgb_image_sample,{
  plot3d(r, g, b, col = color_kmeans, type='s', size=1, main = "Color Clusters",
         xlab = "Red", ylab = "Green", zlab = "Blue", scale = 0.2)
})
movie3d(spin3d(axis=c(0,0,1)), duration=5, fps=40, movie = "colors_kmeans", type = "gif")
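
The expression `kms$centers[kms$cluster, ]` is what recolors the points: it looks up, for every pixel, the center of the cluster that pixel was assigned to. Here is a small sketch on made-up data. The points in `pts` are synthetic, and `set.seed()` is only there because `kmeans()` starts from random centers.

```r
set.seed(1)  # kmeans() picks random starting centers

# Hypothetical pixels: a dark group and a bright reddish group
pts <- rbind(
  data.frame(r = runif(50, 0, 40),    g = runif(50, 0, 40), b = runif(50, 0, 40)),
  data.frame(r = runif(50, 200, 250), g = runif(50, 0, 60), b = runif(50, 0, 60))
)

km <- kmeans(pts, centers = 2)

km$centers        # one row per cluster: the mean color of its members
head(km$cluster)  # cluster index (1 or 2) of each pixel

# Indexing the centers by the cluster vector gives every pixel
# the mean color of the cluster it belongs to
recolored <- km$centers[km$cluster, ]
dim(recolored)           # same shape as pts
nrow(unique(recolored))  # only as many distinct colors as clusters
```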

 

This illustrates the idea on a sample; now we apply it to the entire image and look at the result.

kms <- kmeans(rgb_image[,1:3], centers=3)
rgb_new <- kms$centers[kms$cluster, ]
# Back to the original representation (intensity between (0,1))
rgb_new <- rgb_new/250
image_new <- image

image_new[1:H, 1:W, 1] <- rgb_new[,1]
image_new[1:H, 1:W, 2] <- rgb_new[,2]
image_new[1:H, 1:W, 3] <- rgb_new[,3]

plot(image_new)
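
The same steps can be wrapped in a function and tried without any image file. The following is a sketch under assumptions: the helper name `quantize_colors`, the random test image and the `iter.max` value are all invented here, and the input is assumed to be a height x width x 3 array with values in [0, 1].

```r
# Hypothetical helper: reduce an H x W x 3 array (values in [0, 1]) to k colors
quantize_colors <- function(img, k) {
  H <- dim(img)[1]
  W <- dim(img)[2]
  # One row per pixel, columns r, g, b (same order as as.vector() per channel)
  px <- matrix(img, ncol = 3)
  km <- kmeans(px, centers = k, iter.max = 50)
  # Replace each pixel by its cluster center and restore the array shape
  array(km$centers[km$cluster, ], dim = c(H, W, 3))
}

set.seed(42)
# Synthetic 20 x 30 test image: random noise, not a real photo
img <- array(runif(20 * 30 * 3), dim = c(20, 30, 3))
img_q <- quantize_colors(img, k = 4)

dim(img_q)                             # unchanged: 20 30 3
nrow(unique(matrix(img_q, ncol = 3)))  # at most 4 distinct colors
```

Working directly on the [0, 1] scale avoids the multiply-then-divide round trip. The clusters are the same either way, because rescaling all three channels by one constant does not change which center is nearest.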

Finally we can loop over the k parameter and explore the evolution of the image.
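
The loop itself is not shown in the script. One way it could look, as a sketch on synthetic pixel data (the matrix `px`, the range of `k` and the `iter.max` value are assumptions chosen for illustration):

```r
set.seed(123)
# Synthetic pixels on the 0-250 scale used in the script (not a real image)
px <- matrix(round(runif(500 * 3) * 250), ncol = 3,
             dimnames = list(NULL, c("r", "g", "b")))

ks <- 1:6
results <- lapply(ks, function(k) {
  km <- kmeans(px, centers = k, iter.max = 50)
  list(k = k,
       # every pixel replaced by the center of its cluster
       recolored = km$centers[km$cluster, , drop = FALSE],
       tot_withinss = km$tot.withinss)
})

# The total within-cluster sum of squares typically shrinks
# as more colors are allowed
sapply(results, function(x) x$tot_withinss)
```

For a real image, each `recolored` matrix would be written back into a copy of the image array and plotted, exactly as in the block for k = 3.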

Maybe this is not a very useful application but it’s a very good way to understand the idea of the k-means algorithm.
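
For readers who want to see the idea without the black box, here is a simplified sketch of the two alternating steps (assign each point to its nearest center, then move each center to the mean of its points). The function `simple_kmeans` is hypothetical and written only for illustration. It is not what `kmeans()` runs internally, since R's `kmeans()` uses the Hartigan-Wong algorithm by default.

```r
# Hypothetical helper, for illustration only
simple_kmeans <- function(x, k, iters = 20) {
  x <- as.matrix(x)
  # Start from k randomly chosen data points
  centers <- x[sample(nrow(x), k), , drop = FALSE]
  cluster <- integer(nrow(x))
  for (i in seq_len(iters)) {
    # Assignment step: squared distance of every point to every center
    d <- matrix(0, nrow = nrow(x), ncol = k)
    for (j in seq_len(k)) {
      d[, j] <- rowSums(sweep(x, 2, centers[j, ])^2)
    }
    cluster <- apply(d, 1, which.min)
    # Update step: each center moves to the mean of its assigned points
    for (j in seq_len(k)) {
      if (any(cluster == j)) {
        centers[j, ] <- colMeans(x[cluster == j, , drop = FALSE])
      }
    }
  }
  list(centers = centers, cluster = cluster)
}

set.seed(7)
# Made-up colors: 30 dark points and 30 bright points
pts <- rbind(matrix(runif(90, 0, 30), ncol = 3),
             matrix(runif(90, 220, 250), ncol = 3))
res <- simple_kmeans(pts, k = 2)
res$centers
table(res$cluster)
```

The sketch runs a fixed number of iterations instead of checking for convergence, which keeps it short at the cost of some wasted work.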
