R: k-Means Clustering on an Image

[This article was first published on Analysis with Programming, and kindly contributed to R-bloggers].

Enough with the theory we recently published. Let's take a break and have some fun with an application of statistics used in data mining and machine learning: k-means clustering.
k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. (Wikipedia, Ref 1.)
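To make the definition concrete, here is a minimal sketch of the assign-then-update iteration (often called Lloyd's algorithm) on a handful of one-dimensional toy values. The function name `simpleKMeans`, the toy data, and the starting centres are illustrative assumptions and not part of the original post. R's built-in `kmeans()` defaults to the Hartigan-Wong algorithm, so this is only meant to show the idea.

```r
# Minimal Lloyd-style k-means for a numeric matrix (illustrative sketch only)
simpleKMeans <- function(x, centers, maxIter = 100) {
  x <- as.matrix(x)
  centers <- as.matrix(centers)
  k <- nrow(centers)
  cluster <- rep(0L, nrow(x))
  for (i in seq_len(maxIter)) {
    # Assignment step: squared Euclidean distance to every centre
    d <- matrix(0, nrow(x), k)
    for (j in seq_len(k)) {
      d[, j] <- rowSums(sweep(x, 2, centers[j, ])^2)
    }
    newCluster <- max.col(-d, ties.method = "first")  # nearest centre
    if (all(newCluster == cluster)) break             # assignments are stable
    cluster <- newCluster
    # Update step: each centre becomes the mean of its observations
    for (j in seq_len(k)) {
      if (any(cluster == j)) {
        centers[j, ] <- colMeans(x[cluster == j, , drop = FALSE])
      }
    }
  }
  list(cluster = cluster, centers = centers)
}

# Toy data: two obvious groups on a line, with hypothetical starting centres
toy <- matrix(c(1, 2, 3, 10, 11, 12), ncol = 1)
fit <- simpleKMeans(toy, centers = matrix(c(1, 12), ncol = 1))
fit$cluster  # cluster label of each observation
fit$centers  # the mean of each cluster
```

Each observation ends up with the label of its nearest mean, and each mean serves as the prototype of its cluster, which is the property the definition above describes.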
We will apply this method to an image by grouping its pixels into k different clusters. Below is the image that we are going to use:
Colorful Bird From Wall321
We will utilize the following packages for input and output:
  1. jpeg – Read and write JPEG images; and,
  2. ggplot2 – An implementation of the Grammar of Graphics.

Download and Read the Image

Let’s get started by downloading the image to our workspace, and tell R that our data is a JPEG file.

# Load the package
library(jpeg)
url <- "http://www.wall321.com/thumbnails/detail/20120304/colorful%20birds%20tropical%20head%203888x2558%20wallpaper_www.wall321.com_40.jpg"
# Download the file and save it as "Image.jpg" in the working directory
# (mode = "wb" keeps the binary file intact on Windows)
dFile <- download.file(url, "Image.jpg", mode = "wb")
img <- readJPEG("Image.jpg") # Read the image

Cleaning the Data

Extract the necessary information from the image and organize this for our computation:

# Obtain the dimension
imgDm <- dim(img)
# Assign RGB channels to data frame
imgRGB <- data.frame(
  x = rep(1:imgDm[2], each = imgDm[1]),
  y = rep(imgDm[1]:1, imgDm[2]),
  R = as.vector(img[,,1]),
  G = as.vector(img[,,2]),
  B = as.vector(img[,,3])
)
The image is represented by a large array of pixels with dimensions rows by columns by channels, where the channels are red, green, and blue (RGB).
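A tiny synthetic array can make this layout easier to see. The sketch below builds a hypothetical 2-by-3 "image" (the name `toyImg` and its values are made up for illustration) and applies the same flattening as the code above. `as.vector()` walks the array column by column, so `x` repeats each column index once per row, and `y` counts rows from the top down so that row 1 ends up at the top of the plot.

```r
# Hypothetical 2 x 3 image with 3 channels; the values are arbitrary
toyImg <- array(seq(0, 1, length.out = 18), dim = c(2, 3, 3))
toyDm <- dim(toyImg)

toyRGB <- data.frame(
  x = rep(1:toyDm[2], each = toyDm[1]),  # column index, repeated for each row
  y = rep(toyDm[1]:1, toyDm[2]),         # row index, flipped so row 1 is on top
  R = as.vector(toyImg[, , 1]),
  G = as.vector(toyImg[, , 2]),
  B = as.vector(toyImg[, , 3])
)

toyRGB  # one row per pixel: 2 * 3 = 6 rows

# The pixel in row 2, column 3 of the array lands at x = 3, y = 1
subset(toyRGB, x == 3 & y == 1)$R == toyImg[2, 3, 1]
```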

Plotting

Plot the original image using the following code:

library(ggplot2)
# ggplot theme to be used
# (newer ggplot2 releases, 3.4.0 onward, prefer `linewidth` over `size`
# in element_rect() and element_line())
plotTheme <- function() {
  theme(
    panel.background = element_rect(
      size = 3,
      colour = "black",
      fill = "white"),
    axis.ticks = element_line(
      size = 2),
    panel.grid.major = element_line(
      colour = "gray80",
      linetype = "dotted"),
    panel.grid.minor = element_line(
      colour = "gray90",
      linetype = "dashed"),
    axis.title.x = element_text(
      size = rel(1.2),
      face = "bold"),
    axis.title.y = element_text(
      size = rel(1.2),
      face = "bold"),
    plot.title = element_text(
      size = 20,
      face = "bold",
      vjust = 1.5)
  )
}
# Plot the image
ggplot(data = imgRGB, aes(x = x, y = y)) +
  geom_point(colour = rgb(imgRGB[c("R", "G", "B")])) +
  labs(title = "Original Image: Colorful Bird") +
  xlab("x") +
  ylab("y") +
  plotTheme()

Clustering

Apply k-Means clustering on the image:

kClusters <- 3
kMeans <- kmeans(imgRGB[, c("R", "G", "B")], centers = kClusters)
kColours <- rgb(kMeans$centers[kMeans$cluster,])
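Because `kmeans()` picks its starting centres at random, the clusters (and their numbering) can differ from run to run. The following is a minimal sketch of one way to make the result repeatable and to inspect it, shown on synthetic pixel values so that it runs on its own. The data frame `toyPixels`, the seed, and the choice of `nstart = 5` are assumptions for illustration, not settings from the original post.

```r
set.seed(123)  # arbitrary seed so the random starts are repeatable

# Synthetic "pixels": 200 reddish and 200 bluish points in RGB space
toyPixels <- data.frame(
  R = c(runif(200, 0.8, 1.0), runif(200, 0.0, 0.2)),
  G = runif(400, 0.0, 0.2),
  B = c(runif(200, 0.0, 0.2), runif(200, 0.8, 1.0))
)

# nstart > 1 tries several random starts and keeps the best one
toyFit <- kmeans(toyPixels, centers = 2, nstart = 5)

toyFit$size          # number of pixels in each cluster
toyFit$centers       # one mean colour per cluster
rgb(toyFit$centers)  # the same centres as hex colour strings
```

With the bird image, the same idea would amount to calling `set.seed()` before `kmeans()` and, if run time allows, passing an `nstart` value greater than 1.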
Plot the clustered colours:

ggplot(data = imgRGB, aes(x = x, y = y)) +
  geom_point(colour = kColours) +
  labs(title = paste("k-Means Clustering of", kClusters, "Colours")) +
  xlab("x") +
  ylab("y") +
  plotTheme()
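If you would rather have the clustered picture as an image array than as a scatter plot, one option is to put the centre colours back into the original rows-by-columns-by-channels shape. The sketch below does this on a small random array so that it runs on its own. `toyImg`, `toyFit`, and `toyClustered` are hypothetical names, and the same reshaping should carry over to `img` and `kMeans` from the code above.

```r
set.seed(1)  # arbitrary seed for repeatability

# Hypothetical 4 x 5 RGB image with random values in [0, 1]
toyImg <- array(runif(4 * 5 * 3), dim = c(4, 5, 3))
toyDm <- dim(toyImg)

# One row per pixel, one column per channel (same order as as.vector())
toyPixels <- cbind(
  R = as.vector(toyImg[, , 1]),
  G = as.vector(toyImg[, , 2]),
  B = as.vector(toyImg[, , 3])
)

toyFit <- kmeans(toyPixels, centers = 3)

# Replace every pixel by its cluster centre, then restore the array shape
toyClustered <- array(toyFit$centers[toyFit$cluster, ], dim = toyDm)

# Base graphics can draw the array directly as a raster image
plot(as.raster(toyClustered))
```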
Possible clusterings of the pixels for different values of k:

Table 1: Different k-Means Clustering. (Image panels: Original, k = 6, k = 5, k = 4, k = 3, k = 2.)
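One way to compare several values of k side by side is to loop over them and look at the total within-cluster sum of squares that `kmeans()` reports, which generally shrinks as k grows. The sketch below uses synthetic pixel values so that it runs on its own. `toyPixels`, `baseColours`, and the range of k are assumptions for illustration; with the bird image you would pass `imgRGB[, c("R", "G", "B")]` instead.

```r
set.seed(42)  # arbitrary seed for repeatability

# Synthetic pixels scattered tightly around three base colours
baseColours <- rbind(c(0.9, 0.1, 0.1), c(0.1, 0.9, 0.1), c(0.1, 0.1, 0.9))
toyPixels <- baseColours[sample(1:3, 300, replace = TRUE), ] +
  matrix(rnorm(900, sd = 0.02), ncol = 3)
colnames(toyPixels) <- c("R", "G", "B")

# Fit k-means for each k and keep the total within-cluster sum of squares
kValues <- 2:6
withinSS <- sapply(kValues, function(k) {
  kmeans(toyPixels, centers = k, nstart = 5)$tot.withinss
})

data.frame(k = kValues, withinSS = withinSS)
```

In this toy data the drop from k = 2 to k = 3 is large and later gains are small, because the pixels were generated around three colours. A real photograph usually shows a more gradual decline.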

I suggest you try it!

Reference

  1. K-means clustering. Wikipedia. Retrieved September 11, 2014.
