Testing for Linear Separability with Linear Programming in R
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
For the previous article I needed a quick way to figure out if two sets of points are linearly separable. But for crying out loud I could not find a simple and efficient implementation for this task. Except for the perceptron and SVM – both are sub-optimal when you just want to test for linear separability. The perceptron is guaranteed to finish off with a happy ending – if feasible – but it can take quite a while. And SVMs are designed for soft-margin classification which means that they might settle for a not separating plane and also they maximize the margin around the separating plane – which is wasted computational effort.
The efficient way to get the job done is by applying linear programming (LP). That means representing the question “Is it possible to fit a hyper-plane between two sets of points?” with a number of inequalities (that make up a convex area). I’m going to give a quick walk through for the math to make the idea plausible – but this text is more describing an introductory example and not an introduction to LP itself. For solving the linear program I will use Rglpk which provides a high level interface to the GNU Linear Programming Kit (GLPK) – and of course has been co-crafted by the man himself – Kurt Hornik – who is also involved with kernlab and party – thank you, Prof. Hornik and keep up the good work!
The Gory Details
Let’s say we have two sets and of points in :
And we want to know if there is a hyper-plane in which separates and then we can formulate the necessary condition with two symmetrical inequalities:
A and an exist, such that we can say for all \beta” title=”Rendered by QuickLaTeX.com” height=”19″ width=”124″ style=”vertical-align: -4px;”/> (1) and for all (2).
This is because a hyper-plane in can be defined as a vector
and the points on either side of it can be distinguished with above stated inequalities.
N <- 100 g <- expand.grid(x=0:N/N,y=0:N/N) # definition of the hyper plane h1 <- 3 h2 <- -4 beta <- -1.3 # points on either side g$col <- ifelse(h1 * g$x + h2 * g$y > beta, "cornflowerblue","darkolivegreen3") # roughly on the hyper plane g$col <- ifelse(abs(h1 * g$x + h2 * g$y - beta) < 2/N, "red", g$col) plot(g$x, g$y, col=g$col, pch=16, cex=.5, xlab="x", ylab="y", main="h(x,y) = 3 * x + (-4) * y + 1.3 = 0")
The conditions of a linear program are usually stated as a number of “weakly smaller than” inequalities. So lets transform (1) and (2) appropriately:
The conditions \beta" title="Rendered by QuickLaTeX.com" height="19" width="65" style="vertical-align: -4px;"/> and can be written as and . This is because we are dealing with finite sets and , so if we have a separating plane, then we can always fit in an such that and . If we now multiply both inequalities with , then we just end up with a different formulation for the same plane . The first inequality we additionally multiply with to turn into – and we and now we have:
(3)
(4)
Okay great – we’re almost there – now let’s get all the variables on the left hand side:
(5)
(6)
These hyper-plane conditions have to be true for all points. All points in have to fulfil (5) and all points in have to fulfil (6). Then this set of inequalities describes the convex set in of all possible separating hyper planes . Usually the description and purpose of a linear program does not stop at this point and the set of feasible solutions is used to maximize an objective function. In our case such an objective function might be introduced to maximize the distance of the plane from the points. But the article is long enough already and our objective is to just find a plane and not the the best plane. Which is why our objective function is going to be simply the
So now we formulate (5) and (6) in matrix notation because this is how LP solvers expect the program description to be fed to them – we get:
with
, ,
Example with Rglpk in two dimensions
library(Rglpk) N1 <- 3 N2 <- 3 # the points of sets A and B P <- matrix( runif(2*N1 + 2*N2,0,1), ncol=2,byrow=T ) # the matrix A defining the lhs of the conditions A <- cbind(P * c(rep(-1,N1),rep(1,N2)), c(rep(1,N1),rep(-1,N2))) # the objective function - no optimization necessary obj <- c(0,0,0) # the vector b defining the rhs of the conditions b <- rep(-1, N1+N2) # by default GLPK assums positive boundaries for the # variables. but we need the full set of real numbers. bounds <- list( lower = list(ind = c(1,2,3), val = c(-Inf,-Inf,-Inf)), upper = list(ind = c(1,2,3), val = c(Inf,Inf,Inf)) ) # solving the linear program s <- Rglpk_solve_LP(obj, A, rep("<=", N1+N2), b, bounds=bounds) plot(P,col=c(rep("red",N1),rep("blue",N2)), xlab="x", ylab="y", cex=1, pch=16, xlim=c(0,1), ylim=c(0,1)) # status 0 means that a solution was found if(s$status == 0) { h1 = s$solution[1] h2 = s$solution[2] beta = s$solution[3] # drawing the separating line if(h2 != 0) { abline(beta/h2,-h1/h2) } else { abline(v=-beta/h1) } } else { cat("Not linearly separable.") }
In case you are wondering how I managed to include all those pretty pretty math formulas in this post – I am using the QuickLaTeX WordPress plug-in and I must say I really like the result. In previous posts I used a LaTeX web editor and then included the rendered formulas as an image.
The post Testing for Linear Separability with Linear Programming in R appeared first on joy of data.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.