Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In my previous post, I motivated a web application that would allow small-scale sustainable meat producers to sell directly to consumers using a meat share approach, using constrained optimization techniques to maximize utility for everyone involved. In this post, I’ll walk through some R code that I wrote to demonstrate the technique on a small scale.
Although the problem is set up in R, the actual mathematical optimization is done by Symphony, an open-source mixed-integer solver that’s part of the COIN-OR project. (The problem of optimizing assignments, in this case of cuts of meat to people, is an integer planning problem, because the solution involves assigning either 0 or 1 of each cut to each person. More generally, linear programming and related optimization frameworks allow solving for real-numbered variables.) The RSymphony package allows problems set up in R to be solved by the C/C++ Symphony code with little hassle.
My code is in a public github repository called groupmeat-demo, and the demo code discussed here is in the subset_test.R file. (The other stuff in the repo is an unfinished version of a larger-scale demo with slightly more realistic data.)
For this toy problem, we want to optimally assign 6 items to 3 people, each of whom have a different utility (value) for each item. In this case, I’m ignoring any fixed utility, such as cost in dollars, but that could be added into the formulation. Additionally, assume that items #1 and #2 cannot both be assigned, as with pork loin and pork chops.
This sort of problem is fairly simple to define mathematically. To set up the problem in code, I’ll need to create some matrices that are used in the computation. Briefly, the goal is to maximize an objective expression,
Here’s what the
pers1 pers2 pers3 item1 0.467 0.221 0.2151 item2 0.030 0.252 0.4979 item3 0.019 0.033 0.0304 item4 0.043 0.348 0.0158 item5 0.414 0.050 0.0096 item6 0.029 0.095 0.2311
It appears as if everyone like item1, but only person1 likes item5.
Additionally, I need to define some constraints. For starters, it makes no sense to assign an item to more than one person. So, for each row of that matrix, the sum of the variables (not the utilities) must be 1, or maybe 0 (if that item is not assigned). I’ll create a constraint matrix, where each row contains 18 columns, and the pattern of 0′s and 1′s defines a row of the assignment matrix. Since there are 6 items, there are 6 rows (for now). Each row needs to be less than or equal to one (I’ll tell the solver to use integers only later), so I also define vectors of inequality symbols and right-hand-sides.
# for each item/row, enforce that the sum of indicators for its assignment are <= 1 mat <- laply(1:num.items, function(ii) { x <- mat.0; x[ii, ] <- 1; as.double(x) }) dir <- rep('<=', num.items) rhs <- rep(1, num.items) |
To add the loin/chops constraint, I need to add another row, specifying that the sum of the indicators for both rows now must be 1 or less as well.
# for rows 1 and 2, enforce that the sum of indicators for their assignments are <= 1 mat <- rbind(mat, matrix(matrix(c(1, 1, rep(0, num.items-2)), nrow=num.items, ncol=num.pers), nrow=1)) dir <- c(dir, '<=') rhs <- c(rhs, 1) |
Here’s what those matrices and vectors look like:
> mat 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 [1,] 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 [2,] 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 [3,] 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 [4,] 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 [5,] 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 [6,] 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 [7,] 1 1 0 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0 > dir [1] "
Finally, specify that the variables must be binary (0 or 1), and call SYMPHONY to solve the problem:
# this is an IP problem, for now types <- rep('B', num.items * num.pers) max <- TRUE # maximizing utility soln <- Rsymphony_solve_LP(obj, mat, dir, rhs, types=types, max=max) |
And, with a bit of post-processing to recover matrices from vectors, here’s the result:
$solution [1] 0 0 0 0 1 0 0 0 1 1 0 0 0 1 0 0 0 1 $objval [1] 1.52 $status TM_OPTIMAL_SOLUTION_FOUND 0 Person #1 got Items 5 worth 0.41 Person #2 got Items 3, 4 worth 0.38 Person #3 got Items 2, 6 worth 0.73
So that’s great. It found an optimal solution worth more than 50% more than the expected value of a random assignment. But there’s a problem. There’s no guarantee that everyone gets anything, and in this case, person #3 gets almost twice as much utility as person #2. Unfair! We need to enforce an additional constraint, that the difference between the maximum utility that any one person gets and the minimum utility that any one person gets is not too high. This is sometimes called a parity constraint. Adding parity constraints is a little tricky, but the basic idea here is to add two more variables to the 18 I’ve already defined. These variables are positive real numbers, and they are forced by constraints to be the maximum and minimum total utilities per person. In the objective function, then, they are weighted so that their difference is not to big. So, that expression becomes:
For the existing rows of the constraint matrix, these new variables get 0′s. But two more rows need to be added, per person, to force their values to be no bigger/smaller (and thus the same as) the maximum/minimum of any person’s assigned utility.
# now for those upper and lower variables # \forall p, \sum_i u_i x_{i,p} - d.upper \le 0 # \forall p, \sum_i u_i x_{i,p} - d.lower \ge 0 # so, two more rows per person d.constraint <- function(iperson, ul) { # ul = 1 for upper, 0 for lower x <- mat.utility.0 x[, iperson ] <- 1 x <- x * obj.utility c(as.double(x), (if (ul) c(-1,0) else c(0,-1))) } mat <- rbind(mat, maply(expand.grid(iperson=1:num.pers, ul=c(1,0)), d.constraint, .expand=FALSE)) dir <- c(dir, c(rep('<=', num.pers), rep('>=', num.pers))) rhs <- c(rhs, rep(0, num.pers*2)) |
The constraint inequalities then becomes as follows:
> print(mat, digits=2) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1.00 0.000 0.000 0.000 0.00 0.000 1.00 0.00 0.000 0.00 0.00 0.000 1.00 0.0 0.000 0.000 0.0000 0.00 0 0 0.00 1.000 0.000 0.000 0.00 0.000 0.00 1.00 0.000 0.00 0.00 0.000 0.00 1.0 0.000 0.000 0.0000 0.00 0 0 0.00 0.000 1.000 0.000 0.00 0.000 0.00 0.00 1.000 0.00 0.00 0.000 0.00 0.0 1.000 0.000 0.0000 0.00 0 0 0.00 0.000 0.000 1.000 0.00 0.000 0.00 0.00 0.000 1.00 0.00 0.000 0.00 0.0 0.000 1.000 0.0000 0.00 0 0 0.00 0.000 0.000 0.000 1.00 0.000 0.00 0.00 0.000 0.00 1.00 0.000 0.00 0.0 0.000 0.000 1.0000 0.00 0 0 0.00 0.000 0.000 0.000 0.00 1.000 0.00 0.00 0.000 0.00 0.00 1.000 0.00 0.0 0.000 0.000 0.0000 1.00 0 0 1.00 1.000 0.000 0.000 0.00 0.000 1.00 1.00 0.000 0.00 0.00 0.000 1.00 1.0 0.000 0.000 0.0000 0.00 0 0 0.47 0.030 0.019 0.043 0.41 0.029 0.00 0.00 0.000 0.00 0.00 0.000 0.00 0.0 0.000 0.000 0.0000 0.00 -1 0 0.00 0.000 0.000 0.000 0.00 0.000 0.22 0.25 0.033 0.35 0.05 0.095 0.00 0.0 0.000 0.000 0.0000 0.00 -1 0 0.00 0.000 0.000 0.000 0.00 0.000 0.00 0.00 0.000 0.00 0.00 0.000 0.22 0.5 0.030 0.016 0.0096 0.23 -1 0 0.47 0.030 0.019 0.043 0.41 0.029 0.00 0.00 0.000 0.00 0.00 0.000 0.00 0.0 0.000 0.000 0.0000 0.00 0 -1 0.00 0.000 0.000 0.000 0.00 0.000 0.22 0.25 0.033 0.35 0.05 0.095 0.00 0.0 0.000 0.000 0.0000 0.00 0 -1 0.00 0.000 0.000 0.000 0.00 0.000 0.00 0.00 0.000 0.00 0.00 0.000 0.22 0.5 0.030 0.016 0.0096 0.23 0 -1 > dir [1] "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<=" "<=" ">=" "<=" "<=" > rhs [1] 1 1 1 1 1 1 1 0 0 0 0 0 0 |
Looking at just the last row, this constraint says that the sum of the utilities of any assigned items for person #3, minus the lower limit, must be at least 0. That is essentially the definition of the lower limit, that that constraint holds true for all three people in this problem. Similar logic applies for the upper limit.
Running the solver with this set of inputs gives the following:
$solution [1] 0.000 0.000 1.000 0.000 1.000 0.000 0.000 0.000 0.000 1.000 0.000 1.000 0.000 1.000 0.000 0.000 0.000 [18] 0.000 0.498 0.433 $objval [1] 1.31 $status TM_OPTIMAL_SOLUTION_FOUND 0 Person #1 got Items 3, 5 worth 0.43 Person #2 got Items 4, 6 worth 0.44 Person #3 got Items 2 worth 0.50 |
The last two numbers in the solution are the values of the upper and lower bounds. Note that the objective value is only 41% higher than a random assignment, but the utilities assigned to each person are much closer. Dropping the
Scaling this up to include constraints in pricing, farm preferences, price vs. preference meta-preferences, etc., is not conceptually difficult, but would just entail careful programming. It is left as an exercise for the well-motivated reader!
If you’ve made it this far, I’d definitely appreciate any feedback about this idea, corrections to my formulation or code or terminology, etc!
(Thanks to Paul Ruben and others on OR-Exchange, who helped me figure out how to think about the parity problem, and to the authors of WP-codebox and WP LaTeX for giving me tools to put nice scrollable R code and math in this post!)
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.