Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Do you know my former blog? It contains four posts about the computation of the Kantorovich distance:
The Julia way using the JumP library has the most convenient syntax:
using JuMP
mu = [1/7, 2/7, 4/7]
nu = [1/4, 1/4, 1/2]
n = length(mu)
m = Model()
@defVar(m, p[1:n, 1:n] >= 0)
@setObjective(m, Min, sum{p[i, j], i in 1:n, j in 1:n; i != j})
for k in 1:n
@addConstraint(m, sum(p[k, :]) == mu[k])
@addConstraint(m, sum(p[:, k]) == nu[k])
end
solve(m)
This allows to get the Kantorovich distance between the two
probabilities mu and nu corresponding to the
0-1 distance (assuming mu and nu have the same
support). This is totally useless because one can straightforwardly get
this distance: it is one minus the total weight of the infimum measure
of the two probability measures (1 - sum(pmin(mu, nu)) in
R). But this is just for a simple illustration purpose. This problem is
not trivial for another distance on the support of mu and
nu. Encoding this distance as a matrix D, the
linear programming model allowing to get the corresponding Kantorovich
distance is obtained by replacing
sum{p[i, j], i in 1:n, j in 1:n; i != j}
with
sum{p[i, j] * D[i, j], i in 1:n, j in 1:n; i != j}
Now I want to show again how to compute the Kantorovich distance with R, but using another package I discovered yesterday: the ompr package. It allows to write the model with a convenient syntax, close to the mathematical language, similar to the one above with JumP. Here is the model:
library(ompr) library(ompr.roi) library(ROI.plugin.glpk) mu <- c(1/7, 2/7, 4/7) nu <- c(1/4, 1/4, 1/2) n <- length(mu) model <- MIPModel() |> add_variable(p[i, j], i = 1:n, j = 1:n, type = "continuous") |> add_constraint(p[i, j] >= 0, i = 1:n, j = 1:n) |> add_constraint(sum_over(p[i, j], j = 1:n) == mu[i], i = 1:n) |> add_constraint(sum_over(p[i, j], i = 1:n) == nu[j], j = 1:n) |> set_objective(sum_over(p[i, j], i = 1:n, j = 1:n, i != j), "min")
This is nicely readable. Now we solve the problem:
optimization <- model |> solve_model(with_ROI(solver = "glpk"))
And we get the Kantorovich distance:
objective_value(optimization) ## [1] 0.1071429
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
