Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Do you know my former blog? It contains four posts about the computation of the Kantorovich distance:
The Julia way using the JumP library has the most convenient syntax:
using JuMP mu = [1/7, 2/7, 4/7] nu = [1/4, 1/4, 1/2] n = length(mu) m = Model() @defVar(m, p[1:n, 1:n] >= 0) @setObjective(m, Min, sum{p[i, j], i in 1:n, j in 1:n; i != j}) for k in 1:n @addConstraint(m, sum(p[k, :]) == mu[k]) @addConstraint(m, sum(p[:, k]) == nu[k]) end solve(m)
This allows to get the Kantorovich distance between the two
probabilities mu
and nu
corresponding to the
0-1 distance (assuming mu
and nu
have the same
support). This is totally useless because one can straightforwardly get
this distance: it is one minus the total weight of the infimum measure
of the two probability measures (1 - sum(pmin(mu, nu))
in
R). But this is just for a simple illustration purpose. This problem is
not trivial for another distance on the support of mu
and
nu
. Encoding this distance as a matrix D
, the
linear programming model allowing to get the corresponding Kantorovich
distance is obtained by replacing
sum{p[i, j], i in 1:n, j in 1:n; i != j}
with
sum{p[i, j] * D[i, j], i in 1:n, j in 1:n; i != j}
Now I want to show again how to compute the Kantorovich distance with R, but using another package I discovered yesterday: the ompr package. It allows to write the model with a convenient syntax, close to the mathematical language, similar to the one above with JumP. Here is the model:
library(ompr) library(ompr.roi) library(ROI.plugin.glpk) mu <- c(1/7, 2/7, 4/7) nu <- c(1/4, 1/4, 1/2) n <- length(mu) model <- MIPModel() |> add_variable(p[i, j], i = 1:n, j = 1:n, type = "continuous") |> add_constraint(p[i, j] >= 0, i = 1:n, j = 1:n) |> add_constraint(sum_over(p[i, j], j = 1:n) == mu[i], i = 1:n) |> add_constraint(sum_over(p[i, j], i = 1:n) == nu[j], j = 1:n) |> set_objective(sum_over(p[i, j], i = 1:n, j = 1:n, i != j), "min")
This is nicely readable. Now we solve the problem:
optimization <- model |> solve_model(with_ROI(solver = "glpk"))
And we get the Kantorovich distance:
objective_value(optimization) ## [1] 0.1071429
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.