Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Introduction
Besides Deep Learning (in the realm of Data Science and AI) there is another scientific and applied area where people always seem to prefer Python over R and this is: Optimization (in the realm of Decision Science). Luckily there are two very good optimization modeling frameworks for R available, namely CVXR and ompr.
If you require a quick refresher on general Optimization and R please refer to my tutorial Decision Optimization 101 on my site “Decision Optimization with R” – this tutorial contains unique content on how to model a linear optimization model using four different packages – using the two above mentioned modeling languages CVXR and ompr as well as the two matrix-based systems ROI and modopt.matlab. Refer to the references at the end of the tutorial for further details.
While I am an outspoken advocate of optimization modeling language frameworks and tell everyone who is interested in learning optimization to start with modeling environments it is still convenient to model some optimization problems matrix-wise – especially if you put one specific model into production. My (large) production models additionally use Rcpp to build the matrix to gain speed as this creation is often the most time consuming task in the whole solution process.
Finally, after years of sitting on GitHub only I submitted the package modopt.matlab to CRAN this year in August 2018. This happened mainly due to the fact that the package ROI (R Optimization Interface) became a stable product due to the great work of Florian Schwendinger. Before that modopt.matlab was using its own optimization solver interface which had severe shortcomings.
modopt.matlab basically transforms an optimization model specified in MatLab syntax into an equivalent ROI syntax which I admittedly personally find difficult to deal with. Having modeled countless models with MatLab I really learnt to appreciate the three simple yet powerful functions linprog, quadprog and intlinprog included in the official Optimization Toolbox from MathWorks, Inc. Thus the package provides simplicity and a clean syntax as well as the (underlying) power of ROI.
Linear Programming
Consider the following tiny (linear) optimization example with two variables x1 and x2 and two inequality constraints.
# maximize: 2x1 + x2 # subject to: # x1 + x2 <= 5 # x1 <= 3 # x1 >= 0, x2 >= 0
We just have to specify the vector for the objective function which is called f below as well as the left-hand side matrix A as well as right-hand side vector b of the constraints.
f <- c(2, 1) A <- matrix(c(1, 1, 1, 0), nrow=2, byrow=TRUE) b <- c(5, 3)
As the canonical optimization model formulation is a minimization which is why we should not forget to put a minus sign in front of the objective function vector f.
sol <- linprog(-f, A, b) sol$x
Quadratic Programming
The classical application example in the field of quadratic programming is undoubtly the risk-reward portfolio optimization as specified by Markowitz (1952). Here we are minimizing the variance (risk) of our portfolio without constraining the return, i.e. we compute the so-called Minimum Variance Portfolio (MVP). First we need some real-world data. We use weekly returns from the year 2011 of four technology stocks AAPL, IBM, MSFT, and ORCL and receive the following covariance matrix, see below:
stocks <- c("AAPL", "IBM", "MSFT", "ORCL")
n_assets <- length(stocks)
covariance <- matrix(c(
0.0014708114, 0.0006940036, 0.0006720841, 0.0008276391, 
0.0006940036, 0.0009643581, 0.0006239411, 0.0011266429, 
0.0006720841, 0.0006239411, 0.0009387707, 0.0008728736, 
0.0008276391, 0.0011266429, 0.0008728736, 0.0021489512), 
nrow=n_assets, byrow=TRUE)
colnames(covariance) <- rownames(covariance) <- stocks
Next we specify the optimization model vectors and matrices. For more details about the formulation please see the tutorial “Efficient Frontiers” on Become a Quant with R (free registration required).
H <- covariance # quadratic objective f <- rep(0, n_assets) # no linear part of objective Aeq <- rep(1, n_assets) # budget normalization, lhs beq <- 1 # budget normalization, lhs lb <- rep(0, n_assets) # lower bound (long-only portfolio) ub <- rep(1, n_assets) # upper bound
Now we are ready to solve the model and extract the solution which represents the optimal investment portfolio as defined by Markowitz.
solution <- quadprog(H, f, NULL, NULL, Aeq, beq, lb, ub) portfolio <- solution$x names(portfolio) <- stocks print(portfolio)
Final Remark
It is my sincere hope that you enjoy the package and that it convinces some of you to move to R when doing optimization. Maybe my online tutorials at “Decision Optimization with R” additionally supports this task.
If you have any questions please feel free to get back to me anytime. Happy Optimizing!
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
