Site icon R-bloggers

Solving Quadratic Progams with R’s quadprog package

[This article was first published on quantitate, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In this post, we’ll explore a special type of nonlinear constrained optimization problems called quadratic programs. Quadratic programs appear in many practical applications, including portfolio optimization and in solving support vector machine (SVM) classification problems. There are several packages available to solve quadratic programs in R. Here, we’ll work with the quadprog package. Before we dive into some examples with quadprog, we’ll give a brief overview of the terminology and mechanics of quadratic programs.

Quick Overview of Quadratic Programming

In a quadratic programming problem, we consider a quadratic objective function: $$ Q(x) = \frac{1}{2} x^T D x – d^T x + c.$$ Here, $x$ is a vector in $\mathbb{R}^n$, $D$ is an $n \times n$ symmetric positive definite matrix, $d$ is a constant vector in $\mathbb{R}^n$ and $c$ is a scalar constant. The function $Q(x)$ is sometimes referred to as a quadratic form and it generalizes the quadratic function $q(x) = ax^2 +bx + c$ to higher dimensions. The key feature to note about $Q(x)$ is that it is a convex function.

We also impose a system of linear constraints on the vector $x \in \mathbb{R}^n$. We write these constraints in the form $$ Ax = f \qquad Bx \geq g.$$  Here $A$ is an $m_1 \times n$ matrix with $m_1 \leq n$ and $B$ is a $m_2 \times n$ matrix.  The vectors $f$ and $g$ have lengths $m_1$ and $m_2$ respectively.  This specification is general enough to allow us to consider a variety of practical conditions on the components of $x$, e.g. we can force $x$ to satisfy the sum condition $$ \sum_{i=1}^n x_i = 1 $$ or the box constraints $a_i \leq x_i \leq b_i$. We’ll describe how to encode practical constraint conditions into matrix systems below.

With this notation out of the way, we can write the quadratic program (QP) compactly as: $$ \left\{ \begin{array}{l} \mathrm{minimize}_{x \in \mathbb{R}^n}: \qquad Q(x) = \frac{1}{2} x^T D x – d^T x + c \\ \mathrm{subject\; to:} \qquad Ax = f \qquad Bx \geq g \end{array} \right.$$

Example #1:

Consider the objective function: $$ \begin{eqnarray*} Q(x,y) &=& \frac{1}{2} \left[ \begin{matrix} x & y \end{matrix} \right] \left[ \begin{matrix} 2 & -1 \\ -1 & 2 \end{matrix} \right]\left[ \begin{matrix} x \\ y \end{matrix} \right] – \left[ \begin{matrix} -3 & 2 \end{matrix} \right] \left[ \begin{matrix} x \\ y \end{matrix} \right] + 4 \\ &=& x^2 + y^2 -xy +3x -2y + 4 . \end{eqnarray*}$$ We seek to minimize this function over the triangular region $$\begin{eqnarray*} y &\geq& 2 – x \\ y &\geq& -2 + x \\ y &\leq& 3. \end{eqnarray*}$$
We can find the vertices of this triangle and plot the region in R:
plot(0, 0, xlim = c(-2,5.5), ylim = c(-1,3.5), type = "n", 
     xlab = "x", ylab = "y", main="Feasible Region")
polygon(c(2,5,-1), c(0,3,3), border=TRUE, lwd=4, col="blue")

To solve this QP with the quadprog library, we’ll need to translate both our objective function and the constraint system into the matrix formulation required by quadprog. From the quadprog documentation
This routine implements the dual method of Goldfarb and Idnani (1982, 1983) for solving quadratic programming problems of the form min(-d^T b + 1/2 b^T D b) with the constraints A^T b >= b_0.
It’s not hard to see how to put the quadratic form $Q(x,y)$ into the correct matrix form for the objective function of quadprog (but be sure to note the sign pattern and factors of two).  First we observe that, for any constant $c$, the minimizer of $Q(x,y)+c$ is the same as the minimizer of $Q(x,y)$. We can therefore ignore all constant terms in the quadratic form $Q(x,y)$. We set:
$$D =  \left[ \begin{matrix} 2 & -1 \\ -1 & 2 \end{matrix} \right] \qquad  d = \left[ \begin{matrix} -3 \\ 2 \end{matrix} \right]. $$
We can write the constraint equations in the form:
$$\left[ \begin{matrix} 1 & 1 \\ -1 & 1 \\ 0 & -1 \end{matrix} \right] \left[ \begin{matrix} x \\ y \end{matrix} \right] \geq \left[ \begin{matrix} 2 \\ -2 \\ -3 \end{matrix} \right]$$ so that $$A = \left[ \begin{matrix} 1 & 1 \\ -1 & 1 \\ 0 & -1 \end{matrix} \right]^T \qquad b_0 = \left[ \begin{matrix} 2 \\ -2 \\ -3 \end{matrix} \right].$$ Here is the complete implementation in R:
require(quadprog)
Dmat <- 2*matrix(c(1,-1/2,-1/2,1), nrow = 2, byrow=TRUE)
dvec <- c(-3,2)
A    <- matrix(c(1,1,-1,1,0,-1), ncol = 2 , byrow=TRUE)
bvec <- c(2,-2,-3)
Amat <- t(A)
sol  <- solve.QP(Dmat, dvec, Amat, bvec, meq=0)

Note the parameter meq is used to tell quadprog that first meq constraint conditions should be treated as equalities. Let's inspect the output of the quadprog solver:
> sol
$solution
[1] 0.1666667 1.8333333
$value
[1] -0.08333333
$unconstrained.solution
[1] -1.3333333  0.3333333
$iterations
[1] 2 0
$Lagrangian
[1] 1.5 0.0 0.0
$iact
[1] 1
The point $(1/6,11/6)$ is the unique minimizer of $Q(x,y)$ subject to the constraint conditions. The point $(-4/3,1/3)$ is the unique minimum of $Q(x,y)$. The slots iterations, Lagrangian, and iact are diagnostics describing the performance of the quadprog algorithm. We'll provide a discussion of these values in a future post. Now, let's visualize the QP solution. For this, we superimpose the boundary of the feasible region on the contour plot of the surface $Q(x,y)$.
In this plot, dark green shading indicates lower altitude regions of the surface $Q(x,y)$, while lighter regions indicate higher altitudes. The red point is the global minimum of $Q(x,y)$ and the yellow point is the solution to the QP.
# Contour Plot with Feasible region overlay
require(lattice)
qp_sol <- sol$solution
uc_sol <- sol$unconstrained.solution
x <- seq(-2, 5.5, length.out = 500)
y <- seq(-1, 3.5, length.out = 500)
grid <- expand.grid(x=x, y=y)
grid$z <- with(grid, x^2 + y^2 -x*y +3*x -2*y + 4)
levelplot(z~x*y, grid,  cuts=30,
    panel = function(...){
      panel.levelplot(...)
      panel.polygon(c(2,5,-1),c(0,3,3), border=TRUE, lwd=4, col="transparent")
      panel.points(c(uc_sol[1],qp_sol[1]),
             c(uc_sol[2],qp_sol[2]),
             lwd=5, col=c("red","yellow"), pch=19)},
    colorkey=FALSE,
    col.regions = terrain.colors(30))

Example #2:

Suppose we have selected 10 stocks from which to build a portfolio $\Pi$. We want to determine how much of each stock to include in our portfolio.

The expected monthly return rate of our portfolio is $$ \overline{r_\Pi} = \sum_{i=1}^{10} w_i \overline{r_i} $$ where $\overline{r_i}$ is the mean monthly return rate on asset $i$ and $w_i$ is the fraction of the portfolio value due to asset $i$. Note that the portfolio weights $w_i$ satisfy the constraints $$ 0 \leq w_i \leq 1 \qquad \qquad \sum_{i=1}^{10} w_i = 1. $$ In practice, we can only estimate the average returns $\overline{r_i}$ using past price data. This is a snap using R's quantmod package:
# Get monthly return data from 2012 through 2013
require(quantmod)
myStocks <- c("AAPL","XOM","GOOG","MSFT","GE","JNJ","WMT","CVX","PG","WF")
getSymbols(myStocks ,src='yahoo')
returnsList <- lapply(myStocks, 
                   function(s) periodReturn(eval(parse(text=s)),
                                 period='monthly', subset='2012::2013'))
The data frame returns.df contains a time series of monthly returns for each of the 10 specified ticker symbols. Let's plot the monthly returns:
# Plot monthly return data
require(ggplot2)
require(reshape2)
returns2 <- as.data.frame(returns.df)
returns2$date <- row.names(returns2)
returns2 <- melt(returns2, id="date")
ggplot(returns2, aes(x=date,y=value, group=variable)) +
    geom_line(aes(color=variable), lwd=1.5) +
    ylab("Monthly Return")+ xlab("Date") +
    theme(axis.text.x = element_text(angle = 90, hjust = 1))

From this plot, we can see that there is significant fluctuation in return rates. This suggests that the variance of $r_i$ and covariances of $r_i$ and $r_j$ should play a role in our analysis. In effect, these variances and covariances indicate how likely we are to actually observe a portfolio return close to our expected return $\overline{r_\Pi}$.

To take into account the risk of deviations in our portfolio return, define the quadratic form $$Q(\vec{w}) = \vec{w}^T C \vec{w}, $$ where $C$ is the covariance matrix of the returns $r_i$. To solve the portfolio allocation problem, we'll try to determine the weights $\vec{w} = (w_1,...,w_{10})$ so that the risk function $Q(\vec{w})$ is minimized. But there are some restrictions to consider. In addition to requiring that $\sum_{i=1}^{10} w_i =1$ and $0 \leq w_i \leq 1$, we may also require a minimum return from the portfolio. For example, we might demand a minimum expected monthly return of 1%: $$ \sum_{i=1}^{10} w_i E(r_i) \geq .01.$$ We can prove that the covariance matrix $C$ is always symmetric positive definite (except in the case of perfect multicollinearity), so this constrained minimization problem is a quadratic programming problem of the type that can be handled by quadprog.

Let's now describe how to implement the quadprog solver for this problem. First, compute the average returns and covariance matrix in R:
# Compute the average returns and covariance matrix of the return series
r <- matrix(colMeans(returns.df), nrow=1)
C <- cov(returns.df)
The constraints in this case are a little bit more tricky than in Example #1. For quadprog, all of our constraints must be expressed in a linear system of the form $A^T \vec{w} \geq f$. The system should be arranged so that any equality constraints appear in the first $m$ rows of the system. We build up the matrix $A^T$ step by step using rbind and applying a transpose at the very end.

To enforce the sum to one condition we need: $$ \left[ \begin{matrix} 1 & 1 & \cdots 1 \end{matrix} \right] \vec{w} = 1. $$ This equality condition should appear first in the system. To enforce the minimum expected return, we require $r \cdot \vec{w} \geq .01$ where $r$ is the row of average return rates obtained from the dataset. To force $0 \leq w_i \leq 1$, we require $$ I_{10} \vec{w} \geq 0 \qquad \qquad -I_{10} \vec{w} \geq - 1$$ where $I_{10}$ is the $10 \times 10$ identity matrix. Putting these steps together:
# Stage and solve the QP
require(quadprog)
A    <- matrix(1,1,10)
A    <- rbind(A, r, diag(10),-diag(10))
f    <- c(1, 0.01, rep(0,10),rep(-1,10))
sol <- solve.QP(Dmat=C, dvec = rep(0,10), Amat=t(A), bvec=f, meq=1)
Let's inspect the optimal allocation:
require(ggplot2)
portfolio <- data.frame(name = myStocks, w = round(sol$solution,3))
ggplot(portfolio, aes(x=name, y=w)) + geom_bar(stat="identity", fill="blue")

The expected return from this allocation is about 1.2%. It is instructive to solve the quadratic program with different minimum return requirements. For example, there is a solution to this problem with minimum required expected return greater than or equal to 2% but no solution with minimum required expected return greater than or equal to 3%. What is key to note is that the solution with the 2% restriction has a higher value of $Q(\vec{w})$ (more risk) compared to the lower risk solution to the problem with the 1% restriction. As we'd expect, quadratic programming doesn't allow us to escape the risk-return trade-off; it only provides the lowest risk allocation for a given minimum expected return requirement.

To leave a comment for the author, please follow the link and comment on their blog: quantitate.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.