Quadratic programming is the problem of finding a vector *x* that
minimizes a quadratic function, possibly subject to linear constraints:

$$\underset{x}{\mathrm{min}}\frac{1}{2}{x}^{T}Hx+{c}^{T}x$$

such that *A·x* ≤ *b*, *Aeq·x* = *beq*, *l* ≤ *x* ≤ *u*.

`interior-point-convex` `quadprog` Algorithm

The `interior-point-convex` algorithm performs the following steps:

The algorithm begins by attempting to simplify the problem by removing redundancies and simplifying constraints. In particular, the presolve portion performs the following tasks, among others:

Check if any variables have equal upper and lower bounds. If so, check for feasibility, then fix and remove the variables.

Check if any linear inequality constraints involve just one variable. If so, check for feasibility, and change the linear constraint to a bound.

Check if any linear equality constraints involve just one variable. If so, check for feasibility, then fix and remove the variable.

Check if any linear constraint matrix has zero rows. If so, check for feasibility, and delete the rows.

Check if the bounds and linear constraints are consistent.

Check if any variables appear only as linear terms in the objective function and do not appear in any linear constraint. If so, check for feasibility and boundedness, and fix the variables at their appropriate bounds.
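As an illustration, the first of these presolve checks (variables with equal upper and lower bounds) can be sketched in NumPy. The helper name and return convention are hypothetical, not `quadprog`'s actual implementation:

```python
import numpy as np

def presolve_fixed_vars(lb, ub, tol=1e-12):
    """Detect variables whose lower and upper bounds are equal.

    Returns (fixed_mask, fixed_values). A full presolve would also
    substitute the fixed values into the objective and constraints and
    remove the variables; this sketch only identifies them, and raises
    an error if the bounds themselves are infeasible.
    """
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    if np.any(lb > ub + tol):
        raise ValueError("infeasible bounds: lb > ub")
    fixed = np.abs(ub - lb) <= tol
    return fixed, np.where(fixed, lb, np.nan)

# Variables 1 and 3 have lb == ub, so they can be fixed and removed.
fixed, vals = presolve_fixed_vars([0.0, 1.0, -2.0], [0.0, 3.0, -2.0])
```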

In the presolve step the algorithm might detect an infeasible or unbounded problem. If so, the algorithm halts and issues an appropriate exit message.

If the algorithm does not detect an infeasible or unbounded problem in the presolve step, it continues with the other steps. At the end, it reconstructs the original problem, undoing any presolve transformations. This final step is the postsolve step.

For details, see Gould and Toint [63].

The initial point `x0` for the algorithm is:

Initialize `x0` to `ones(n,1)`, where `n` is the number of rows in *H*.

For components that have both an upper bound `ub` and a lower bound `lb`, if a component of `x0` is not strictly inside the bounds, the component is set to `(ub + lb)/2`.

For components that have only one bound, modify the component if necessary to lie strictly inside the bound.
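This initial-point rule can be sketched in NumPy. The amount by which a one-sided-bounded component is shifted inside its bound (here 1.0) is an assumption, since the text says only "strictly inside":

```python
import numpy as np

def initial_point(n, lb, ub):
    """Sketch of the documented x0 rule: start from ones(n,1), then move
    each component strictly inside its bounds. lb/ub may contain +-inf."""
    x0 = np.ones(n)
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    for i in range(n):
        has_lb, has_ub = np.isfinite(lb[i]), np.isfinite(ub[i])
        if has_lb and has_ub:
            if not (lb[i] < x0[i] < ub[i]):
                x0[i] = (ub[i] + lb[i]) / 2
        elif has_ub and x0[i] >= ub[i]:
            x0[i] = ub[i] - 1.0   # shift amount is an assumption
        elif has_lb and x0[i] <= lb[i]:
            x0[i] = lb[i] + 1.0   # shift amount is an assumption
    return x0

# Unbounded, doubly bounded, and lower-bounded components:
x0 = initial_point(3, [-np.inf, 2.0, 5.0], [np.inf, 4.0, np.inf])
```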

Similar to the `fmincon` interior-point algorithm, the `interior-point-convex` algorithm tries to find a point where the Karush-Kuhn-Tucker (KKT) conditions hold. For the quadratic programming problem described in Quadratic Programming Definition, these conditions are:

$$\begin{array}{c}Hx+c-{A}_{eq}^{T}y-{\overline{A}}^{T}z=0\\ \overline{A}x-\overline{b}-s=0\\ {A}_{eq}x-{b}_{eq}=0\\ {s}_{i}{z}_{i}=0,\text{}i=1,2,\mathrm{...},m\\ s\ge 0\text{\hspace{0.17em}}\\ z\ge 0.\end{array}$$

Here

$$\overline{A}$$ is the extended linear inequality matrix that includes bounds written as linear inequalities. $$\overline{b}$$ is the corresponding linear inequality vector, including bounds.

*s* is the vector of slacks that convert inequality constraints to equalities. *s* has length *m*, the number of linear inequalities and bounds.

*z* is the vector of Lagrange multipliers corresponding to *s*.

*y* is the vector of Lagrange multipliers associated with the equality constraints.

The algorithm first predicts a step from the Newton-Raphson
formula, then computes a corrector step. The corrector attempts to
better enforce the nonlinear constraint *s _{i}z_{i}* = 0.

Definitions for the predictor step:

*r*_{d}, the dual residual: $${r}_{d}=Hx+c-{A}_{eq}^{T}y-{\overline{A}}^{T}z.$$

*r*_{eq}, the primal equality constraint residual: $${r}_{eq}={A}_{eq}x-{b}_{eq}.$$

*r*_{ineq}, the primal inequality constraint residual, which includes bounds and slacks: $${r}_{ineq}=\overline{A}x-\overline{b}-s.$$

*r*_{sz}, the complementarity residual: *r*_{sz} = *Sz*, where *S* is the diagonal matrix of slack terms and *z* is the column vector of Lagrange multipliers.

*r*_{c}, the average complementarity: $${r}_{c}=\frac{{s}^{T}z}{m}.$$

In a Newton step, the changes in *x*, *s*, *y*,
and *z*, are given by:

$$\left(\begin{array}{cccc}H& 0& -{A}_{eq}^{T}& -{\overline{A}}^{T}\\ {A}_{eq}& 0& 0& 0\\ \overline{A}& -I& 0& 0\\ 0& Z& 0& S\end{array}\right)\left(\begin{array}{c}\Delta x\\ \Delta s\\ \Delta y\\ \Delta z\end{array}\right)=-\left(\begin{array}{c}{r}_{d}\\ {r}_{eq}\\ {r}_{ineq}\\ {r}_{sz}\end{array}\right).$$

However, a full Newton step might be infeasible, because of the positivity constraints on *s* and *z*. Therefore, `quadprog` shortens the step, if necessary, to maintain positivity.
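A tiny NumPy sketch of one such Newton (affine-scaling) step follows. The data is illustrative; equality constraints are dropped for brevity, and the internal inequality form is assumed to be Āx ≥ b̄ so that the slack s = Āx − b̄ is nonnegative, consistent with the residual definitions above. The step-shortening fraction 0.995 is also an assumption:

```python
import numpy as np

# Tiny QP with one inequality: min 0.5 x'Hx + c'x  s.t.  Abar x >= bbar.
H = np.array([[2.0, 0.0], [0.0, 2.0]])
c = np.array([-2.0, -6.0])
Abar = np.array([[1.0, 0.0]])      # x1 >= 0 written as Abar x >= bbar
bbar = np.array([0.0])
m, n = Abar.shape

x = np.array([1.0, 1.0])
s = Abar @ x - bbar                # strictly positive slack
z = np.array([1.0])                # strictly positive multiplier

r_d = H @ x + c - Abar.T @ z       # dual residual
r_ineq = Abar @ x - bbar - s       # primal inequality residual (zero here)
r_sz = s * z                       # complementarity residual

# Assemble the block Newton system (no equality constraints):
# [ H     0   -Abar^T ] [dx]     [r_d   ]
# [ Abar  -I   0      ] [ds] = - [r_ineq]
# [ 0     Z    S      ] [dz]     [r_sz  ]
K = np.block([
    [H,                np.zeros((n, m)), -Abar.T],
    [Abar,            -np.eye(m),         np.zeros((m, m))],
    [np.zeros((m, n)), np.diag(z),        np.diag(s)],
])
step = np.linalg.solve(K, -np.concatenate([r_d, r_ineq, r_sz]))
dx, ds, dz = step[:n], step[n:n + m], step[n + m:]

# Shorten the step, if necessary, to keep s and z strictly positive.
alpha = 1.0
for v, dv in ((s, ds), (z, dz)):
    neg = dv < 0
    if np.any(neg):
        alpha = min(alpha, 0.995 * np.min(-v[neg] / dv[neg]))
x, s, z = x + alpha * dx, s + alpha * ds, z + alpha * dz
```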

Additionally, to maintain a "centered" position in the interior, instead of trying to solve *s*_{i}*z*_{i} = 0, the algorithm takes a positive parameter σ and tries to solve

*s*_{i}*z*_{i} = σ*r*_{c},

where *r*_{c} is the average complementarity defined previously. Accordingly, `quadprog` replaces *r*_{sz} in the Newton step equation with *r*_{sz} + Δ*s*Δ*z* – σ*r*_{c}**1**, where Δ*s* and Δ*z* are the solutions of the predictor step and **1** is the vector of ones. Also, `quadprog` reorders the Newton equations to obtain a symmetric, more numerically stable system for the predictor step calculation. For details, see Mehrotra [47].

After calculating the corrected Newton step, `quadprog` can perform more calculations to get both a longer current step and to prepare for better subsequent steps. These multiple correction calculations can improve both performance and robustness. For details, see Gondzio [62].

`quadprog` calculates a *merit function* *φ* at every iteration. The merit function is a measure of feasibility, and is also called *total relative error*. `quadprog` stops if the merit function grows too large. In this case, `quadprog` declares the problem to be infeasible.

The merit function is related to the KKT conditions for the problem—see Predictor-Corrector. Use the following definitions:

$$\begin{array}{c}\rho =\mathrm{max}\left(1,\Vert H\Vert ,\Vert \overline{A}\Vert ,\Vert {A}_{eq}\Vert ,\Vert c\Vert ,\Vert \overline{b}\Vert ,\Vert {b}_{eq}\Vert \right)\\ {r}_{\text{eq}}={A}_{\text{eq}}x-{b}_{\text{eq}}\\ {r}_{\text{ineq}}=\overline{A}x-\overline{b}+s\\ {r}_{\text{d}}=Hx+c+{A}_{\text{eq}}^{T}{\lambda}_{\text{eq}}+{\overline{A}}^{T}{\overline{\lambda}}_{\text{ineq}}\\ g={x}^{T}Hx+{c}^{T}x-{\overline{b}}^{T}{\overline{\lambda}}_{\text{ineq}}-{b}_{\text{eq}}^{T}{\lambda}_{\text{eq}}.\end{array}$$

The notation $$\overline{A}$$ and $$\overline{b}$$ means the linear inequality
coefficients, augmented with terms to represent bounds. The notation $${\overline{\lambda}}_{\text{ineq}}$$ similarly represents Lagrange
multipliers for the linear inequality constraints, including bound
constraints. This was called *z* in Predictor-Corrector, and $${\lambda}_{\text{eq}}$$ was called *y*.

The merit function *φ* is

$$\frac{1}{\rho}\left(\mathrm{max}\left({\Vert {r}_{\text{eq}}\Vert}_{\infty},{\Vert {r}_{\text{ineq}}\Vert}_{\infty},{\Vert {r}_{\text{d}}\Vert}_{\infty}\right)+g\right).$$
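The merit function can be assembled directly from the definitions above. In this NumPy sketch, the helper name is hypothetical, and the norm used for ρ is an assumption (the text does not specify one; `np.linalg.norm` gives the Frobenius norm for matrices):

```python
import numpy as np

def merit_phi(H, c, Abar, bbar, Aeq, beq, x, s, lam_ineq, lam_eq):
    """Total relative error phi, assembled from the definitions above."""
    rho = max(1.0,
              np.linalg.norm(H), np.linalg.norm(Abar), np.linalg.norm(Aeq),
              np.linalg.norm(c), np.linalg.norm(bbar), np.linalg.norm(beq))
    r_eq = Aeq @ x - beq
    r_ineq = Abar @ x - bbar + s
    r_d = H @ x + c + Aeq.T @ lam_eq + Abar.T @ lam_ineq
    g = x @ H @ x + c @ x - bbar @ lam_ineq - beq @ lam_eq
    err = max(np.max(np.abs(r), initial=0.0) for r in (r_eq, r_ineq, r_d))
    return (err + g) / rho

# One-variable check: min 0.5*x^2 subject to x = 1. At the solution
# x = 1 with lam_eq = -1, the residuals vanish, so phi reduces to g.
phi = merit_phi(H=np.array([[1.0]]), c=np.array([0.0]),
                Abar=np.zeros((0, 1)), bbar=np.zeros(0),
                Aeq=np.array([[1.0]]), beq=np.array([1.0]),
                x=np.array([1.0]), s=np.zeros(0),
                lam_ineq=np.zeros(0), lam_eq=np.array([-1.0]))
```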

The `quadprog` iterative display includes a column showing the merit function under the heading `Total relative error`.

`trust-region-reflective` `quadprog` Algorithm

Many of the methods used in Optimization Toolbox™ solvers are based on *trust regions,* a simple yet powerful concept in optimization.

To understand the trust-region approach to optimization, consider
the unconstrained minimization problem, minimize *f*(*x*),
where the function takes vector arguments and returns scalars. Suppose
you are at a point *x* in *n*-space
and you want to improve, i.e., move to a point with a lower function
value. The basic idea is to approximate *f* with
a simpler function *q*, which reasonably reflects
the behavior of function *f* in a neighborhood *N* around
the point *x*. This neighborhood is the trust region.
A trial step *s* is computed by minimizing (or approximately
minimizing) over *N*. This is the trust-region subproblem,

$$\underset{s}{\mathrm{min}}\left\{q(s),\text{}s\in N\right\}.$$ | (9-1) |

The current point is updated to be *x* + *s* if *f*(*x* + *s*) < *f*(*x*);
otherwise, the current point remains unchanged and *N*,
the region of trust, is shrunk and the trial step computation is repeated.

The key questions in defining a specific trust-region approach
to minimizing *f*(*x*) are how to
choose and compute the approximation *q* (defined
at the current point *x*), how to choose and modify
the trust region *N*, and how accurately to solve
the trust-region subproblem. This section focuses on the unconstrained
problem. Later sections discuss additional complications due to the
presence of constraints on the variables.

In the standard trust-region method ([48]), the quadratic approximation *q* is defined by the first two terms of the Taylor approximation to *f* at *x*; the neighborhood *N* is usually spherical or ellipsoidal in shape. Mathematically the trust-region subproblem is typically stated

$$\mathrm{min}\left\{\frac{1}{2}{s}^{T}Hs+{s}^{T}g\text{ such that }\Vert Ds\Vert \le \Delta \right\},$$ | (9-2) |

where *g* is the gradient of *f* at
the current point *x*, *H* is the
Hessian matrix (the symmetric matrix of second derivatives), *D* is
a diagonal scaling matrix, Δ is a positive scalar, and ∥
. ∥ is the 2-norm. Good algorithms exist for solving Equation 9-2 (see [48]); such algorithms typically involve the
computation of a full eigensystem and a Newton process applied to
the secular equation

$$\frac{1}{\Delta}-\frac{1}{\Vert s\Vert}=0.$$

Such algorithms provide an accurate solution to Equation 9-2. However, they
require time proportional to several factorizations of *H*.
Therefore, for large-scale problems a different approach is needed.
Several approximation and heuristic strategies, based on Equation 9-2, have been proposed
in the literature ([42] and [50]). The approximation approach followed
in Optimization Toolbox solvers is to restrict the trust-region
subproblem to a two-dimensional subspace *S* ([39] and [42]).
Once the subspace *S* has been computed, the work
to solve Equation 9-2 is
trivial even if full eigenvalue/eigenvector information is needed
(since in the subspace, the problem is only two-dimensional). The
dominant work has now shifted to the determination of the subspace.

The two-dimensional subspace *S* is
determined with the aid of a preconditioned
conjugate gradient process described below. The solver defines *S* as
the linear space spanned by *s*_{1} and *s*_{2},
where *s*_{1} is in the direction
of the gradient *g*, and *s*_{2} is
either an approximate Newton direction, i.e.,
a solution to

$$H\cdot {s}_{2}=-g,$$ | (9-3) |

or a direction of negative curvature,

$${s}_{2}^{T}\cdot H\cdot {s}_{2}<0.$$ | (9-4) |

The philosophy behind this choice of *S* is
to force global convergence (via the steepest descent direction or
negative curvature direction) and achieve fast local convergence (via
the Newton step, when it exists).

A sketch of unconstrained minimization using trust-region ideas is now easy to give:

1. Formulate the two-dimensional trust-region subproblem.

2. Solve Equation 9-2 to determine the trial step *s*.

3. If *f*(*x* + *s*) < *f*(*x*), then *x* = *x* + *s*.

4. Adjust Δ.

These four steps are repeated until convergence. The trust-region
dimension Δ is adjusted according to standard rules. In particular,
it is decreased if the trial step is not accepted, i.e., *f*(*x* + *s*)
≥ *f*(*x*).
See [46] and [49] for a discussion of this aspect.
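A toy version of this loop can be sketched in NumPy. This is not the two-dimensional subspace method: the subproblem is solved crudely (Newton or gradient step clipped to the region), and the adjustment rules for Δ (double on acceptance, quarter on rejection) are illustrative assumptions:

```python
import numpy as np

def trust_region_minimize(f, grad, hess, x, delta=1.0, tol=1e-8,
                          max_iter=100):
    """Toy trust-region iteration following the four steps above."""
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        H = hess(x)
        try:
            s = np.linalg.solve(H, -g)        # try the Newton step
        except np.linalg.LinAlgError:
            s = -g                            # fall back to steepest descent
        ns = np.linalg.norm(s)
        if ns > delta:                        # clip the trial step to N
            s = s * (delta / ns)
        if f(x + s) < f(x):                   # accept and grow the region
            x = x + s
            delta *= 2.0
        else:                                 # reject and shrink the region
            delta *= 0.25
    return x

# Usage: minimize f(x) = (x1 - 1)^2 + (x2 + 2)^2 from the origin.
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
grad = lambda x: np.array([2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)])
hess = lambda x: 2.0 * np.eye(2)
x_opt = trust_region_minimize(f, grad, hess, np.zeros(2))
```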

Optimization Toolbox solvers treat a few important special
cases of *f* with specialized functions: nonlinear
least-squares, quadratic functions, and linear least-squares. However,
the underlying algorithmic ideas are the same as for the general case.
These special cases are discussed in later sections.

The subspace trust-region method is used to determine a search direction. However, instead of restricting the step to (possibly) one reflection step, as in the nonlinear minimization case, a piecewise reflective line search is conducted at each iteration. See [45] for details of the line search.

A popular way to solve large symmetric positive definite systems
of linear equations *Hp* = –*g* is the
method of Preconditioned Conjugate Gradients (PCG). This iterative
approach requires the ability to calculate matrix-vector products
of the form *H·v* where *v* is
an arbitrary vector. The symmetric positive definite matrix *M *is
a* preconditioner* for *H*.
That is, *M* = *C*^{2},
where *C*^{–1}*HC*^{–1} is
a well-conditioned matrix or a matrix with clustered eigenvalues.

In a minimization context, you can assume that the Hessian matrix *H* is symmetric. However, *H* is guaranteed to be positive definite only in the neighborhood of a strong minimizer. Algorithm PCG exits when a direction of negative (or zero) curvature is encountered, i.e., *d*^{T}*Hd* ≤ 0. The PCG output direction *p* is either a direction of negative curvature or an approximate solution to the Newton system *Hp* = –*g*. In either case, *p* helps to define the two-dimensional subspace used in the trust-region approach.
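A minimal PCG sketch with this negative-curvature exit follows; the interface (a callable applying the inverse preconditioner, and a flag distinguishing the two output cases) is an assumption for illustration:

```python
import numpy as np

def pcg(H, g, M_inv, tol=1e-10, max_iter=50):
    """Preconditioned CG for H p = -g that exits when a direction of
    negative (or zero) curvature is encountered, returning that
    direction instead. M_inv applies the inverse preconditioner."""
    n = g.size
    p = np.zeros(n)
    r = -g.copy()                 # residual of H p = -g at p = 0
    y = M_inv(r)
    d = y.copy()
    rs = r @ y
    for _ in range(max_iter):
        Hd = H @ d
        curv = d @ Hd
        if curv <= 0:             # negative/zero curvature: return d itself
            return d, False
        alpha = rs / curv
        p = p + alpha * d
        r = r - alpha * Hd
        if np.linalg.norm(r) < tol:
            return p, True
        y = M_inv(r)
        rs_new = r @ y
        d = y + (rs_new / rs) * d
        rs = rs_new
    return p, True

# Usage on a small SPD system with a Jacobi (diagonal) preconditioner:
H = np.array([[4.0, 1.0], [1.0, 3.0]])
g = np.array([1.0, 2.0])
p, solved = pcg(H, g, M_inv=lambda r: r / np.diag(H))
```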

Linear constraints complicate the situation described for unconstrained minimization. However, the underlying ideas described previously can be carried through in a clean and efficient way. The trust-region methods in Optimization Toolbox solvers generate strictly feasible iterates.

The general linear equality constrained minimization problem can be written

$$\mathrm{min}\left\{f(x)\text{ such that }Ax=b\right\},$$ | (9-5) |

where *A* is an *m*-by-*n* matrix (*m* ≤ *n*). Some Optimization Toolbox solvers preprocess *A* to remove strict linear dependencies using a technique based on the LU factorization of *A*^{T} [46]. Here *A* is assumed to be of rank *m*.

The method used to solve Equation 9-5 differs from the unconstrained approach
in two significant ways. First, an initial feasible point *x*_{0} is
computed, using a sparse least-squares step, so that *Ax*_{0} = *b*.
Second, Algorithm PCG is replaced with Reduced Preconditioned Conjugate
Gradients (RPCG), see [46], in order
to compute an approximate reduced Newton step (or a direction of negative
curvature in the null space of *A*). The key linear
algebra step involves solving systems of the form

$$\left[\begin{array}{cc}C& {\tilde{A}}^{T}\\ \tilde{A}& 0\end{array}\right]\left[\begin{array}{c}s\\ t\end{array}\right]=\left[\begin{array}{c}r\\ 0\end{array}\right],$$ | (9-6) |

where $$\tilde{A}$$ approximates *A* (small
nonzeros of *A* are set to zero provided rank is
not lost) and *C* is a sparse symmetric positive-definite
approximation to *H*, i.e., *C* = *H*.
See [46] for more details.

The box constrained problem is of the form

$$\mathrm{min}\left\{f(x)\text{ such that }l\le x\le u\right\},$$ | (9-7) |

where *l* is a vector of lower bounds, and *u* is
a vector of upper bounds. Some (or all) of the components of *l* can
be equal to –∞ and some (or all) of the components of *u* can
be equal to ∞. The method generates a sequence of strictly
feasible points. Two techniques are used to maintain feasibility while
achieving robust convergence behavior. First, a scaled modified Newton
step replaces the unconstrained Newton step (to define the two-dimensional
subspace *S*). Second, reflections
are used to increase the step size.

The scaled modified Newton step arises from examining the Kuhn-Tucker necessary conditions for Equation 9-7,

$${\left(D(x)\right)}^{-2}g=0,$$ | (9-8) |

where

$$D(x)=\text{diag}\left({\left|{v}_{k}\right|}^{-1/2}\right),$$

and the vector *v*(*x*) is
defined below, for each 1 ≤ *i* ≤ *n*:

If *g*_{i} < 0 and *u*_{i} < ∞ then *v*_{i} = *x*_{i} – *u*_{i}

If *g*_{i} ≥ 0 and *l*_{i} > –∞ then *v*_{i} = *x*_{i} – *l*_{i}

If *g*_{i} < 0 and *u*_{i} = ∞ then *v*_{i} = –1

If *g*_{i} ≥ 0 and *l*_{i} = –∞ then *v*_{i} = 1
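The four cases above translate directly into code. This NumPy sketch (hypothetical helper name) builds *v*(*x*) and the scaling matrix *D*(*x*):

```python
import numpy as np

def scaling_vector(g, x, l, u):
    """Build v(x) from the four cases above; D(x) = diag(|v|**(-1/2))."""
    v = np.empty_like(x)
    for i in range(x.size):
        if g[i] < 0:
            # gradient pushes toward the upper bound
            v[i] = x[i] - u[i] if np.isfinite(u[i]) else -1.0
        else:
            # gradient pushes toward the lower bound
            v[i] = x[i] - l[i] if np.isfinite(l[i]) else 1.0
    return v

# One variable per case:
g = np.array([-1.0, 2.0, -3.0, 0.5])
x = np.zeros(4)
l = np.array([-1.0, -2.0, -np.inf, -np.inf])
u = np.array([5.0, np.inf, np.inf, 3.0])
v = scaling_vector(g, x, l, u)
D = np.diag(np.abs(v) ** -0.5)
```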

The nonlinear system Equation 9-8 is not differentiable everywhere. Nondifferentiability occurs when *v*_{i} = 0. You can avoid such points by maintaining strict feasibility, i.e., restricting *l* < *x* < *u*.

The scaled modified Newton step *s _{k}* for
the nonlinear system of equations given by Equation 9-8 is defined as the solution to the
linear system

$$\widehat{M}D{s}^{N}=-\widehat{g}$$ | (9-9) |

at the *k*th iteration, where

$$\widehat{g}={D}^{-1}g=\text{diag}\left({\left|v\right|}^{1/2}\right)g,$$ | (9-10) |

and

$$\widehat{M}={D}^{-1}H{D}^{-1}+\text{diag}(g){J}^{v}.$$ | (9-11) |

Here *J*^{v} plays the role of the Jacobian of |*v*|. Each diagonal component of the diagonal matrix *J*^{v} equals 0, –1, or 1. If all components of *l* and *u* are finite, *J*^{v} = diag(sign(*g*)).

Second, reflections are used to increase the step size. A (single) reflection step is defined as follows. Given a step *p* that intersects a bound constraint, consider the first bound constraint crossed by *p*; assume it is the *i*th bound constraint (either the *i*th upper or *i*th lower bound). Then the reflection step *p*^{R} = *p* except in the *i*th component, where *p*^{R}_{i} = –*p*_{i}.
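A single reflection can be sketched in NumPy as follows: march along *p* to the first bound crossed, then continue for the remainder of the step with the *i*th component of the direction negated. The helper name and the handling of steps that cross no bound are assumptions:

```python
import numpy as np

def reflect_step(x, p, l, u):
    """Single reflection of step p off the first bound it crosses."""
    # Fraction of the step at which each bound is first hit.
    with np.errstate(divide="ignore", invalid="ignore"):
        t_u = np.where(p > 0, (u - x) / p, np.inf)
        t_l = np.where(p < 0, (l - x) / p, np.inf)
    t = np.minimum(t_u, t_l)
    i = int(np.argmin(t))
    if not np.isfinite(t[i]) or t[i] >= 1.0:
        return x + p                      # no bound crossed: plain step
    pR = p.copy()
    pR[i] = -pR[i]                        # reflected direction
    return x + t[i] * p + (1.0 - t[i]) * pR

# The step crosses the upper bound on x1 halfway and bounces back:
x = np.array([0.0, 0.0])
p = np.array([2.0, 0.5])
l = np.array([-1.0, -1.0])
u = np.array([1.0, 1.0])
x_new = reflect_step(x, p, l, u)
```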

`active-set` `quadprog` Algorithm

Recall the problem `quadprog` addresses:

$$\underset{x}{\mathrm{min}}\frac{1}{2}{x}^{T}Hx+{c}^{T}x$$ | (9-12) |

such that *A·x* ≤ *b*, *Aeq·x* = *beq*, and *l* ≤ *x* ≤ *u*. *m* is
the total number of linear constraints, the sum of number of rows
of *A* and of *Aeq*.

The `quadprog` `active-set` algorithm is an active-set strategy (also known as a projection method) similar to that of Gill et al., described in [18] and [17]. It has been modified for both Linear Programming (LP) and Quadratic Programming (QP) problems.

The solution procedure involves two phases. The first phase involves the calculation of a feasible point (if one exists). The second phase involves the generation of an iterative sequence of feasible points that converge to the solution.

In this method an active set matrix, *S*_{k}, is maintained that is an estimate of the active constraints (i.e., those that are on the constraint boundaries) at the solution point. Specifically, the active set *S*_{k} is updated at each iteration *k* and is used to form a basis for a search direction *d*_{k}.

The matrix *Z*_{k} is formed from the last *m* – *l* columns of the QR decomposition of the matrix *S*_{k}^{T}, where *l* is the number of active constraints and *l* < *m*. That is, *Z*_{k} is given by
$${Z}_{k}=Q\left[:,l+1:m\right],$$ | (9-13) |

where

$${Q}^{T}{S}_{k}^{T}=\left[\begin{array}{c}R\\ 0\end{array}\right].$$
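This construction can be sketched with NumPy's full QR decomposition (the data is illustrative, and the slicing is 0-based where Equation 9-13 is 1-based):

```python
import numpy as np

# Null-space basis for the active constraints: the columns of Z_k span
# the directions along which the active constraints are preserved.
# Illustrative data: 3 variables, 1 active constraint x1 + x2 + x3 = const.
Sk = np.array([[1.0, 1.0, 1.0]])            # active constraint rows, l-by-n
l_act, n = Sk.shape

Q, R = np.linalg.qr(Sk.T, mode="complete")  # Q is n-by-n orthogonal
Zk = Q[:, l_act:]                           # last n - l columns (0-based)

# Every column of Zk is orthogonal to the active constraint rows:
residual = np.linalg.norm(Sk @ Zk)
```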

Once *Z*_{k} is found, a search direction *d*_{k} is sought that minimizes the objective function, where *d*_{k} is in the null space of the active constraints. That is, *d*_{k} is a linear combination of the columns of *Z*_{k}: *d*_{k} = *Z*_{k}*p* for some vector *p*.

Then if you view the quadratic objective function as a function
of *p*, by substituting for *d _{k}*,
the result is

$$q(p)=\frac{1}{2}{p}^{T}{Z}_{k}^{T}H{Z}_{k}p+{c}^{T}{Z}_{k}p.$$ | (9-14) |

Differentiating this with respect to *p* yields

$$\nabla q(p)={Z}_{k}^{T}H{Z}_{k}p+{Z}_{k}^{T}c.$$ | (9-15) |

∇*q*(*p*) is
referred to as the projected gradient of the quadratic function because
it is the gradient projected in the subspace defined by *Z _{k}*.
The term $${Z}_{k}^{T}H{Z}_{k}$$ is called the projected Hessian.
Assuming the Hessian matrix *H* is positive definite, the minimum of the function *q*(*p*) in the subspace defined by *Z*_{k} occurs when ∇*q*(*p*) = 0, which is the solution of the system of linear equations

$${Z}_{k}^{T}H{Z}_{k}p=-{Z}_{k}^{T}c.$$ | (9-16) |

The next step is

$${x}_{k+1}={x}_{k}+\alpha {d}_{k},\text{ where }{d}_{k}={Z}_{k}p.$$ | (9-17) |

At each iteration, because of the quadratic nature of the objective function, there are only two choices of step length *α*. A step of unity along *d*_{k} is the exact step to the minimum of the function restricted to the null space of *S*_{k}. If such a step can be taken without violating the constraints, then this is the solution to QP (Equation 9-12). Otherwise, the step along *d*_{k} to the nearest constraint is less than unity, and a new constraint is included in the active set at the next iteration. The distance to the constraint boundaries in any direction *d*_{k} is given by

$$\alpha =\underset{i\in \left\{1,\mathrm{...},m\right\}}{\mathrm{min}}\left\{\frac{-\left({A}_{i}{x}_{k}-{b}_{i}\right)}{{A}_{i}{d}_{k}}\right\},$$ | (9-18) |

which is defined for constraints not in the active set, and
where the direction *d _{k}* is
towards the constraint boundary, i.e., $${A}_{i}{d}_{k}>0,\text{}i=1,\mathrm{...},m$$.
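Equation 9-18 can be sketched directly in NumPy (hypothetical helper; constraints in the form *A*·*x* ≤ *b*, active constraints excluded). The step actually taken is then the smaller of unity and this distance:

```python
import numpy as np

def step_to_boundary(A, b, x, d, active):
    """Distance along d to the nearest inactive constraint boundary
    (Equation 9-18), considering only constraints that d moves toward
    (A_i d > 0). Returns inf if no such constraint exists."""
    alpha = np.inf
    for i in range(A.shape[0]):
        Ad = A[i] @ d
        if i in active or Ad <= 0:   # skip active / receding constraints
            continue
        alpha = min(alpha, -(A[i] @ x - b[i]) / Ad)
    return alpha

# Usage: x1 <= 2 and x2 <= 3, moving diagonally from the origin.
A = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([2.0, 3.0])
alpha = step_to_boundary(A, b, np.zeros(2), np.array([1.0, 1.0]),
                         active=set())
```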

Lagrange multipliers, *λ _{k}*,
are calculated that satisfy the nonsingular set of linear equations

$${S}_{k}^{T}{\lambda}_{k}=c.$$ | (9-19) |

If all elements of *λ*_{k} are positive, *x*_{k} is the optimal solution of QP (Equation 9-12). However, if any component of *λ*_{k} is negative, and the component does not correspond to an equality constraint, then the corresponding element is deleted from the active set.

The algorithm requires a feasible point to start. If the initial point is not feasible, then you can find a feasible point by solving the linear programming problem

$$\begin{array}{l}\underset{\gamma \in \Re ,\text{}x\in {\Re}^{n}}{\mathrm{min}}\gamma \text{ such that }\\ {A}_{i}x={b}_{i},\text{}i=1,\mathrm{...},{m}_{e}\text{ (the rows of }Aeq)\\ {A}_{i}x-\gamma \le {b}_{i},\text{}i={m}_{e}+1,\mathrm{...},m\text{ (the rows of }A).\end{array}$$ | (9-20) |

The notation *A*_{i} indicates the *i*th row of the matrix of combined constraints.

You can modify the preceding QP algorithm
for LP problems by setting the search direction *d* to
the steepest descent direction at each iteration, where *g _{k}* is
the gradient of the objective function (equal to the coefficients
of the linear objective function):

$$d=-{Z}_{k}{Z}_{k}^{T}{g}_{k}.$$ | (9-21) |

If a feasible point is found using the preceding LP method, the main QP phase is entered. The search direction *d*_{k} is initialized with a search direction *d*_{1} found by solving the set of linear equations
$$H{d}_{1}=-{g}_{k},$$ | (9-22) |

where *g*_{k} is the gradient of the objective function at the current iterate *x*_{k} (i.e., *Hx*_{k} + *c*).
