|On this page…|
First-order optimality is a measure of how close a point x is to optimal. Most Optimization Toolbox™ solvers use this measure, though it has different definitions for different algorithms. First-order optimality is a necessary condition, but it is not a sufficient condition. In other words:
The first-order optimality measure must be zero at a minimum.
A point with first-order optimality equal to zero is not necessarily a minimum.
For general information about first-order optimality, see Nocedal and Wright . For specifics about the first-order optimality measures for Optimization Toolbox solvers, see Unconstrained Optimality, Constrained Optimality Theory, and Constrained Optimality in Solver Form.
The TolFun tolerance relates to the first-order optimality measure. Typically, if the first-order optimality measure is less than TolFun, solver iterations end.
Some solvers or algorithms use relative first-order optimality as a stopping criterion. Solver iterations end if the first-order optimality measure is less than μ times TolFun, where μ is either:
The infinity norm (maximum) of the gradient of the objective function at x0
The infinity norm (maximum) of inputs to the solver, such as f or b in linprog or H in quadprog
A relative measure attempts to account for the scale of a problem. Multiplying an objective function by a very large or small number does not change the stopping condition for a relative stopping criterion, but does change it for an unscaled one.
Solvers with enhanced exit messages state, in the stopping criteria details, when they use relative first-order optimality.
For a smooth unconstrained problem,
the first-order optimality measure is the infinity norm (meaning maximum absolute value) of ∇f(x), which is:
This measure of optimality is based on the familiar condition for a smooth function to achieve a minimum: its gradient must be zero. For unconstrained problems, when the first-order optimality measure is nearly zero, the objective function has gradient nearly zero, so the objective function could be near a minimum. If the first-order optimality measure is not small, the objective function is not minimal.
This section summarizes the theory behind the definition of first-order optimality measure for constrained problems. The definition as used in Optimization Toolbox functions is in Constrained Optimality in Solver Form.
For a smooth constrained problem, let g and h be vector functions representing all inequality and equality constraints respectively (meaning bound, linear, and nonlinear constraints):
The meaning of first-order optimality in this case is more complex than for unconstrained problems. The definition is based on the Karush-Kuhn-Tucker (KKT) conditions. The KKT conditions are analogous to the condition that the gradient must be zero at a minimum, modified to take constraints into account. The difference is that the KKT conditions hold for constrained problems.
The vector λ, which is the concatenation of λg and λh, is the Lagrange multiplier vector. Its length is the total number of constraints.
Solvers do not use the three expressions in Equation 3-4 in the calculation of optimality measure.
The optimality measure associated with Equation 3-2 is
The optimality measure associated with Equation 3-3 is
where the norm in Equation 3-6 means infinity norm (maximum) of the vector .
The combined optimality measure is the maximum of the values calculated in Equation 3-5 and Equation 3-6. Solvers that accept nonlinear constraint functions report constraint violations g(x) > 0 or |h(x)| > 0 as TolCon tolerance violations. See Tolerances and Stopping Criteria.
Most constrained toolbox solvers separate their calculation of first-order optimality measure into bounds, linear functions, and nonlinear functions. The measure is the maximum of the following two norms, which correspond to Equation 3-5 and Equation 3-6:
where the norm of the vectors in Equation 3-7 and Equation 3-8 is the infinity norm (maximum). The subscripts on the Lagrange multipliers correspond to solver Lagrange multiplier structures. See Lagrange Multiplier Structures. The summations in Equation 3-7 range over all constraints. If a bound is ±Inf, that term is not constrained, so it is not part of the summation.
For some large-scale problems with only linear equalities, the first-order optimality measure is the infinity norm of the projected gradient. In other words, the first-order optimality measure is the size of the gradient projected onto the null space of Aeq.
For least-squares solvers and trust-region-reflective algorithms, in problems with bounds alone, the first-order optimality measure is the maximum over i of |vi*gi|. Here gi is the ith component of the gradient, x is the current point, and
If xi is at a bound, vi is zero. If xi is not at a bound, then at a minimizing point the gradient gi should be zero. Therefore the first-order optimality measure should be zero at a minimizing point.