Computing good directions of negative curvature in optimization

Erik G. Boman

SCCM, Gates Bldg. 2B
Stanford University
Stanford, CA 94305-9025

Co-author: Walter Murray, Stanford

Abstract

When solving a nonconvex optimization problem using a modified Newton type algorithm, a descent direction may not exist near a first order optimality point. In these cases it is important to find a direction of negative curvature, i.e. a direction d such that dHd<0, where H is the Hessian of the objective function. Such a d exists if and only if H has at least one negative eigenvalue. A good direction of negative curvature has negative curvature bounded away from zero, and should preferably be close to the smallest eigenvalue.

Forsgren, Gill, and Murray (1995) showed how to obtain a provably satisfactory direction of negative curvature using a partial Cholesky factorization of H. This algorithm requires partial pivoting. We suggest instead an iterative method. First we consider the case where some type of approximate Cholesky factorization is needed for other purposes. We propose to compute a modified Cholesky factorization, which requires no pivoting. From this, we can always find a direction of negative curvature (if one exists), but it may be very poor. To remedy this we use an iterative method to improve the direction we found initially. The best candidates for this iterative process are Lanczos, Chebyshev iteration, and the Jacobi-Davidson method. We show empirically that in most cases, after a small number of iterations (typically 2-10) a good direction of negative curvature is obtained. Which method is the better depends on the distribution of the eigenvalues of H. One interesting feature of the Jacobi-Davidson method is we can use the modified Cholesky factorization we already have computed as a preconditioner in the inner iteration.

The second scenario is when it is impractical to form the Hessian matrix explicitly. In such cases, we need to use a purely iterative method which only requires matrix-vector products. We start with a random initial vector and iterate using one of the methods described above. Obviously, more iterations are now required to get a good direction since initially the curvature is most likely positive.