Groupe d’études et de recherche en analyse des décisions


Generalized Pattern Searches with Derivative Information

, et

A common question asked by users of direct search algorithms is how to use derivative information at iterates where it is available. This paper addresses that question with respect to Generalized Pattern Search (GPS) methods for unconstrained and linearly constrained optimization. The major effect is in the expensive poll step of GPS. Polling is done to certify the need to refine the current mesh, and it requires O(n) function evaluations in the worst case. We show that the use of gradients significantly reduces the maximum number of function evaluations necessary for poll steps, even to a worst case of a single function evaluation with certain algorithmic choices. Furthermore, we show that rather rough approximations to the gradient are sufficient to reduce the poll step to a single function evaluation. We prove that using these less expensive poll steps does not weaken the provable convergence properties of the method. Conventional wisdom regarding gradient-based methods would cause us to suspect that the use of derivatives may lead GPS to a different solution with a higher objective function value than non derivative GPS from the same starting point. Results using the cute test set do not support this general conclusion for GPS, even with an empty search. Using the same underlying set of directions, sometimes the version with gradients finds a better solution, and sometimes the derivative free version does. The derivative and non derivative versions do frequently take different paths or find different solutions, and there is sometimes a degradation in the quality of the solutions when rough gradient information is used. In practice, a nonempty search would probably alleviate even this effect, and we give a simple example to illustrate this point.

, 28 pages