The simultaneous stochastic optimization of mining complexes (SSOMC) is a large-scale stochastic combinatorial optimization problem that simultaneously manages the extraction of materials from multiple mines and their processing using interconnected facilities to generate a set of final products, while taking into account material supply (geological) uncertainty to manage the associated risk. Although simulated annealing has been shown to outperform comparing methods for solving the SSOMC, early performance might dominate recent performance in that a combination of the heuristics' performance is used to determine which perturbations to apply. This work proposes a data-driven framework for heuristic scheduling in a fully self-managed hyper-heuristic to solve the SSOMC. The proposed learn-to-perturb (L2P) hyper-heuristic is a multi-neighborhood simulated annealing algorithm. The L2P selects the heuristic (perturbation) to be applied in a self-adaptive manner using reinforcement learning to efficiently explore which local search is best suited for a particular search point. Several state-of-the-art agents have been incorporated into L2P to better adapt the search and guide it towards better solutions. By learning from data describing the performance of the heuristics, a problem-specific ordering of heuristics that collectively finds better solutions faster is obtained. L2P is tested on several real-world mining complexes, with an emphasis on efficiency, robustness, and generalization capacity. Results show a reduction in the number of iterations by 30-50% and in the computational time by 30-45%.
Published February 2022 , 28 pages
G2203.pdf (2 MB)