Integrating short-term stochastic production planning updating to mining fleet management in industrial mining complexes: An actor-critic reinforcement learning approach



Short-term production planning in mining complexes involves a series of decisions about the activities and processes needed to meet long-term planning targets. As the planning resolution becomes finer, additional operational aspects must be considered together with the production scheduling decisions. An important aspect is that, as the mining complex operates, information is continuously collected across diverse activities. Such data can be used to update the uncertainty models and, consequently, influence subsequent decisions. This paper presents an actor-critic reinforcement learning (RL) method in which two RL agents provide equipment allocation and production scheduling decisions to maximize the operation's profitability. The first agent allocates shovels to mining fronts, while the second defines destination policies for the extracted material and the number of trucks required. The material flow resulting from the RL agents' decisions is simulated by the mining complex model, which changes the state of the mining complex and generates reward values used to evaluate the agents' actions. In addition, sampling stations placed on conveyor belts provide data characterizing the material passing through them. These data update the uncertain orebody models, which in turn changes how the RL agents perceive the state of the mining complex. As a result, when a new decision is requested, it can potentially be made in real time, because the RL agents have experienced many different situations and have learned how to adapt to maximize the collected reward. A case study at a copper mining complex highlights the method's ability to adapt and make informed decisions while collecting new information.
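
To make the actor-critic idea concrete, the following is a minimal tabular sketch and not the method of the paper. It assumes a toy problem invented for illustration: the state is the index of the mining front currently believed to carry the best material (a stand-in for an updated orebody model), the action is the front a single shovel is sent to, and the reward is a noisy profit that is higher when the two match. All names, reward values and hyperparameters (`train_actor_critic`, `alpha_actor`, `gamma`, and so on) are hypothetical.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into selection probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_actor_critic(n_fronts=3, steps=20000, alpha_actor=0.1,
                       alpha_critic=0.1, gamma=0.9, seed=0):
    """One-step actor-critic on a toy shovel-allocation problem (assumed setup)."""
    rng = random.Random(seed)
    # Actor: one preference per (state, action) pair.
    theta = [[0.0] * n_fronts for _ in range(n_fronts)]
    # Critic: one value estimate per state.
    value = [0.0] * n_fronts
    state = rng.randrange(n_fronts)
    for _ in range(steps):
        probs = softmax(theta[state])
        action = rng.choices(range(n_fronts), weights=probs)[0]
        # Assumed reward: noisy profit, higher when the shovel goes to the best front.
        base = 1.0 if action == state else 0.2
        reward = base + rng.gauss(0.0, 0.1)
        # Assumed dynamics: new information reshuffles the best front at random.
        next_state = rng.randrange(n_fronts)
        # The critic's temporal-difference error evaluates the actor's action.
        td_error = reward + gamma * value[next_state] - value[state]
        value[state] += alpha_critic * td_error
        # Policy-gradient step for a softmax policy, scaled by the TD error.
        for a in range(n_fronts):
            grad = (1.0 if a == action else 0.0) - probs[a]
            theta[state][a] += alpha_actor * td_error * grad
        state = next_state
    return theta, value


def greedy_policy(theta):
    """Most preferred action in each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in theta]


if __name__ == "__main__":
    theta, value = train_actor_critic()
    print("greedy allocation per state:", greedy_policy(theta))
```

In this sketch the critic plays the role of the reward-based evaluation described above, while the random change of state loosely mimics how new sampling data alter the agents' view of the mining complex. The paper's two-agent setting, truck requirements and destination policies are not modelled here.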

25 pages
