New digital technologies, including advanced sensors and monitoring devices, enable a mining complex to acquire new information about the performance of its different components. The short-term production plan, which determines the extraction sequence, destination policies, fleet assignment and allocation, and processing stream utilization, depends heavily on the performance of, and interaction between, these components. A minor change in the performance of any component can significantly change the processing capabilities and, consequently, the net revenue of a mining complex. Existing technologies can integrate conventionally collected information, but they cannot use the new incoming information to adapt the short-term production plan. This paper presents a new continuous updating framework, based on policy gradient reinforcement learning and the ensemble Kalman filter, that adapts the short-term production plan, specifically the destination policies of material in a mining complex, as new information arrives. The framework first uses the ensemble Kalman filter to update the uncertainty models of the different components of a mining complex with new information. The updated uncertainty models are then used in a neural-network-based policy gradient reinforcement learning algorithm to adapt the short-term destination policies of material. An application to a copper-gold mining complex shows the framework's ability to adapt the short-term destination policies of material with new information. The framework better meets the different production targets while improving the cumulative cash flow compared to the industry-standard fixed cut-off grade policy.
Published October 2018, 28 pages
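
The first step of the framework, updating an uncertainty model with new information through an ensemble Kalman filter, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a single scalar quantity (for example, the copper grade of one block), a sensor that observes that quantity directly with Gaussian error, and the perturbed-observation variant of the filter. The function name `enkf_update` and all numbers are hypothetical and chosen only for illustration.

```python
import random
import statistics


def enkf_update(ensemble, observation, obs_variance, rng):
    """Perturbed-observation EnKF analysis step for a scalar state.

    Assumption: the sensor measures the state directly (identity
    observation operator) with Gaussian error of variance obs_variance.
    """
    # Prior uncertainty is estimated from the spread of the ensemble.
    prior_variance = statistics.variance(ensemble)
    # Kalman gain: how much weight the new observation receives.
    gain = prior_variance / (prior_variance + obs_variance)
    obs_sd = obs_variance ** 0.5

    updated = []
    for member in ensemble:
        # Each member is updated against its own noisy copy of the
        # observation, which keeps the posterior spread realistic.
        perturbed_obs = observation + rng.gauss(0.0, obs_sd)
        updated.append(member + gain * (perturbed_obs - member))
    return updated


# Illustrative numbers only: 200 simulated grades, one sensor reading.
rng = random.Random(0)
prior = [rng.gauss(0.5, 0.2) for _ in range(200)]
posterior = enkf_update(prior, observation=0.8, obs_variance=0.01, rng=rng)

print("prior mean:", statistics.mean(prior))
print("posterior mean:", statistics.mean(posterior))
print("prior variance:", statistics.variance(prior))
print("posterior variance:", statistics.variance(posterior))
```

In this sketch the updated ensemble shifts toward the sensor reading and its spread narrows. An updated ensemble of this kind is what the abstract describes as the input to the policy gradient step that adapts the destination policies.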