Groupe d’études et de recherche en analyse des décisions

Sequential Decision Making under Parameter Uncertainty

Shie Mannor, Assistant Professor, Department of Electrical and Computer Engineering, McGill University, Canada

Markov decision processes are an effective tool for modeling decision making in uncertain dynamic environments. The parameters of these models are often estimated from data, learned from experience, or designed by hand. It is therefore not surprising that the actual performance of a chosen strategy often differs significantly from the designer's initial expectations, owing to unavoidable modeling ambiguity. In this talk we address uncertainty in the model parameters and its ramifications for decision making in dynamic environments. We start by highlighting the magnitude of the problem in a real-world, data-intensive decision problem. We then consider a methodological approach that enables the decision maker to take this uncertainty into account. Taking a Bayesian perspective, we consider a percentile optimization approach that lets the decision maker optimize performance at a desired level of risk, measured as a percentile of performance. We show that under some forms of parameter uncertainty the resulting optimization problem can be solved efficiently, while under others it is NP-hard. Finally, we explain how to handle a very high-dimensional state space by using nonparametric statistical tools, such as Gaussian processes, to approximate the value function.
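To make the notion of percentile performance concrete, the following is a minimal sketch and not the method presented in the talk. It only evaluates one fixed policy rather than optimizing over policies. It assumes a hypothetical two-state chain with known rewards, a Dirichlet posterior over each row of the transition matrix (the pseudo-counts in `counts` are made up for illustration), and a discount factor of 0.9. All function names are illustrative.

```python
import random

def sample_dirichlet(counts, rng):
    """Draw one probability vector from Dirichlet(counts) via normalized Gamma draws."""
    draws = [rng.gammavariate(c, 1.0) for c in counts]
    total = sum(draws)
    return [d / total for d in draws]

def policy_value(P, r, gamma, iters=200):
    """Iterative policy evaluation for a fixed policy: v <- r + gamma * P v."""
    n = len(r)
    v = [0.0] * n
    for _ in range(iters):
        v = [r[s] + gamma * sum(P[s][t] * v[t] for t in range(n)) for s in range(n)]
    return v

def percentile_performance(counts, r, gamma, eps, n_samples=500, start=0, seed=0):
    """Monte Carlo estimate of the eps-percentile of the start-state value,
    where the randomness comes from the posterior over transition models."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        # One plausible model: sample each transition row from its posterior.
        P = [sample_dirichlet(row, rng) for row in counts]
        values.append(policy_value(P, r, gamma)[start])
    values.sort()
    idx = min(int(eps * n_samples), n_samples - 1)
    return values[idx]

# Hypothetical posterior pseudo-counts (assumed, not from the talk).
counts = [[8.0, 2.0], [3.0, 7.0]]
rewards = [1.0, 0.0]

if __name__ == "__main__":
    print("10th percentile value:", percentile_performance(counts, rewards, 0.9, 0.10))
    print("median value:", percentile_performance(counts, rewards, 0.9, 0.50))
```

A risk-averse decision maker would compare policies by a low percentile such as the 10th rather than by the value under a single point estimate of the parameters. The gap between the low percentile and the median gives a rough sense of how much the parameter uncertainty matters.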