Groupe d’études et de recherche en analyse des décisions

Exploiting stochastic factorization for efficiently solving Markov decision processes

Doina Precup, School of Computer Science, Université McGill, Canada

When a transition probability matrix is represented as the product of two stochastic matrices, one can swap the factors of the multiplication to obtain another transition matrix that retains some fundamental characteristics of the original. Since the derived matrix can be much smaller than its precursor, this property can be exploited when solving Markov decision processes (MDPs). I will describe how this property can be used to compute approximate solutions for MDPs much faster than with classical methods. For example, an approximate policy iteration algorithm based on stochastic factorization depends linearly on the number of states in the model. I will also briefly discuss a learning algorithm based on this trick, and its relationship to other types of matrix factorization, which we are beginning to uncover. This is joint work with André M. S. Barreto and Joelle Pineau.
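
As a minimal sketch of the factor swap described above (not the speaker's implementation), the following pure-Python example factors a 4-state transition matrix P = DK and swaps the factors to get a 2-state matrix KD. The matrices D and K, their values, and all helper names are hypothetical choices made for illustration. The stationary-distribution check shows one property that carries over between the two matrices; the abstract does not say which characteristics the talk has in mind.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def is_stochastic(m, tol=1e-9):
    """True if every row is non-negative and sums to 1."""
    return all(min(row) >= 0 and abs(sum(row) - 1.0) < tol for row in m)

def stationary(p, iters=1000):
    """Approximate the stationary distribution by power iteration.
    Assumes the chain is ergodic, which holds for the example below."""
    pi = [[1.0 / len(p)] * len(p)]
    for _ in range(iters):
        pi = matmul(pi, p)
    return pi[0]

# Hypothetical factors: D is 4x2 and K is 2x4, both row-stochastic.
D = [[0.9, 0.1],
     [0.5, 0.5],
     [0.2, 0.8],
     [0.0, 1.0]]
K = [[0.4, 0.3, 0.2, 0.1],
     [0.1, 0.2, 0.3, 0.4]]

P = matmul(D, K)      # original 4x4 transition matrix
P_bar = matmul(K, D)  # swapped factors: a 2x2 transition matrix

# Both products are valid transition matrices.
assert is_stochastic(P) and is_stochastic(P_bar)

# If pi_bar is stationary for K D, then pi_bar K is stationary for D K,
# because (pi_bar K) D K = (pi_bar K D) K = pi_bar K.
pi_bar = stationary(P_bar)
mu = matmul([pi_bar], K)[0]
pi = stationary(P)
assert all(abs(x - y) < 1e-9 for x, y in zip(mu, pi))
```

In this example the stationary distribution of the 4-state chain is recovered from a 2-state computation plus one multiplication by K, which illustrates why working with the smaller matrix can be cheaper.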

Free admission.
Everyone is welcome.