Informal Systems Seminar (ISS)

Decision Awareness in Reinforcement Learning


Feb. 10, 2023, 12:00 PM to 1:00 PM

Pierre-Luc Bacon, Université de Montréal, Canada

Presentation on YouTube

Decision awareness is the learning principle according to which the components of a learning system ought to be optimized directly to satisfy the global performance criterion: producing optimal decisions. This end-to-end perspective has recently led to significant advances in model-based reinforcement learning by addressing the problem of compounding errors that plagues alternative approaches. In this talk, I will present some of our recent work on this topic: (1) learning control-oriented transition models by implicit differentiation and (2) learning neural ordinary differential equations end-to-end for nonlinear trajectory optimization. Along the way, we will also discuss some of the computational challenges associated with those methods and our attempts at scaling up performance, specifically using an efficient factorization of the Jacobians in the forward mode of automatic differentiation through novel constrained optimizers inspired by adversarial learning.

Biography: Pierre-Luc Bacon is an assistant professor at the University of Montreal in the Department of Computer Science and Operations Research. He is also a core member of Mila and IVADO and a Facebook CIFAR chair holder. He leads a research group of 15 students working on the challenge posed by the curse of the horizon in reinforcement learning and optimal control.

Peter E. Caines, organizer
Aditya Mahajan, organizer
Shuang Gao, organizer


Montréal, Québec
