DS4DM Coffee Talk

Representation-driven Option Discovery in Reinforcement Learning


August 23, 2023, 3:00 PM to 4:00 PM

Marlos C. Machado, University of Alberta, Canada

Presentation on YouTube.

The ability to reason at multiple levels of temporal abstraction is a fundamental aspect of intelligence. In reinforcement learning, this attribute is often modeled through temporally extended courses of action called options. Despite the popularity of options as a research topic, they are seldom included as an explicit component in traditional solutions within the field. In this talk, I will try to explain why this is the case and emphasize the vital role options can play in continual learning. Rather than assuming a predetermined set of options, I will introduce a general framework for option discovery, which uses the agent's representation to discover useful options. By leveraging these options to generate a rich stream of experience, the agent can improve its representations and learn more effectively. This representation-driven option discovery approach creates a virtuous cycle of refinement, continuously improving both the representation and the options, and it is particularly effective for problems that require agents to operate at different levels of abstraction to succeed.


Bio: Marlos is an assistant professor at the University of Alberta. His research interests lie broadly in machine learning, specifically in (deep) reinforcement learning, representation learning, continual learning, and real-world applications of all of the above. He completed his B.Sc. and M.Sc. at UFMG, Brazil, and his Ph.D. at the University of Alberta. During his Ph.D., among other things, he popularized the idea of temporally extended exploration through options, introducing the idea of eigenoptions. He was a researcher at DeepMind and at Google Brain for four years, during which time he made several contributions to reinforcement learning, including the application of deep reinforcement learning to control Loon's stratospheric balloons.

Federico Bobbio, organizer
Defeng Liu, organizer

Location

Hybrid activity at GERAD
Zoom and room 4488
Pavillon André-Aisenstadt
Université de Montréal campus
2920, chemin de la Tour

Montréal, Québec H3T 1J4
Canada
