Groupe d’études et de recherche en analyse des décisions

Adaptive learning solutions for dynamic graphical games

Mohammed Abouheaf, Polytechnique Montréal, Canada

A new class of games, known as dynamic graphical games, has recently been developed, in which the information flow between agents is restricted by a communication graph structure. Cooperative control ideas are used to synchronize the agents’ dynamics with the leader’s dynamics. The agents’ error dynamics are coupled dynamical systems, each driven by the local control input of the agent and those of all its neighbours. This structure arises from the nature of the synchronization problem for dynamic systems on communication graphs. The graphical game is therefore evaluated using a performance index that depends only on the local information available to each agent. Hamiltonian mechanics is employed to derive the necessary conditions for optimality. Furthermore, novel coupled Bellman equations and Hamiltonian functions are derived for the dynamic graphical games. The Nash equilibrium solution for the graphical game is given in terms of the solution to the underlying coupled Hamilton-Jacobi-Bellman equations. Adaptive learning techniques are extended to solve the dynamic graphical games in real time. Adaptive learning, or reinforcement learning, is concerned with learning from interaction in a dynamic environment: the agent picks its actions so as to minimize its cumulative cost. Reinforcement learning techniques include value iteration and policy iteration algorithms. Online model-free policy iteration and value iteration algorithms are developed to learn the Nash equilibrium solution for the dynamic graphical game in real time. Convergence proofs for these adaptive learning techniques are given under a mild assumption about the interconnectivity properties of the graph. Actor-critic neural network structures are used to implement the adaptive learning solution of the graphical game. Actor-critic neural networks are temporal difference methods with separate structures that explicitly represent the policies apart from the value structures. These structures involve forward-in-time algorithms for computing optimal decisions that are implemented online. This work brings together Hamiltonian mechanics, cooperative and optimal control, game theory, and machine learning techniques to solve the dynamic graphical games.
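
To make the idea of "local information" concrete, the following is a minimal sketch of one common way a local neighbourhood tracking error is defined in cooperative control, e_i = sum_j a_ij (x_i - x_j) + g_i (x_i - x_0). This form, the scalar states, the function name, and the three-agent graph are all assumptions made for illustration; the abstract does not specify the error definition used in the talk.

```python
def local_tracking_errors(states, leader_state, adjacency, pinning):
    """Assumed error form: e_i = sum_j a_ij (x_i - x_j) + g_i (x_i - x_0).

    states       : list of scalar agent states x_i (scalars are a simplification)
    leader_state : scalar leader state x_0
    adjacency    : adjacency[i][j] = a_ij > 0 if agent i receives data from agent j
    pinning      : pinning[i] = g_i > 0 if agent i observes the leader directly
    """
    errors = []
    for i, x_i in enumerate(states):
        # Disagreement with the neighbours that agent i can actually hear.
        neighbour_term = sum(
            a_ij * (x_i - x_j) for a_ij, x_j in zip(adjacency[i], states)
        )
        # Disagreement with the leader, only if agent i is pinned to it.
        leader_term = pinning[i] * (x_i - leader_state)
        errors.append(neighbour_term + leader_term)
    return errors


# Hypothetical chain graph: leader -> agent 0 -> agent 1 -> agent 2.
adjacency = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]
pinning = [1, 0, 0]

errors = local_tracking_errors([2.0, 1.0, 4.0], 2.0, adjacency, pinning)
synced = local_tracking_errors([2.0, 2.0, 2.0], 2.0, adjacency, pinning)
```

In this sketch each agent's error uses only its own state and the states of its graph neighbours, and every error vanishes once all agents match the leader, which is the synchronization goal described above.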


Free admission.
Everyone is welcome!