Develop robust, capable agents that can interact with others, avoid exploitation, and find pro-social solutions.
Vision
Envisioning a future in which the internet is populated by large language model (LLM)-based agents and roads are filled with autonomous vehicles requires careful consideration of how these agents will interact and of the systems in which those interactions unfold. The vision of Regroupement 2 (R2) is therefore to address and explore the challenges that emerge in multi-agent interactions, contributing to their safe and ethical development.
Objectives
- Leverage LLMs to develop and deploy AI agents that make decisions and take actions on behalf of individuals and companies.
- Ensure the robustness and reliability of LLM-based agents for successful deployment.
- Improve the ability of LLM-based agents to interact effectively with humans and other agents.
Research Axes
Axis 1: Foundations for Multi-agent AI systems
Contribute to the fundamental topics of reinforcement learning, world-model building, and game theory to improve AI agent interactions. A particular focus is building better world models by exploring causal reasoning, improving uncertainty estimation, and using LLMs and other foundation models, such as vision-language models (VLMs), as interpreters with common-sense reasoning abilities.
Axis 2: Advancing multi-agent cooperation through better learning algorithms
This axis explores the development of multi-agent reinforcement learning (RL) algorithms, focusing on general-sum settings where agents are neither purely competitive nor purely cooperative. In these settings, game-theoretic considerations become critical to the evolution of the interaction dynamics. The focus will be on developing novel opponent-shaping methods that take into account other agents and their interests when learning how to interact.
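As a toy illustration of the general-sum setting and of opponent shaping (not R2's actual methods), the sketch below compares naive independent gradient learners with a first-order LOLA-style shaped update in a one-shot prisoner's dilemma, where each agent's single parameter is its probability of cooperating. The payoff matrix, learning rate, and the form of the correction term are illustrative assumptions.

```python
import numpy as np

# One-shot prisoner's dilemma: rows/columns are (cooperate, defect).
A = np.array([[3.0, 0.0], [5.0, 1.0]])  # payoff to player 1
B = A.T                                 # symmetric game: payoff to player 2

def value(p, q, M):
    """Expected payoff x^T M y with x = (p, 1-p), y = (q, 1-q)."""
    return np.array([p, 1 - p]) @ M @ np.array([q, 1 - q])

def grads(p, q):
    """Analytic gradients of the bilinear values V1 = x^T A y, V2 = x^T B y."""
    x, y = np.array([p, 1 - p]), np.array([q, 1 - q])
    dV1_dp = (A[0] - A[1]) @ y
    dV1_dq = x @ (A[:, 0] - A[:, 1])
    dV2_dp = (B[0] - B[1]) @ y
    dV2_dq = x @ (B[:, 0] - B[:, 1])
    # Cross-derivatives of a bilinear form are constants.
    d2V1_dpdq = A[0, 0] - A[0, 1] - A[1, 0] + A[1, 1]
    d2V2_dpdq = B[0, 0] - B[0, 1] - B[1, 0] + B[1, 1]
    return dV1_dp, dV1_dq, dV2_dp, dV2_dq, d2V1_dpdq, d2V2_dpdq

def train(shaped, steps=200, lr=0.05):
    p = q = 0.5  # initial probability of cooperating, for each player
    for _ in range(steps):
        dV1_dp, dV1_dq, dV2_dp, dV2_dq, d2V1_dpdq, d2V2_dpdq = grads(p, q)
        if shaped:
            # LOLA-style correction: differentiate one's own value through
            # the opponent's anticipated naive gradient step.
            dp = dV1_dp + lr * dV1_dq * d2V2_dpdq
            dq = dV2_dq + lr * dV2_dp * d2V1_dpdq
        else:
            dp, dq = dV1_dp, dV2_dq  # naive independent gradient ascent
        p = float(np.clip(p + lr * dp, 0.0, 1.0))
        q = float(np.clip(q + lr * dq, 0.0, 1.0))
    return p, q

p_naive, q_naive = train(shaped=False)
p_lola, q_lola = train(shaped=True)
```

In this one-shot game both variants converge to mutual defection, since defection is a dominant strategy; the benefits of opponent shaping appear in richer settings, such as the iterated game with memory-based policies, which are beyond this sketch.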
Axis 3: Multi-agent communication and LLMs
This axis examines the development and learning of multi-agent communication strategies and their role in agent coordination and manipulation. It focuses on both (1) fundamental notions of communication among agents and (2) the use of natural language by LLM-endowed agents in their interactions.
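A minimal, self-contained illustration of learned communication (not a method proposed by this axis) is the Lewis signaling game: a sender observes a state and emits a message, a receiver sees only the message and must act, and both are rewarded when the action matches the state. The sketch below trains both agents with simple tabular Q-learning; all parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 3                      # number of states = messages = actions
Qs = np.zeros((N, N))      # sender values: Q[state, message]
Qr = np.zeros((N, N))      # receiver values: Q[message, action]
lr, eps = 0.5, 0.1

def act(Q, row):
    """Epsilon-greedy choice over the options in Q[row]."""
    if rng.random() < eps:
        return int(rng.integers(N))
    return int(np.argmax(Q[row]))

for _ in range(20000):
    s = int(rng.integers(N))          # nature draws a state
    m = act(Qs, s)                    # sender signals
    a = act(Qr, m)                    # receiver acts on the message alone
    r = 1.0 if a == s else 0.0        # shared reward: communication success
    Qs[s, m] += lr * (r - Qs[s, m])   # one-step bandit-style updates
    Qr[m, a] += lr * (r - Qr[m, a])

# Evaluate the greedy sender/receiver policies on every state.
success = float(np.mean(
    [np.argmax(Qr[np.argmax(Qs[s])]) == s for s in range(N)]
))
```

A shared reward and repeated play typically drive the pair toward a separating convention in which each state gets its own message, even though the meanings of the messages are arbitrary at the start.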
Axis 4: Multi-agent world modeling
This axis explores the potential advantages of endowing agents with the ability to explicitly model their environment, including the beliefs and intentions (i.e., a theory of mind) of other agents coexisting within it.
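One simple way to make "modeling other agents' intentions" concrete is Bayesian goal inference: maintain a posterior over another agent's goal and update it from observed actions, assuming the other agent acts noisily rationally (softmax over goal-dependent utilities). The goals, utilities, and rationality parameter below are illustrative assumptions, not part of the axis description.

```python
import numpy as np

GOALS = ["left", "middle", "right"]
ACTIONS = ["go_left", "stay", "go_right"]

# Illustrative utility U[goal, action]: each goal favors one action.
U = np.array([
    [2.0, 0.0, -2.0],   # goal "left"
    [0.0, 2.0, 0.0],    # goal "middle"
    [-2.0, 0.0, 2.0],   # goal "right"
])
beta = 2.0  # rationality: higher means more deterministic behavior

def action_likelihood(goal_idx):
    """Softmax (Boltzmann) action distribution under a given goal."""
    z = np.exp(beta * U[goal_idx])
    return z / z.sum()

def update_belief(prior, action_idx):
    """One Bayes step: P(goal | action) proportional to P(action | goal) P(goal)."""
    lik = np.array([action_likelihood(g)[action_idx]
                    for g in range(len(GOALS))])
    post = lik * prior
    return post / post.sum()

belief = np.ones(len(GOALS)) / len(GOALS)     # uniform prior over goals
for obs in ["go_right", "go_right", "stay"]:  # observed behavior
    belief = update_belief(belief, ACTIONS.index(obs))
```

After observing mostly rightward movement, the posterior concentrates on the "right" goal; the same update rule scales to sequential settings where the observed agent's policy comes from planning in a shared world model.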
Challenges
Uncertainty and generalizability: One grand challenge in creating useful real-world agents is constructing them so that they can solve a breadth of tasks when given unstructured commands. This challenge hinges on the generalizability of the trained agent.
Adaptability to different environments: Our application areas are diverse, spanning healthcare, drug discovery, HVAC control, autonomous driving, and LLM-based web agents interacting with people; agents must be capable of interacting effectively across all of these complex environments.
Social and societal implications: Designing mechanisms that encourage agents to balance self-interest with social good.
Anticipated Impact
- Have a long-term impact on the foundations for interacting with AI agents.
- Develop generalist learning and planning systems that can rapidly assist people with a wide range of tasks.
- Improve multi-agent interactions in scenarios where agents have conflicting goals.
- Foster collaboration and scientific reproducibility by publicly releasing our code and research.
- Facilitate technology transfer to industry.
Researchers
- Tal Arbel – McGill University
- Pierre-Luc Bacon – Université de Montréal
- Yoshua Bengio – Université de Montréal
- Glen Berseth – Université de Montréal
- Audrey Durand – Université Laval
- Christian Gagné – Université Laval
- Foutse Khomh – Polytechnique Montréal
- Simon Lacoste-Julien – Université de Montréal
- Guillaume Lajoie – Université de Montréal
- Chris Pal – Polytechnique Montréal
- Courtney Paquette – McGill University
- Oiwi Parker Jones – Jesus College, Oxford
- Siva Reddy – McGill University
- Dhanya Sridhar – Université de Montréal
- David A. Stephens – McGill University
Research Advisor
Jacqueline Sanchez: jacqueline.sanchez@ivado.ca