The Future of Language Models and Transformers

Transformers have been fitted to massive amounts of static data. This approach has been so successful that it has pushed the research community to ask what comes next. This workshop will bring together researchers exploring questions about the future of language models beyond the current standard model. The workshop is meant to be exploratory and to welcome new approaches in areas where new paradigms could emerge, such as data efficiency, training paradigms, and architectures.

*This workshop will be held exclusively in English.

This activity is part of the thematic semester "Large Language Models and Transformers", organized in collaboration with the Simons Institute for the Theory of Computing.

Travel grants are available to attend the event in California.

The workshops will also be streamed live online (registration required).

Scientific co-organizers

Sasha Rush (Cornell University; chair)

Swabha Swayamdipta (University of Southern California (USC))

Invited participants

Sanjeev Arora (Princeton University), Kianté Brantley (Harvard University), Danqi Chen (Princeton University), Grigorios Chrysos (University of Wisconsin-Madison), Gintare Karolina Dziugaite (Google DeepMind), Zaid Harchaoui (University of Washington), Elad Hazan (Princeton University), He He (New York University), Andrew Ilyas (Stanford University), Yoon Kim (Massachusetts Institute of Technology), Aviral Kumar (Carnegie Mellon University), Jason Lee (Princeton University), Sewon Min (UC Berkeley), Azalia Mirhoseini (Stanford / DeepMind), Nanyun (Violet) Peng (UCLA), Daniela Rus (MIT), Sasha Rush (Cornell University), Kilian Weinberger (Cornell University), Luke Zettlemoyer (University of Washington), Denny Zhou (Google DeepMind)

Event program

Monday, March 31, 2025

9:00 – 9:30 : Welcome and coffee
9:30 – 10:30 : LLM Reasoning
    Denny Zhou (Google DeepMind)
10:30 – 11:00 : Break
11:00 – 12:00 : The Key Ingredients of Optimizing Test-Time Compute and What’s Still Missing
    Aviral Kumar (Carnegie Mellon University)
12:00 – 13:30 : Lunch (not provided)
13:30 – 14:30 : Openthinker: Curating a Reasoning Post-Training Dataset and Training Open Data Reasoning Models
    Alex Dimakis (UC Berkeley)
14:30 – 15:00 : Break
15:00 – 16:00 : LLM Skills and Meta-Cognition: Scaffolding for New Forms of Learning?
    Sanjeev Arora (Princeton University)
16:00 – 17:00 : Reception

Tuesday, April 1, 2025

9:00 – 9:30 : Welcome and coffee
9:30 – 10:30 : What Will Transformers Look Like In 2027?
    Yoon Kim (Massachusetts Institute of Technology)
10:30 – 11:00 : Break
11:00 – 12:00 : Reducing the Dimension of Language: A Spectral Perspective on Transformers
    Elad Hazan (Princeton University)
12:00 – 13:30 : Lunch (not provided)
13:30 – 14:30 : Mixed-Modal Language Modeling: Chameleon, Transfusion, and Mixture of Transformers
    Luke Zettlemoyer (University of Washington)
14:30 – 15:00 : Break
15:00 – 16:00 : Talk by
    Danqi Chen (Princeton University)
16:00 – 17:00 : Attention to Detail: Fine-Grained Vision-Language Alignment
    Kai-Wei Chang (UCLA)

Wednesday, April 2, 2025

9:00 – 9:30 : Welcome and coffee
9:30 – 10:30 : Inference Scaling: A New Frontier for AI Capability
    Azalia Mirhoseini (Stanford / DeepMind)
10:30 – 11:00 : Break
11:00 – 12:00 : Talk by
    Zaid Harchaoui (University of Washington)
12:00 – 13:30 : Lunch (not provided)
13:30 – 14:30 : Talk by
    Dileep George (Google DeepMind)
14:30 – 15:00 : Break
15:00 – 16:00 : Talk by
    Siva Reddy (IVADO – Mila – McGill University)

Thursday, April 3, 2025

9:00 – 9:30 : Welcome and coffee
9:30 – 10:30 : Advancing Diffusion Models for Text Generation
    Kilian Weinberger (Cornell University)
10:30 – 11:00 : Break
11:00 – 12:00 : Controllable and Creative Natural Language Generation
    Nanyun (Violet) Peng (UCLA)
12:00 – 13:30 : Lunch (not provided)
13:30 – 14:30 : Transformers Can Learn Compositional Function
    Jason Lee (Princeton University)
14:30 – 15:00 : Break
15:00 – 16:00 : Predicting and Optimizing the Behavior of Large ML Models
    Andrew Ilyas (Stanford University)
16:00 – 17:00 : Discussion Panel

Friday, April 4, 2025

9:00 – 9:30 : Welcome and coffee
9:30 – 10:30 : Towards Sequence-to-Sequence Models without Activation Functions
    Grigorios Chrysos (University of Wisconsin-Madison)
10:30 – 11:00 : Break
11:00 – 12:00 : Efficient Policy Optimization Techniques for LLMs
    Kianté Brantley (Harvard University)
12:00 – 13:30 : Lunch (not provided)
13:30 – 14:30 : Talk by
    Sewon Min (UC Berkeley)
14:30 – 15:00 : Break
15:00 – 16:00 : The Future of Language Models: A Perspective on Evaluation
    Swabha Swayamdipta (University of Southern California)
