Transformers have now been scaled to vast amounts of static data. This approach has been so successful that it has forced the research community to ask, “What’s next?” This workshop will bring together researchers thinking about questions related to the future of language models beyond the current standard model. The workshop is meant to be exploratory and welcomes novel directions in which new setups may arise, e.g., data efficiency, training paradigms, and architectures.
This workshop is part of the program for the thematic semester on Large Language Models and Transformers, organized in collaboration with the Simons Institute for the Theory of Computing.
Travel grants are available to attend the event in California.
The workshop will also be streamed live online (registration required).
Organizer


Invited Participants
Sanjeev Arora (Princeton University), Kianté Brantley (Harvard University), Danqi Chen (Princeton University), Grigorios Chrysos (University of Wisconsin-Madison), Gintare Karolina Dziugaite (Google DeepMind), Zaid Harchaoui (University of Washington), Elad Hazan (Princeton University), He He (New York University), Andrew Ilyas (Stanford University), Yoon Kim (Massachusetts Institute of Technology), Aviral Kumar (Carnegie Mellon University), Jason Lee (Princeton University), Sewon Min (UC Berkeley), Azalia Mirhoseini (Stanford / DeepMind), Nanyun (Violet) Peng (UCLA), Daniela Rus (MIT), Sasha Rush (Cornell University), Kilian Weinberger (Cornell University), Luke Zettlemoyer (University of Washington), Denny Zhou (Google DeepMind)
Agenda
Monday, Mar. 31st, 2025
9 – 9:30 a.m.: Coffee and Check-In
9:30 – 10:30 a.m.: LLM Reasoning
Denny Zhou (Google DeepMind)
10:30 – 11 a.m.: Break
11 a.m. – 12 p.m.: The Key Ingredients of Optimizing Test-Time Compute and What’s Still Missing
Aviral Kumar (Carnegie Mellon University)
12 – 1:30 p.m.: Lunch (on your own)
1:30 – 2:30 p.m.: OpenThinker: Curating a Reasoning Post-Training Dataset and Training Open Data Reasoning Models
Alex Dimakis (UC Berkeley)
2:30 – 3 p.m.: Break
3 – 4 p.m.: LLM Skills and Meta-Cognition: Scaffolding for New Forms of Learning?
Sanjeev Arora (Princeton University)
4 – 5 p.m.: Reception
Tuesday, Apr. 1st, 2025
9 – 9:30 a.m.: Coffee and Check-In
9:30 – 10:30 a.m.: What Will Transformers Look Like In 2027?
Yoon Kim (Massachusetts Institute of Technology)
10:30 – 11 a.m.: Break
11 a.m. – 12 p.m.: Reducing the Dimension of Language: A Spectral Perspective on Transformers
Elad Hazan (Princeton University)
12 – 1:30 p.m.: Lunch (on your own)
1:30 – 2:30 p.m.: Mixed-modal Language Modeling: Chameleon, Transfusion, and Mixture of Transformers
Luke Zettlemoyer (University of Washington)
2:30 – 3 p.m.: Break
3 – 4 p.m.: Talk by Danqi Chen (Princeton University)
4 – 5 p.m.: Attention to Detail: Fine-Grained Vision-Language Alignment
Kai-Wei Chang (UCLA)
Wednesday, Apr. 2nd, 2025
9 – 9:30 a.m.: Coffee and Check-In
9:30 – 10:30 a.m.: Inference Scaling: A New Frontier for AI Capability
Azalia Mirhoseini (Stanford / DeepMind)
10:30 – 11 a.m.: Break
11 a.m. – 12 p.m.: Talk by Zaid Harchaoui (University of Washington)
12 – 1:30 p.m.: Lunch (on your own)
1:30 – 2:30 p.m.: Talk by Dileep George (Google DeepMind)
2:30 – 3 p.m.: Break
3 – 4 p.m.: Talk by Siva Reddy (IVADO – Mila – McGill University)
Thursday, Apr. 3rd, 2025
9 – 9:30 a.m.: Coffee and Check-In
9:30 – 10:30 a.m.: Advancing Diffusion Models for Text Generation
Kilian Weinberger (Cornell University)
10:30 – 11 a.m.: Break
11 a.m. – 12 p.m.: Controllable and Creative Natural Language Generation
Nanyun (Violet) Peng (UCLA)
12 – 1:30 p.m.: Lunch (on your own)
1:30 – 2:30 p.m.: Transformers Can Learn Compositional Functions
Jason Lee (Princeton University)
2:30 – 3 p.m.: Break
3 – 4 p.m.: Predicting and Optimizing the Behavior of Large ML Models
Andrew Ilyas (Stanford University)
4 – 5 p.m.: Panel Discussion
Friday, Apr. 4th, 2025
9 – 9:30 a.m.: Coffee and Check-In
9:30 – 10:30 a.m.: Towards Sequence-to-Sequence Models Without Activation Functions
Grigorios Chrysos (University of Wisconsin-Madison)
10:30 – 11 a.m.: Break
11 a.m. – 12 p.m.: Efficient Policy Optimization Techniques for LLMs
Kianté Brantley (Harvard University)
12 – 1:30 p.m.: Lunch (on your own)
1:30 – 2:30 p.m.: Talk by Sewon Min (UC Berkeley)
2:30 – 3 p.m.: Break
3 – 4 p.m.: The Future of Language Models: A Perspective on Evaluation
Swabha Swayamdipta (University of Southern California)