Mar 3-5, 2025

Workshop on Theoretical Perspectives on LLMs 

An EnCORE Institute Workshop

The EnCORE Workshop on Theoretical Perspectives on Large Language Models (LLMs) explores the foundational theories and frameworks underlying the architecture, learning mechanisms, and capabilities of LLMs. The workshop brings together researchers to discuss recent advances, theoretical challenges, and emerging concepts in understanding and predicting LLM behavior, efficiency, generalization, and alignment with human intent. Topics include mathematical modeling, interpretability, limitations, and new theoretical tools for deepening insight into LLM capabilities and constraints. Sponsored by the NSF and Google AI.

Confirmed Participants

Josh Alman

Columbia University

Ahmad Beirami

Google DeepMind

Misha Belkin

University of California San Diego

Xiang Cheng 

University of California, Berkeley

Tatsunori Hashimoto 

Stanford University

Hamed Hassani 

University of Pennsylvania

Daniel Hsu 

Columbia University

Adel Javanmard 

University of Southern California

Kangwook Lee 

University of Wisconsin-Madison

Zhiyuan Li 

Toyota Technological Institute at Chicago (TTIC)

Tengyu Ma 

Stanford University

Ankur Moitra 

Massachusetts Institute of Technology

Preetum Nakkiran 

Apple

Omer Reingold 

Stanford University

Atri Rudra 

University at Buffalo

Anant Sahai

University of California, Berkeley

Sujay Sanghavi

University of Texas at Austin

Vatsal Sharan 

University of Southern California

Christos Thrampoulidis 

University of British Columbia

Yu-Xiang Wang 

University of California San Diego

Schedule (Mar 3-5, 2025)


All times are in Pacific Time (GMT-8). Location: EnCORE Institute, Atkinson Hall, 4th Floor


Mar 3


7:30-8:15 am Breakfast in Hotel


8:45 am  Opening Remarks 


Morning Session: Chair: Sanjoy Dasgupta


9:00 am Kangwook Lee: Beyond Decoder-Only Next Token Prediction


9:45 am Ankur Moitra: Model Stealing for Low Rank Language Models


10:30 - 10:45 am Break


10:45 am Sujay Sanghavi: Mitigating catastrophic forgetting in the data-oblivious setting


11:30 am Tatsunori Hashimoto: Statistical perspectives on LLM pretraining data


12:00 - 2:00 pm Lunch


Afternoon Session: Chair: David Woodruff


2:00 pm Daniel Hsu: Transformers, parallelism, and the role of depth


2:45 pm Preetum Nakkiran: What Algorithms can Transformers Learn? A Study in Length Generalization


3:30 pm Hamed Hassani: How to Optimally Quantify Uncertainty for Risk-Averse Agents?


4:00 - 4:15 pm Break


Student Lightning Talks: Chair: Rina Panigrahy


4:15 - 5:00 pm 




Mar 4


7:30-8:30 am Breakfast in Hotel


Morning Session: Chair: Rina Panigrahy


9:00 am Atri Rudra: An Arithmetic Circuit Lens on Deep Learning Architectures


9:45 am Vatsal Sharan: Using Algorithms to Understand Transformers (and Using Transformers to Understand Algorithms)


10:30 - 10:45 am Break


10:45 am Rajesh Jayaram: Multi-Vector Representations and Embedding-Based Nearest Neighbor Search


11:30 am Christos Thrampoulidis: Implicit Geometry of Next-token Prediction


12:00 - 2:00 pm Lunch


Afternoon Session: Chair: David Woodruff


2:00 pm Josh Alman: Fine-Grained Complexity and the Pursuit of Fast Attention


2:45 pm Tengyu Ma: STP: Self-play LLM Theorem Provers with Iterative Conjecturing and Proving


3:30 pm Xiang Cheng: Graph Transformers Dream of Electric Flow


4:00 - 6:00 pm Hiking/Excursions



Mar 5


7:30-8:30 am Breakfast in Hotel


Morning Session: Chair: Sanjoy Dasgupta


9:00 am Misha Belkin: The linear representation hypothesis for controlling and steering LLMs


9:45 am Anant Sahai: A Toy Model For Asymptotic Weak to Strong Generalization Leveraging Benign-Overfitting/Harmless-Interpolation Ideas


10:30 - 11:00 am Break


11:00 am Yu-Xiang Wang: Flatness, Sparsity and Generalization by Large Learning Rate


11:30 am Zhiyuan Li: Weak-to-Strong Generalization Even in Random Feature Networks, Provably


12:00 - 2:00 pm Lunch


Afternoon Session: Chair: Arya Mazumdar


2:00 pm Ankit Singh Rawat: A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs


2:30 pm Adel Javanmard: DeepCrossAttention: Supercharging Transformer Residual Connections


3:00 pm Ahmad Beirami: Language Model Alignment: Theory & Practice


3:30 - 3:45 pm Break


3:45 pm Open Problems Session: Chair: Arya Mazumdar






Logistics

Location

We are hosted by the Institute for Emerging CORE Methods in Data Science (EnCORE) at the University of California San Diego, on the 4th floor of Atkinson Hall. The address is:

3235 Voigt Dr, La Jolla, CA 92093

The recommended rideshare drop-off location is Parking Lot P503. Please do not park in this lot, as you will likely be ticketed and/or towed.

Parking

UCSD offers both metered spaces and permit-only parking lots. Parking can be limited, so we encourage the use of public transportation when possible.

The Hopkins Parking Structure is located at 10100 Hopkins Dr., La Jolla, CA 92093. Please allow 20-30 minutes to park and walk to the event venue.

Lodging

Attendees can book a nearby Airbnb; the closest hotels are:

Organizers

Sanjoy Dasgupta 

University of California San Diego

Arya Mazumdar

University of California San Diego

Rina Panigrahy

Google Research

Barna Saha

University of California San Diego

David Woodruff

Carnegie Mellon University