Schedule
This hybrid workshop will be held on Saturday, May 11th, 2024.
All times are in local Vienna time, which is UTC+2.
08:50--09:00 Opening Remarks
09:00--09:40 Invited Talk (In-person): Sasha Rush
Title: SSMs Change The Foundation Model Design Space
Abstract: Recent work has shown that linear RNN models such as SSMs exhibit scaling properties competitive with Transformers across several modalities. Much of the conversation has focused on whether these models can "beat" Transformers on a variety of tasks. Instead, in this talk we consider two applications where SSM architectures open up new model design possibilities. We first give a quick tutorial on the architecture, focusing on its scaling. Next, we discuss the implications of these results, focusing on building efficient byte-level language models. Finally, we discuss SSMs for non-language foundation models using long-range linearization in image generation.
09:40--10:20 Invited Talk (In-person): Yuandong Tian
Title: Demystifying Self-Attention Mechanism in Feature Composition and Logic Reasoning via the Lens of Training Dynamics
Abstract: Large Language Models (LLMs) have demonstrated remarkable efficacy across diverse applications. Their core design, the multi-layer Transformer and the self-attention mechanism, is believed to play a pivotal role based on empirical evidence, but it remains a mystery why and how they work to find better representations. In this talk, I will introduce our analysis of the training dynamics of the self-attention and project-in MLP layers in a mathematically rigorous manner. This analysis provides hypotheses on why attention becomes sparse and how tokens can be combined automatically to learn the latent hierarchy in the data. Furthermore, we also explain, from the training dynamics perspective, why the reversal curse (i.e., learning "A->B" does not lead to "B->A") happens, and why Chain-of-Thought (CoT) is needed when applying Transformers to logical reasoning tasks. Insights such as contextual sparsity and the low-rankness of gradients lead to novel approaches for more efficient pre-training and fine-tuning of LLMs, such as Deja Vu, H2O, StreamingLLM and GaLore.
10:20--11:00 Invited Talk (In-person): Hannaneh Hajishirzi
Title: OLMo: Accelerating the Science of Language Modeling
Abstract: Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed. Given the importance of these details in scientifically studying these models, including their biases and potential risks, we believe it is essential for the research community to have access to powerful, truly open LMs. To this end, this talk details the first release of OLMo, a state-of-the-art, truly Open Language Model and its framework to build and study the science of language modeling. Unlike most prior efforts that have only released model weights and inference code, we release OLMo and the whole framework, including training data and training and evaluation code. We hope OLMo will empower and strengthen the open research community and inspire a new wave of innovation.
11:00--12:00 Spotlight Talks (In-person, 5 min talk + 3 min Q&A)
Prompting a Pretrained Transformer Can Be a Universal Approximator by Aleksandar Petrov, Adel Bibi, Philip Torr
Uncovering Mesa-Optimization Algorithms in Transformers by Johannes Von Oswald, Eyvind Niklasson, Maximilian Schlegel, Alexander Meulemans, Seijin Kobayashi, Nicolas Zucchet, Nino Scherrer, Nolan Andrew Miller, Mark Sandler, Blaise Aguera y Arcas, Max Vladymyrov, Razvan Pascanu, Joao Sacramento
Massive Activations in Large Language Models by Mingjie Sun, Xinlei Chen, J Zico Kolter, Zhuang Liu
Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Representation Learning by Simon Schrodi, David T Hoffmann, Max Argus, Volker Fischer, Thomas Brox
Selecting Large Language Model to Fine-tune via Rectified Scaling Law by Haowei Lin, Baizhou Huang, Haotian Ye, Qinyu Chen, Zihao Wang, Sujian Li, Jianzhu Ma, Xiaojun Wan, James Zou, Yitao Liang
Scaling Laws for Fine-Grained Mixture of Experts by Jan Ludziejewski, Jakub Krajewski, Kamil Adamczewski, Maciej Pióro, Michał Krutul, Szymon Antoniak, Kamil Ciebiera, Krystian Król, Tomasz Odrzygóźdź, Piotr Sankowski, Marek Cygan, Sebastian Jaszczur
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint by Wei Xiong, Hanze Dong, Chenlu Ye, Ziqi Wang, Han Zhong, Heng Ji, Nan Jiang, Tong Zhang
12:00--13:00 Lunch Break
13:00--14:00 Poster Session
14:00--15:00 Panel Discussion (In-person)
With Sasha Rush, Yuandong Tian, Hannaneh Hajishirzi, Jacob Steinhardt, and Aditi Raghunathan (moderator)
15:00--15:40 Invited Talk (In-person): Jacob Steinhardt
Title: Interpretability via Decomposition
Abstract: Understanding the latent representations of models can help us to anticipate and correct model failures, and to elicit capabilities that are not already present in the outputs. However, to unlock this potential, we need scalable methods for interpreting the representations of billion-parameter models. In this talk, I'll describe an approach for interpreting complex models by mathematically decomposing them into simpler components, focusing on the CLIP image encoder. We decompose CLIP's image representation as a sum across individual image patches, model layers, and attention heads / neurons, and use CLIP's text representation to interpret the summands. Interpreting the attention heads, we characterize each head's role by automatically finding text representations that span its output space, which reveals property-specific roles for many heads (e.g. location or shape). Interpreting the MLP layers, we find both superposition and sparsity: each neuron responds to a small number of distinct concepts. Using our understanding, we produce a state-of-the-art zero-shot image segmenter, remove spurious cues, and mass-produce "semantic" adversarial examples. Our results indicate that a scalable understanding of transformer models is attainable, and that this understanding can be grounded in auditing, repairing, and improving models. Joint work with Yossi Gandelsman and Alyosha Efros.
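As a reader's aid (not part of the talk materials), the minimal sketch below illustrates the decomposition idea described in the abstract: if an image representation can be written as a sum of per-attention-head contributions, each summand can be interpreted by matching it against text embeddings of candidate property descriptions. All names, shapes, and the random data here are illustrative assumptions, not the speakers' code or CLIP's actual API.

```python
# Illustrative sketch only: decompose an image embedding into per-head
# contributions and label each head by its best-matching text description.
# Every quantity below is a hypothetical placeholder (random data).
import numpy as np

rng = np.random.default_rng(0)
d = 512                       # embedding dimension (assumed)
num_layers, num_heads = 4, 8  # hypothetical model size

# Hypothetical per-head contributions to the pooled image embedding;
# the full representation is their sum plus a residual term.
head_contrib = rng.normal(size=(num_layers, num_heads, d))
residual = rng.normal(size=d)
image_embedding = head_contrib.sum(axis=(0, 1)) + residual

# Hypothetical text embeddings for candidate property descriptions.
descriptions = ["a photo taken indoors", "a round object", "a photo of Paris"]
text_embeddings = rng.normal(size=(len(descriptions), d))
text_embeddings /= np.linalg.norm(text_embeddings, axis=1, keepdims=True)

# Interpret each head via the description whose text embedding aligns best
# with that head's contribution (cosine similarity).
for layer in range(num_layers):
    for head in range(num_heads):
        c = head_contrib[layer, head]
        sims = text_embeddings @ (c / np.linalg.norm(c))
        best = descriptions[int(np.argmax(sims))]
        print(f"layer {layer}, head {head}: best-matching description -> {best}")
```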
15:40--16:20 Invited Talk (Virtual): Amir Globerson
Title: Understanding Complex Processing in Temporal Models - From Implicit Biases to Mechanistic Interpretability
Abstract: Recent temporal models such as transformers and RNN variants can effectively capture complex structure in text. However, it remains largely unknown how this is achieved. The talk will discuss our work on this problem. First, I will discuss work demonstrating implicit biases of RNNs, showing that they have a bias towards learning "simple" rules that correspond to dynamical systems with a low-dimensional state. I will also discuss recent work on implicit bias in RL. I will then present several works on understanding how transformers solve complex problems. First, I will discuss work that dissects how transformers extract information about the world (e.g., "Paris is the capital of France"), highlighting several information processing streams that underlie this process. Finally, I will discuss our analysis of how transformers achieve in-context learning, and the internal hypothesis space used in this process.