Moderator
Professor \\
Max Planck Institute for Biological Cybernetics
Speaker
Simons Postdoctoral Fellow\\
MIT
Title: Building a causal model of other minds
Abstract: Model-free reinforcement learning (RL) can capture adaptation across many different species and domains, offering a biologically plausible mechanism for incremental learning, so a natural question is whether RL can capture human social learning dynamics. In this talk, I’ll show that while RL is a critical component of social learning and is robustly reflected in ventral striatum signals during social interaction, variability in performance across social vs. nonsocial tasks, and across individuals, does not arise from standard RL parameters. Rather, learning differences arise from the kinds of causal attributions people make, which relate to differences in credit assignment. I then discuss Bayesian theory of mind (ToM) models, which make explicit computational assumptions about the architecture of other minds by linking observed behavior to an expressible theory of why someone behaved the way they did.
Link to website: https://www.amritalamba.com/
Speaker
Assistant Professor \\
University of Maryland, College Park
Title: Neural computations underlying arbitration between observational and experiential learning
Abstract: Navigating our complex social world requires integrating multiple sources of information. In this study, we show that human learners flexibly combine information from others’ behavior (observational RL) and from direct trial-and-error (experiential RL) based on each source’s relative reliability. Prediction error signals related to each strategy were represented in distinct brain regions: observational state prediction error in dorsomedial and dorsolateral prefrontal cortex, and experiential reward prediction error in ventral striatum and ventromedial prefrontal cortex (vmPFC). Furthermore, an integrated decision signal, combining decision values from both strategies, was tracked in vmPFC and superior temporal gyrus. These findings suggest that the brain represents and weighs learning strategies across domains to guide flexible decision-making in uncertain social environments.
Link to lab website: https://sldlab.umd.edu/
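The reliability-based arbitration described in the abstract above can be sketched in a few lines of Python. This is a minimal illustrative toy, not the authors' actual model: the Rescorla-Wagner learners, the reliability measure (inverse mean unsigned prediction error over recent trials), and all parameter values are assumptions chosen for clarity.

```python
class RWLearner:
    """Simple Rescorla-Wagner learner tracking action values and its own reliability."""

    def __init__(self, n_actions=2, alpha=0.3):
        self.q = [0.0] * n_actions
        self.alpha = alpha
        self.recent_errors = []  # unsigned prediction errors from recent trials

    def update(self, action, outcome):
        pe = outcome - self.q[action]            # prediction error
        self.q[action] += self.alpha * pe
        self.recent_errors = (self.recent_errors + [abs(pe)])[-10:]

    def reliability(self):
        # Smaller recent errors -> more reliable source (bounded in (0, 1]).
        if not self.recent_errors:
            return 0.5
        return 1.0 / (1.0 + sum(self.recent_errors) / len(self.recent_errors))


def integrated_values(experiential, observational):
    """Combine decision values from both strategies, weighted by relative reliability."""
    r_exp, r_obs = experiential.reliability(), observational.reliability()
    w = r_exp / (r_exp + r_obs)  # arbitration weight on the experiential strategy
    return [w * qe + (1 - w) * qo
            for qe, qo in zip(experiential.q, observational.q)]
```

The sketch captures only the arbitration principle: whichever source has been making smaller prediction errors recently contributes more to the integrated decision value.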
Speaker
Associate Professor \\
Yale University
Title: What Primates Know About Other Minds and When They Use It: A Computational Approach to Understanding Theory of Mind in Other Species
Abstract: Modeling social life requires not just capturing how humans reason about minds but also understanding how this capacity differs from those of our closest evolutionary relatives. I will present a computational theory-testing approach for assessing Theory of Mind across species, focusing on non-human primates (NHPs). This involves modeling both the types of social representations NHPs might have, and how they are used in specific tasks. By formalizing competing theories, we find that only models positing some representation of other minds explain NHP behavior, though these representations are used far less frequently than in humans. This framework advances cross-species modeling of social inferences, and clarifies both shared and distinctive aspects of human social reasoning.
Link to lab website: https://compdevlab.yale.edu
Speaker
PhD student \\
MIT
Title: memo: a probabilistic programming language for reasoning about reasoning
Abstract: In this talk, I will present a new probabilistic programming language, memo, for building Bayesian models of recursive social reasoning. memo (https://github.com/kach/memo) is designed from the ground up around the conceptual vocabulary of theory of mind (agents, choices, beliefs), and it compiles directly to fast GPU code. As a result, finicky models (e.g. POMDPs, inverse planning, cognitive hierarchy, …) can be implemented in just a few lines of code, and they run several orders of magnitude faster than expert-written implementations. memo has been used in several published papers across a variety of disciplines, and to teach college courses and workshops around the world.
Link to website: https://cs.stanford.edu/~kach/
Speaker
PhD Student \\
Hebrew University of Jerusalem
Title: Theory of Mind – A Double-Edged Sword
Abstract: Theory of Mind (ToM)—the uniquely human capacity for recursively simulating others’ mental states—plays a pivotal role in navigating complex social interactions. While previous work shows that most cooperative tasks require only shallow recursion, this appears to contradict our capacity for deeper recursive reasoning. In this work, I argue that it is competitive, rather than cooperative, scenarios that demand deep recursion. I begin by examining how agents exploit ToM to deceive others, sparking a cognitive arms race. However, I contend that the overuse of ToM—referred to as overmentalization—can be counterproductive, and even harmful, when misapplied. This is joint work with Joe Barnby, Lion Schulz, Jeffrey Rosenschein, and Peter Dayan.
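The recursive reasoning at stake here can be illustrated with a level-k sketch in a competitive game: each extra level of mentalizing flips the best move, so competition keeps rewarding one more step of recursion. This is a toy sketch, not a model from the talk; it assumes a matching-pennies game and the common cognitive-hierarchy convention that a level-0 player randomizes uniformly.

```python
ACTIONS = ("heads", "tails")

def payoff(role, my_a, opp_a):
    # "matcher" wins when the coins match; "mismatcher" wins when they differ.
    win = (my_a == opp_a) if role == "matcher" else (my_a != opp_a)
    return 1.0 if win else -1.0

def opponent_of(role):
    return "mismatcher" if role == "matcher" else "matcher"

def level_k_dist(k, role):
    """Distribution over actions for a level-k player of the given role."""
    if k == 0:
        return {a: 0.5 for a in ACTIONS}           # level-0: uniform random
    opp = level_k_dist(k - 1, opponent_of(role))   # simulate one level deeper
    ev = {a: sum(p * payoff(role, a, b) for b, p in opp.items())
          for a in ACTIONS}
    best = max(ACTIONS, key=lambda a: ev[a])       # ties resolve to "heads"
    return {a: 1.0 if a == best else 0.0 for a in ACTIONS}
```

Tracing the recursion: a level-1 matcher is indifferent against a uniform level-0 opponent (and plays "heads" by tie-break), a level-2 mismatcher then best-responds with "tails", a level-3 matcher switches to "tails", and so on; in a cooperative task the recursion would instead stabilize after one or two levels.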
Speaker
Assistant Professor & Staff Research Scientist \\
University of Washington | Google DeepMind
Title: Social Reinforcement Learning for Large Language Models
Abstract: Multi-agent reinforcement learning fine-tuning of large language models (LLMs) can enable continuous self-improvement and provably robust models. We introduce a self-play safety game, in which an attacker and a defender LLM co-evolve through a zero-sum adversarial game. The attacker attempts to find prompts that elicit an unsafe response from the defender, as judged by a reward model. Both agents use a hidden chain-of-thought to reason about how to develop and defend against attacks. Using well-known game-theoretic results, we show that if this game converges to the Nash equilibrium, the defender will output a safe response for any string input. Empirically, we show that our approach produces a model that is safer than models trained with RLHF, while retaining core chatting and reasoning capabilities.
Personal Website: https://natashajaques.ai/
Lab Website: https://socialrl.cs.washington.edu/
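As an illustration of the equilibrium claim in the abstract above, the attacker-defender setup can be caricatured as a 2x2 zero-sum matrix game solved by fictitious play. The payoff matrix, the two strategies per side, and the use of fictitious play are all invented for illustration; the actual work trains LLMs with multi-agent RL, not matrix-game players.

```python
# Toy zero-sum attacker-defender game.
# Rows: attacker prompt strategies; columns: defender response strategies.
# Entry = attacker payoff (1 if the elicited response is unsafe, 0 if safe);
# the defender's payoff is the negative. Values are purely illustrative.
PAYOFF = [
    [0.0, 1.0],  # attack A fails vs defense 0, succeeds vs defense 1
    [1.0, 0.0],  # attack B succeeds vs defense 0, fails vs defense 1
]

def fictitious_play(payoff, n_iters=5000):
    """Each side repeatedly best-responds to the other's empirical mixture."""
    n, m = len(payoff), len(payoff[0])
    atk_counts, def_counts = [1] * n, [1] * m
    for _ in range(n_iters):
        # Attacker best-responds to the defender's empirical play so far.
        atk_br = max(range(n), key=lambda i: sum(payoff[i][j] * def_counts[j]
                                                 for j in range(m)))
        # Defender best-responds by minimizing the attacker's payoff.
        def_br = min(range(m), key=lambda j: sum(payoff[i][j] * atk_counts[i]
                                                 for i in range(n)))
        atk_counts[atk_br] += 1
        def_counts[def_br] += 1
    total_a, total_d = sum(atk_counts), sum(def_counts)
    return ([c / total_a for c in atk_counts],
            [c / total_d for c in def_counts])
```

For this symmetric toy matrix the unique Nash equilibrium mixes both strategies 50/50, and the empirical frequencies of fictitious play approach it; the abstract's claim is the LLM-scale analogue, where convergence to equilibrium implies the defender is safe against any attack.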
Speaker
Professor \\
Hebrew University of Jerusalem
Title: You be the actor, I'll be the critic: the emergence of social norms
Abstract: How do social norms emerge in groups of interacting individuals? We present a unified framework—the interindividual actor-critic—that extends actor-critic RL to social settings. Building on the idea that others, like ourselves, critique our actions, we simulate how social feedback shapes behavior over time. This simple extension leads to group dynamics that mirror real-world norms, capturing both why certain kinds of norms tend to emerge (i.e., prosocial and ingroup-favoring norms) and how they persist or change (e.g., stickiness, self-reinforcement). Our model bridges individual learning and collective behavior, offering a parsimonious account of norm emergence grounded in the cognitive science of learning and emotions.
Link to lab website: https://sites.google.com/site/eldareran
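The core idea of the abstract above, that others act as the critic in an actor-critic loop, admits a minimal sketch. This is an illustrative toy, not the authors' interindividual actor-critic: the softmax actor, the scalar critic, the plus/minus-one social feedback, and the learning rates are all assumptions made for the example.

```python
import math
import random

class SocialActorCritic:
    """Actor-critic agent whose critic signal comes from others' evaluations
    of its actions (social feedback) rather than from its own reward."""

    def __init__(self, n_actions=2, alpha_actor=0.1, alpha_critic=0.1):
        self.prefs = [0.0] * n_actions  # actor: action preferences
        self.value = 0.0                # critic: expected social approval
        self.aa, self.ac = alpha_actor, alpha_critic

    def act(self):
        # Sample an action from a softmax over the actor's preferences.
        exps = [math.exp(p) for p in self.prefs]
        z = sum(exps)
        r, cum = random.random(), 0.0
        for a, e in enumerate(exps):
            cum += e / z
            if r <= cum:
                return a
        return len(self.prefs) - 1

    def learn(self, action, social_feedback):
        # Others play the critic: their approval (+1) or disapproval (-1)
        # drives the prediction error that trains both actor and critic.
        delta = social_feedback - self.value
        self.value += self.ac * delta
        self.prefs[action] += self.aa * delta
```

Simulating a group whose members approve one action and disapprove the other makes the approved action's preference grow: a norm stabilizes because social feedback, not private reward, trains the policy.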