Past Seminars


Jon Kleinberg (Cornell University), Nov. 19, 2021

Title: Aligning Superhuman AI with Human Behavior: Chess as a Model System

Abstract: In domains where AI systems have achieved superhuman performance, there is an opportunity to study the similarities and contrasts between human and AI behaviors at the level of granular decisions, not just aggregate performance. Such an analysis can yield several potential sources of insight. First, by studying expert-level human decisions through the lens of systems that far surpass this performance, we can try to characterize the settings in which human errors are most likely to occur. Second, we can try to adapt high-performing AI systems to match human behavior as closely as possible at a decision-by-decision level.

We pursue these goals in a domain with a long history in AI: chess. For our purposes, chess provides a domain with many different levels of human expertise, together with data from hundreds of millions of online games that each record a specific decision together with its context. However, when we apply existing chess engines to this data, including an open-source implementation of the AlphaZero system, we find that they do not predict human moves well.

We develop new methods for predicting human decisions at a move-by-move level much more accurately than existing engines, and in a way that is tunable to fine-grained differences in human skill. From this, we discover aspects of chess positions that serve as predictors of human error, as well as algorithms that are able to operate in this domain in a more "human-like" way. One of our algorithms, the Maia engine, can be tried at lichess.org (https://lichess.org/@/maia1), where it has played over a half-million games to date with users of the platform.
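
Maia itself is a deep neural network trained on millions of lichess games; as a much smaller illustration of the underlying prediction task, the sketch below fits a softmax model that scores candidate moves from invented features plus a skill covariate. Everything here (features, dimensions, data) is a synthetic stand-in, not the talk's method:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for the task: each "position" offers K candidate moves,
# each described by D invented features, plus the player's rating; the
# model scores moves linearly and predicts via a softmax, trained by
# cross-entropy on the move the (simulated) human actually chose.
K, D, N = 8, 4, 4000
X = rng.normal(size=(N, K, D))             # candidate-move features
rating = rng.uniform(-1, 1, size=(N, 1))   # standardized skill covariate

w_true = np.array([1.0, 0.3, -0.5, 0.2])   # stronger players weight feature 0 more
logits = X @ w_true + 2.0 * rating * X[:, :, 0]
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)
choice = np.array([rng.choice(K, p=p) for p in probs])

w, a = np.zeros(D), 0.0                    # a couples skill to feature 0
for _ in range(300):                       # full-batch gradient descent
    s = X @ w + a * rating * X[:, :, 0]
    p = np.exp(s - s.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(N), choice] -= 1.0         # softmax cross-entropy gradient
    w -= 0.1 * np.einsum('nk,nkd->d', p, X) / N
    a -= 0.1 * (p * rating * X[:, :, 0]).sum() / N

match = (np.argmax(X @ w + a * rating * X[:, :, 0], axis=1) == choice).mean()
print(f"move-matching accuracy: {match:.2f}")
```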

This is joint work with Ashton Anderson, Reid McIlroy-Young, Sendhil Mullainathan, Siddhartha Sen, and Russell Wang.

Aislinn Bohren (UPenn), Nov. 12, 2021

Title: Representing Heuristics as Misspecified Models

Abstract: A growing literature in economics considers how to model heuristics that agents use to process information and the resulting biases that emerge in belief updating. In this paper, we link two common approaches used to model inaccurate belief updating: (i) defining an "updating rule" that specifies a mapping from the true Bayesian posterior to the agent's subjective posterior, and (ii) modeling an agent as a Bayesian with a misspecified model of the signal process. We first establish conditions under which an updating rule can be represented as a misspecified model and vice versa. In fact, there are often many misspecified models that represent a given heuristic. We then outline two natural restrictions to place on the misspecified model that provide conceptual guidance for which model to select. The first condition---introspection-proofness---imposes structure on how the misspecified model relates to the correctly specified model. The second condition---naive consistent forecasting---is a condition on the agent's belief about how he will form beliefs in the future. Each condition uniquely selects a misspecified model when such a model exists.
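
To make the translation concrete, here is one textbook instance in a binary-state, binary-signal setting (notation mine, not necessarily the paper's): a Grether-style over/underreaction rule coincides with exact Bayesian updating under a distorted signal accuracy.

```latex
% States \omega \in \{L, H\}, signals s \in \{l, h\}, true accuracy
% q = F(h \mid H) = F(l \mid L) > 1/2.
%
% (i) Updating rule with over/underreaction parameter c > 0
% (c = 1 is Bayesian, c < 1 underreaction, c > 1 overreaction):
\frac{\tilde{\pi}(H \mid s)}{\tilde{\pi}(L \mid s)}
  \;=\; \frac{\pi_0(H)}{\pi_0(L)}
        \left( \frac{F(s \mid H)}{F(s \mid L)} \right)^{\!c}.
%
% (ii) The same beliefs arise from exact Bayesian updating under a
% misspecified signal accuracy \hat{q} chosen so the likelihood
% ratios match:
\frac{\hat{q}}{1 - \hat{q}} \;=\; \left( \frac{q}{1 - q} \right)^{\!c}
\quad\Longrightarrow\quad
\hat{q} \;=\; \frac{q^{c}}{q^{c} + (1 - q)^{c}}.
% With richer signal spaces the representing model is generally not
% unique, which is where the paper's selection criteria come in.
```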

This work is joint with Daniel N. Hauser.

Saurabh Amin (MIT), Oct. 29, 2021

Title: Strategic Traveler Decisions under Platform-enabled Information and Learning in Transportation Systems

Abstract: This talk is motivated by the increasing role that today’s data-rich platforms play in shaping the usage of urban transportation systems. We focus on game-theoretic analysis of the impact of information and learning enabled by these platforms on the strategic behavior of travelers.

Firstly, we analyze the impact of multiple heterogeneous information platforms on the routing decisions of travelers who face uncertainty about the underlying network state. We present an approach to evaluate the effects of information on travelers' equilibrium route choices and congestion costs. In particular, we provide sharp conditions on the population sizes for which the value of information provided by a service is positive, zero, and negative. Our approach captures the key tradeoff between the gain from information about the network state and the congestion externality created by travelers' aggregate choices.
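
For intuition, here is a minimal two-route, two-state instance of this tradeoff (all cost functions and numbers are illustrative, not the paper's model): a fraction of travelers observes the state, the rest act on the prior, and equilibrium is found by smoothed best-response adjustment. In this toy instance the informed advantage is positive and shrinks toward zero as the informed share grows; the paper characterizes exactly when it is positive, zero, or negative.

```python
import numpy as np

# Toy two-route Bayesian congestion game (all numbers illustrative).
# Route 1 cost: theta_s + flow1 ; route 2 cost: 0.5 + flow2 ; unit demand.
theta = np.array([0.0, 1.0])      # state-dependent free-flow cost on route 1
PRIOR = np.array([0.5, 0.5])      # both states equally likely

def equilibrium(lam, eta=0.005, iters=40000):
    """Fraction lam observes the state; the rest act on the prior."""
    u = 0.5                        # uninformed share routed to route 1
    i = np.array([0.5, 0.5])       # informed share on route 1, per state
    for _ in range(iters):
        f1 = (1 - lam) * u + lam * i          # route-1 flow in each state
        c1, c2 = theta + f1, 0.5 + (1 - f1)   # per-state route costs
        i = np.clip(i + eta * (c2 - c1), 0, 1)          # informed adjust per state
        u = np.clip(u + eta * PRIOR @ (c2 - c1), 0, 1)  # uninformed use the prior
    f1 = (1 - lam) * u + lam * i
    c1, c2 = theta + f1, 0.5 + (1 - f1)
    return PRIOR @ (i * c1 + (1 - i) * c2), PRIOR @ (u * c1 + (1 - u) * c2)

for lam in [0.05, 0.25, 0.5, 0.75, 0.95]:
    ci, cu = equilibrium(lam)
    print(f"informed share {lam:.2f}: informed cost {ci:.3f}, "
          f"uninformed cost {cu:.3f}, value of information {cu - ci:+.3f}")
```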

Secondly, we study a dynamic setting in which a public information platform updates its estimate of the network state based on available data on edge loads and costs. Here, travelers learn and adapt their routing decisions based on the up-to-date information received from the platform. The resulting stochastic learning dynamics are driven by endogenous and non-i.i.d. data. We present new results on the convergence and stability of such learning dynamics and develop conditions for convergence to the complete-information equilibrium.
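
A minimal sketch of this feedback loop, assuming a single unknown cost parameter and a two-route network (all names and numbers invented): the platform's running estimate shapes the flows, the flows shape the data, and nevertheless the estimate converges to the truth and the flow to its complete-information value.

```python
import numpy as np

rng = np.random.default_rng(1)
theta_true, sigma = 0.8, 0.3      # unknown route-1 cost parameter, noise level

theta_hat, n = 0.0, 0
for day in range(200):
    # Travelers best-respond to the platform's current estimate:
    # route 1 cost = theta_hat + f, route 2 cost = 0.5 + (1 - f).
    f = np.clip((1.5 - theta_hat) / 2, 0.0, 1.0)
    # The platform observes a noisy edge cost at the induced load; the
    # data are endogenous, since they depend on the flow the estimate created.
    c1_obs = theta_true + f + sigma * rng.normal()
    n += 1
    theta_hat += ((c1_obs - f) - theta_hat) / n   # running-mean update
print(f"estimate {theta_hat:.3f} vs truth {theta_true}; "
      f"flow {f:.3f} vs complete-information flow {(1.5 - theta_true) / 2:.3f}")
```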

Our results provide some guidance on the design of optimal information structures and adaptive tolling mechanisms.

This work is joint with Manxi Wu (Berkeley EECS and Cornell ORIE) and Asu Ozdaglar (MIT EECS and LIDS).

Katrina Ligett (Hebrew University), Oct. 22, 2021

Title: Gaming Helps! Learning from Strategic Interactions in Natural Dynamics

Abstract: Those who are being classified are sometimes aware of the classifiers that are being used on them, and may have incentive to change their behavior to try to improve the label they receive. The attitude towards such strategic behavior, both in practice and in theoretical treatment, has generally been quite negative, and this is one of the reasons that the internal workings of high-stakes classifiers are often shrouded in secrecy.

However, intuitively, agents who strategically change their behavior in response to incentives set by classifiers may actually be doing everyone a favor: they are helping teach the classifier whether the variables that the classifier depends on are truly meaningful for the task at hand---that is, features that, when changed, affect the true label (as opposed to non-meaningful features that have no effect). Over time, this could push the system to develop better classifiers, and could push individuals to invest more in meaningful variables. We study this issue in an online regression setting.
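
A toy simulation of this intuition (not the paper's algorithm; the setup and parameters are invented for illustration): two nearly collinear features, only one of which is causal, become separable precisely because successive cohorts game the published rule, injecting variation that only the meaningful feature transmits to the label.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n, alpha = 50, 200, 1.0         # rounds, agents per round, gaming strength
w_star = np.array([1.0, 0.0])      # only feature 0 is truly meaningful

data_X, data_y = [], []
for t in range(T):
    # The learner publishes a rule w_t (here: random exploration); agents
    # game it by shifting their features along w_t.
    w_t = rng.normal(size=2)
    x = rng.normal(size=(n, 2))
    x[:, 1] = x[:, 0] + 0.01 * rng.normal(size=n)   # feature 1 nearly duplicates 0
    x_gamed = x + alpha * w_t
    # The true label responds only through the meaningful feature:
    y = x_gamed @ w_star + 0.1 * rng.normal(size=n)
    data_X.append(x_gamed)
    data_y.append(y)

X, Y = np.vstack(data_X), np.concatenate(data_y)
w_hat = np.linalg.lstsq(X, Y, rcond=None)[0]
print("recovered weights:", np.round(w_hat, 3))   # ~[1, 0]: gaming separated them
```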

Joint work with Yahav Bechavod, Zhiwei Steven Wu, and Juba Ziani. Work appeared at AISTATS'21.

Asuman Ozdaglar (MIT), Oct. 15, 2021

Title: Dynamic Multi-Agent Learning: Where Artificial Intelligence Meets Game Theory

Abstract: Many Artificial Intelligence (AI) applications are increasingly relying on multi-agent learning methods in order to develop decision rules that perform well in dynamic and uncertain settings with multiple decision-makers (such as in autonomous vehicles, recommendation systems, and financial trading). Despite the importance of multi-agent learning problems both in game theory and now increasingly in AI, there are few results that ensure learning in decentralized and dynamic environments.

In this talk, I present a new approach that provides sharp learning results in certain dynamic multi-agent settings. The approach we develop is based on fictitious play dynamics, but extended in three important aspects. First, it is applied to dynamic stochastic games. Second, it simultaneously enables agents to learn about their own continuation payoffs and other players' strategies. Third, and most importantly, it incorporates two-timescale learning, whereby agents update their beliefs about others' actions more frequently than their beliefs about their continuation payoffs. We prove that this approach enables decentralized learning in zero-sum stochastic games and entails no coordination in the learning behavior of different agents. This presents for the first time a “completely independent” learning approach rather than the coordinated or semi-independent learning methods considered in some recent work. We extend these results to model-free learning settings, where players do not know their own stage payoff function and transition probabilities, and minimal information settings with payoff-based dynamics, where they do not even observe the other player’s action, but learn from only their realized payoffs.
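
A heavily simplified, model-based toy in the spirit of the two-timescale idea (players here know the payoffs and transitions, the game and stepsizes are invented, and this is not the paper's exact algorithm): per-state beliefs about the opponent move on a fast timescale, continuation-value estimates on a slow one.

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, gamma = 2, 2, 0.9
R = rng.normal(size=(S, A, A))                 # player 1's payoff; player 2 gets -R
P = rng.dirichlet(np.ones(S), size=(S, A, A))  # transitions P[s, a, b] over next states

b1 = np.full((S, A), 1 / A)   # player 1's belief about player 2's play, per state
b2 = np.full((S, A), 1 / A)   # player 2's belief about player 1's play, per state
v1, v2 = np.zeros(S), np.zeros(S)              # continuation-value estimates
counts, s = np.zeros(S), 0

for _ in range(200000):
    counts[s] += 1
    n = counts[s]
    q1 = R[s] + gamma * P[s] @ v1              # player 1's local game matrix
    q2 = -R[s] + gamma * P[s] @ v2             # player 2's local game matrix
    a = int(np.argmax(q1 @ b1[s]))             # best response to belief about 2
    b = int(np.argmax(q2.T @ b2[s]))           # best response to belief about 1
    lr_belief, lr_value = n ** -0.6, n ** -1.0  # fast vs slow timescales
    b1[s] += lr_belief * (np.eye(A)[b] - b1[s])
    b2[s] += lr_belief * (np.eye(A)[a] - b2[s])
    v1[s] += lr_value * ((q1 @ b1[s]).max() - v1[s])
    v2[s] += lr_value * ((q2.T @ b2[s]).max() - v2[s])
    s = rng.choice(S, p=P[s, a, b])

# In a zero-sum game the two players' value estimates should roughly cancel.
print("v1 + v2 per state:", np.round(v1 + v2, 3))
```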

Claire Tomlin (UC Berkeley), Oct. 1, 2021

Title: Modeling Other Agents

Abstract: One of the biggest challenges in the design of autonomous systems is to effectively predict what other agents in the vicinity will do. If the goal is to preserve safety, the assumption that other agents take their most unsafe action leads to a zero-sum game formulation. In collision avoidance applications, for example, if I make this assumption about other agents, and then manage to stay outside of their reachable sets, then I am guaranteed to be safe. In this talk, we explore how this worst case assumption may be relaxed, and present game-theoretic motion planning results which use feedback Nash equilibrium problems to model interaction between agents. We demonstrate our results on both simulations and experiments of multiple vehicle scenarios.
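
As a toy stand-in for one step of such a planner (a single-stage quadratic game of my own invention, not the talk's method): with quadratic costs, the Nash point of two coupled objectives is just a small linear system.

```python
import numpy as np

# Each vehicle i picks acceleration u_i to track its preferred g_i, and
# both are penalized when their combined maneuver misses a safe target d:
#   J_i(u_1, u_2) = (u_i - g_i)^2 + kappa * (u_1 + u_2 - d)^2.
kappa, d = 2.0, 1.0
g = np.array([1.0, 0.8])

# Setting dJ_i/du_i = 0 for both players gives a linear system whose
# solution is the (unique) Nash point of this quadratic game.
A = np.array([[1 + kappa, kappa],
              [kappa, 1 + kappa]])
u_nash = np.linalg.solve(A, g + kappa * d)
print("preferred accelerations:", g)
print("Nash accelerations:     ", np.round(u_nash, 3))
```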

Drew Fudenberg (MIT), Sep. 10, 2021

Title: Which Misperceptions Persist

Abstract: We use an evolutionary model to determine which misperceptions can persist. Every period, agents in a new generation use their subjective models and the data generated by the previous generation to update their beliefs about some parameters, and models that induce better actions increase their prevalence. Mutations that lead a small fraction of agents to use a more permissive but still misspecified subjective model fail to spread if inference from equilibrium data leads the mutated agents to take worse actions.

We characterize which steady states resist mutations to a nearby model, and which resist mutations that drop a qualitative restriction such as independence.

Joint work with Giacomo Lanzani.

Na Li (Harvard University), June 11, 2021

Title: Gradient Play in Multi-Agent Stochastic Games: Stationary Points and Convergence

Abstract: We study the performance of the gradient play algorithm for multi-agent tabular Markov decision processes (MDPs), which are also known as stochastic games (SGs), where each agent tries to maximize its own total discounted reward by making decisions independently based on the current state information, which is shared among agents. Policies are directly parameterized by the probability of choosing a certain action at a given state. We show that Nash equilibria (NEs) and first-order stationary policies are equivalent in this setting, and give a non-asymptotic global convergence rate analysis to an epsilon-NE for a subclass of multi-agent MDPs called Markov potential games (MPGs), which includes the cooperative setting with identical rewards among agents as an important special case. Our result shows that the number of iterations to reach an epsilon-NE scales linearly, instead of exponentially, with the number of agents. Local geometry and local stability are also considered. We also show that strict NEs are local maxima of the total potential function and that fully mixed NEs are saddle points. Lastly, we give a local convergence rate around strict NEs for more general settings.
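
The single-state special case already illustrates the landscape results. In the identical-interest 2x2 game below (an invented example; the common payoff is itself the potential), projected gradient play climbs to a strict NE, while the fully mixed NE is a saddle that the dynamics escape.

```python
import numpy as np

# Single-state special case: identical-interest 2x2 game (an MPG whose
# potential equals the common payoff), direct policy parameterization.
U = np.array([[1.0, 0.0],
              [0.0, 0.6]])          # both diagonal outcomes are strict NEs
p = np.array([0.45, 0.55])          # prob(action 0) for agents 1 and 2
eta = 0.2

for _ in range(200):
    # Exact policy gradient: dE[U]/dp_i = E[U | a_i=0] - E[U | a_i=1].
    g1 = (U[0] - U[1]) @ np.array([p[1], 1 - p[1]])
    g2 = np.array([p[0], 1 - p[0]]) @ (U[:, 0] - U[:, 1])
    p = np.clip(p + eta * np.array([g1, g2]), 0, 1)   # projected gradient play

# The fully mixed NE here is p = (0.375, 0.375), a saddle of the
# potential; generic starting points escape it and reach a strict NE.
print("prob(action 0) per agent:", np.round(p, 3))
```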

Joint work with Runyu (Cathy) Zhang and Zhaolin Ren.

Sanjeev Goyal (U of Cambridge), May 7, 2021

Title: Connectors and Influencers

Abstract: Classical models of network formation bring out the salience of a ‘law of the few’ property, which is manifest in hub-spoke/star architectures. Existing experiments on network formation find that subjects do not create such networks. Our paper conducts a network formation experiment on the model of Galeotti and Goyal (2010). The theory predicts that every equilibrium of this game is a ‘star’ network in which the spokes pay for links with a single hub. There are two equilibrium effort configurations: the hub makes all the effort (the pure influencer outcome) or the hub makes zero effort (the pure connector outcome). This paper tests these predictions with the help of a new experimental platform with asynchronous activity in continuous time. We vary group size and the provision of information about others’ payoffs. Subjects always create networks with specialization in linking, which is consistent with the equilibrium prediction. Our second result concerns the interaction of group size and information provision. In a baseline information treatment where subjects only see their own payoffs, they select the pure influencer outcome. By contrast, when we provide information on everyone’s payoffs, in large groups, subjects select the pure connector outcome. These behavioural patterns can be accounted for by a decision rule on activity level that combines myopic best response and competition for hub status.

Jason R. Marden (UC Santa Barbara), Apr. 23rd, 2021

Title: Mechanism Design for Multiagent Coordination

Abstract: The goal in any multiagent system is to derive desirable collective behavior through the design of admissible control algorithms. Here, the admissibility of a given control algorithm is often predicated on the information that is available. This talk will cover three recent results that characterize how the level of available information impacts the achievable performance guarantees in multiagent coordination problems. Our first result focuses on the design of incentive mechanisms to influence and improve societal behavior in congestion games. Here, we will demonstrate how information pertaining to both the infrastructure and the population impacts the ability of a societal planner to successfully influence collective behavior. Our second result shifts attention from influencing societal systems to designing networked engineering systems. Here, we will focus on characterizing how a limited degree of inter-agent communication can be exploited to significantly improve the efficacy of the resulting stable behavior for a class of distributed submodular optimization problems. Lastly, if time permits, we will focus on a class of strategic resource allocation games (Colonel Blotto games) and demonstrate how revealing private information can be strategically advantageous in competitive scenarios.

Michael I. Jordan (UC Berkeley), Apr. 9th, 2021

Title: On Decision-Making in Machine Learning and Game Theory

Abstract: Statistical decisions are often given meaning in the context of other decisions, particularly when there are scarce resources to be shared. Managing such sharing is one of the classical goals of microeconomics, and it is given new relevance in the modern setting of large, human-focused datasets, and in data-analytic contexts such as classifiers and recommendation systems. I'll discuss several recent projects that aim to explore this interface, including the study of exploration-exploitation tradeoffs for bandits that compete over a scarce resource, notions of local optimality in nonconvex-nonconcave minimax optimization and how such notions relate to stochastic gradient methods, and the use of Langevin-based algorithms for Thompson sampling.
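
As a concrete instance of the last item (an illustrative toy, not the talk's construction): Thompson sampling for a Gaussian bandit in which each posterior draw is approximated by a few steps of the unadjusted Langevin algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true = np.array([0.2, 0.5, 0.4])   # unknown Gaussian arm means (illustrative)
K, sigma, T = 3, 1.0, 500
obs = [[] for _ in range(K)]

def langevin_sample(rewards, steps=50):
    """Approximate posterior draw for one arm's mean via the unadjusted
    Langevin algorithm, with a N(0,1) prior and unit-variance likelihood."""
    r = np.asarray(rewards, dtype=float)
    h = 0.5 / (1.0 + len(r))          # stepsize shrinks as data accumulate
    theta = 0.0
    for _ in range(steps):
        grad = -theta + (r - theta).sum() / sigma**2   # d log posterior / d theta
        theta += h * grad + np.sqrt(2 * h) * rng.normal()
    return theta

pulls = np.zeros(K, dtype=int)
for t in range(T):
    samples = [langevin_sample(obs[k]) for k in range(K)]
    k = int(np.argmax(samples))       # Thompson step: act on the posterior draws
    obs[k].append(mu_true[k] + sigma * rng.normal())
    pulls[k] += 1
print("pulls per arm:", pulls)        # most pulls should go to arm index 1
```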

Naomi E. Leonard (Princeton), Apr. 2nd, 2021

Title: Nonlinear Opinion Dynamics and Games


Abstract: I will discuss a recently proposed model of opinion dynamics and its use in the study of multi-agent strategic interaction. Our model describes continuous-time opinion dynamics for an arbitrary number of agents that communicate over a network and form real-valued opinions about an arbitrary number of options. Many models in the literature update agent opinions using a weighted average of their neighbors’ opinions. Our model generalizes these models by applying a sigmoidal saturation function to opinion exchanges. This makes the update fundamentally nonlinear: opinions form through a bifurcation yielding multi-stability of network opinion configurations. Leveraging idealized symmetries in the system, qualitative behavioral regimes can be distinguished explicitly in terms of just a few parameters. I will show how to interpret the opinion dynamics as a model of multi-agent reinforcement learning in games and discuss implications for games including the traveler’s dilemma and the prisoner’s dilemma.
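
A minimal Euler integration of a simplified, single-option form of these dynamics (parameters and graph invented; see the linked paper for the actual model) shows the bifurcation: below a critical attention level u, opinions decay to neutral; above it, the network commits to a nonzero configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_damp, alpha, gamma, dt = 8, 1.0, 0.4, 0.3, 0.01
A = (rng.random((n, n)) < 0.5).astype(float)   # illustrative random graph
np.fill_diagonal(A, 0)

def simulate(u, steps=20000):
    z = 0.01 * rng.normal(size=n)              # near-neutral initial opinions
    for _ in range(steps):
        # Euler step of dz/dt = -d z + u * tanh(alpha z + gamma A z):
        z = z + dt * (-d_damp * z + u * np.tanh(alpha * z + gamma * A @ z))
    return z

for u in [0.5, 1.5, 3.0]:
    z = simulate(u)
    print(f"attention u={u:.1f}: mean opinion {z.mean():+.3f}, "
          f"max |opinion| {np.abs(z).max():.3f}")
```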

Modeling and analysis of opinion dynamics is joint work with Anastasia Bizyaeva and Alessio Franci and based on the paper https://arxiv.org/abs/2009.04332v2. Using the model in games is joint work with Shinkyu Park, Anastasia Bizyaeva, and Yunxiu Zhou.

Nicole Immorlica (MS Research), Mar. 19th, 2021

Title: Persuading with Anecdotes


Abstract: This talk presents a model of learning and communication between two agents using hard anecdotal evidence. We use this model to shed new light on human communication and to justify when and why polarization and bias may arise. We consider two agents, a sender and a receiver, who have a prior over an unknown state of the world and must take actions whose payoffs are determined by their personal preferences as well as the state of the world. The sender has observed a set of anecdotes, that is, a collection of observations of the state of the world, and can send one of these anecdotes to influence the receiver’s choice of action. We show that if the sender’s communication scheme is observable by the receiver (that is, the choice of which anecdote to send given the set of anecdotes), then no matter the difference in preferences, the agents use an unbiased and maximally informative communication scheme. Without observability, however, even a small difference in preferences can lead to significant bias in the choice of anecdotes, which the receiver must then account for. This significantly reduces the informativeness of the signal, leading to substantial utility loss for both sides. One consequence of this is informational homophily: a receiver can rationally prefer to obtain information from a poorly informed sender with aligned preferences, rather than from a knowledgeable expert whose preferences may differ from her own.
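
A toy of the selection mechanics only (numbers invented; in the paper the informativeness loss is an equilibrium phenomenon with a receiver who rationally accounts for the sender's strategy): cherry-picking the anecdote nearest the sender's preferred point yields a far less accurate signal at face value than the committed, unbiased scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, bias = 20000, 7, 0.8     # trials, anecdotes per sender, preference gap

state = rng.normal(size=N)                          # unknown state of the world
anecdotes = state[:, None] + rng.normal(size=(N, n))

# Observable commitment: send the anecdote closest to the sample mean
# (an unbiased, maximally informative choice in this toy).
idx_obs = np.abs(anecdotes - anecdotes.mean(axis=1, keepdims=True)).argmin(axis=1)
sent_obs = anecdotes[np.arange(N), idx_obs]

# No commitment: cherry-pick the anecdote closest to the sender's
# preferred point, state + bias.
idx_hid = np.abs(anecdotes - (state[:, None] + bias)).argmin(axis=1)
sent_hid = anecdotes[np.arange(N), idx_hid]

print(f"MSE, observable scheme:      {((sent_obs - state) ** 2).mean():.3f}")
print(f"MSE, cherry-picked anecdote: {((sent_hid - state) ** 2).mean():.3f}")
```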


This is based on joint work with Nika Haghtalab, Brendan Lucier, Markus Mobius, and Divyarthi Mohan.

Azarakhsh Malekian (UToronto), Mar. 5th, 2021

Title: Too Much Data: Prices and Inefficiencies in Data Markets

No Recording

Abstract: When a user shares her data with an online platform, she typically reveals relevant information about other users. We model a data market in the presence of this type of externality in a setup where one or multiple platforms estimate a user’s type with data they acquire from all users and (some) users value their privacy. We demonstrate that data externalities depress the price of data: once a user’s information is leaked by others, she has less reason to protect her data and privacy. These depressed prices lead to excessive data sharing. We characterize conditions under which shutting down data markets improves (utilitarian) welfare. Competition between platforms does not redress the problem of excessively low prices for data and too much data sharing, and may further reduce welfare. We propose a scheme based on mediated data sharing that improves efficiency.
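
The externality is easy to see in a small Gaussian sketch (the distributional assumptions are mine, not the paper's model): as more of the other users share, the residual uncertainty about a user who shares nothing keeps falling, so her data, and hence her privacy, commands a lower price.

```python
import numpy as np

n, rho, noise = 11, 0.6, 0.5     # users, type correlation, reporting noise
Sigma = rho * np.ones((n, n)) + (1 - rho) * np.eye(n)   # jointly Gaussian types

def residual_variance(k):
    """Posterior variance of user 0's type when users 1..k share noisy
    reports of their own types and user 0 shares nothing."""
    if k == 0:
        return Sigma[0, 0]
    S = Sigma[1:k + 1, 1:k + 1] + noise * np.eye(k)   # covariance of reports
    c = Sigma[0, 1:k + 1]                             # cross-covariance
    return Sigma[0, 0] - c @ np.linalg.solve(S, c)    # Gaussian conditioning

for k in [0, 2, 5, 10]:
    print(f"others sharing: {k:2d} -> leftover uncertainty about user 0: "
          f"{residual_variance(k):.3f}")
```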


This is based on joint work with Daron Acemoglu, Ali Makhdoumi, and Asu Ozdaglar.

Rakesh Vohra (UPenn), Feb. 19th, 2021

Title: Contagion and Equilibria in Diversified Financial Networks

Abstract: Cross-shareholding networks are thought to be more resilient when diversified, but diversification also increases the channels by which a shock can spread. To resolve these competing intuitions, we introduce a stochastic model of a diversified cross-shareholding network in which a firm's valuation depends on its cash endowment and the shares it owns in other firms. If a firm's valuation falls below a failure threshold, it discontinuously imposes losses on its counterparties, e.g., as default costs. We show that very different realized network instances exhibit similar equilibrium behavior, allowing us to examine a wide range of cross-shareholding networks simultaneously. The equilibrium profiles of firm valuations are distributed non-uniformly in space: while there is concentration of equilibria around the "worst" and the "best" equilibria, there are many equilibria "in between", suggesting that the extreme equilibria---focused on in other studies of similar models---are unrepresentative.
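
A minimal sketch of this fixed-point structure (network, thresholds, and costs invented for illustration): firm values solve v = c + Wv - beta * 1{v < threshold}, and since the map is monotone, iterating from above and below brackets the best and worst equilibria, which can differ sharply.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta, thresh = 50, 0.8, 1.0
W = rng.random((n, n)) * (rng.random((n, n)) < 0.2)   # sparse cross-holdings
np.fill_diagonal(W, 0)
row = W.sum(axis=1, keepdims=True)
W = 0.5 * W / np.where(row > 0, row, 1.0)             # cross-holdings sum to 0.5
c = rng.uniform(0.4, 0.8, size=n)                     # cash endowments

def fixed_point(v):
    # The valuation map is monotone in v, so iterating from above (below)
    # converges to the best (worst) equilibrium, by Tarski's theorem.
    for _ in range(10000):
        v_new = c + W @ v - beta * (v < thresh)       # failure cost if below threshold
        if np.allclose(v_new, v):
            return v_new
        v = v_new
    return v

v_best = fixed_point(np.full(n, 10.0))
v_worst = fixed_point(np.full(n, -10.0))
print(f"failures in best equilibrium:  {(v_best < thresh).sum():2d} / {n}")
print(f"failures in worst equilibrium: {(v_worst < thresh).sum():2d} / {n}")
```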


This is based on joint work with Victor Amelkin and Santosh Venkatesh.


Aaron Roth (UPenn), Feb. 5th, 2021

Title: Online Multivalid Learning: Game Theory for Better Estimation

Abstract: We present a general, efficient technique for providing contextual predictions that are “multivalid” in various senses, against an online sequence of adversarially chosen examples (x, y). This means that the resulting estimates correctly predict various statistics of the labels y not just marginally — as averaged over the sequence of examples — but also conditionally on x ∈ G for any G belonging to an arbitrary intersecting collection of groups. We provide three instantiations of this framework. The first is mean prediction, which corresponds to an online algorithm satisfying the notion of multicalibration from Hébert-Johnson et al. The second is variance and higher-moment prediction, which corresponds to an online algorithm satisfying the notion of mean-conditioned moment multicalibration from Jung et al. Finally, we define a new notion of prediction interval multivalidity and give an algorithm for finding prediction intervals that satisfy it. Because our algorithms handle adversarially chosen examples, they can equally well be used to predict statistics of the residuals of arbitrary point prediction methods, giving rise to very general techniques for quantifying the uncertainty of predictions of black-box algorithms, even in an online adversarial setting. When instantiated for prediction intervals, this solves a similar problem as conformal prediction, but in an adversarial environment and with multivalidity guarantees stronger than simple marginal coverage guarantees. Our techniques are fundamentally game-theoretic, extending an approach originally due to Fudenberg and Levine; our algorithms result from equilibrium computations. This talk is based on a paper that is joint work with Varun Gupta, Christopher Jung, Georgy Noarov, and Mallesh Pai.
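
The target guarantee can be stated as an audit. The sketch below only checks the multicalibration condition on a static, synthetic dataset (the paper's contribution is an online algorithm that enforces it against an adversary): a marginally calibrated predictor can fail badly once you condition on intersecting groups.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50000
x = rng.normal(size=(N, 2))
groups = {                           # an intersecting collection of groups
    "x0>0": x[:, 0] > 0,
    "x1>0": x[:, 1] > 0,
    "both": (x[:, 0] > 0) & (x[:, 1] > 0),
}
y = (rng.random(N) < 0.3 + 0.4 * (x[:, 0] > 0)).astype(float)
pred = np.full(N, y.mean())          # marginally calibrated, group-blind

# Multicalibration audit: conditionally on each group AND each prediction
# level, the realized mean of y should match the predicted value.
for name, g in groups.items():
    for v in np.unique(pred[g].round(2)):
        m = g & (pred.round(2) == v)
        print(f"group {name}: predicted {v:.2f}, realized {y[m].mean():.2f} "
              f"(n={m.sum()})")
```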

Constantinos Daskalakis (MIT), Jan. 22nd, 2021

Title: Equilibrium Computation and the Foundations of Deep Learning

Abstract: Deep Learning has recently yielded important advances in single-agent learning challenges, much of that progress being fueled by the empirical success of gradient descent and its variants in computing local optima of non-convex optimization problems. In multi-agent learning applications, the role of single-objective optimization is played by equilibrium computation, yet our understanding of its complexity in settings that are relevant for Deep Learning remains sparse. In this talk we focus on min-max optimization of nonconvex-nonconcave objectives, which has found applications in GANs and other adversarial learning problems. Here, not only are there no known gradient-descent-based methods converging to even local and approximate min-max equilibria, but the computational complexity of identifying them remains poorly understood. We show that finding approximate local min-max equilibria of Lipschitz and smooth objectives requires a number of queries to the function and its gradient that is exponential in the relevant parameters, in sharp contrast to the polynomial number of queries required to find approximate local minima of non-convex objectives. Our oracle lower bound is a byproduct of a complexity-theoretic result showing that finding approximate local min-max equilibria is computationally equivalent to finding Brouwer fixed points, and Nash equilibria in non-zero-sum games, and thus PPAD-complete.
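
The difficulty starts even before nonconvexity enters: simultaneous gradient descent-ascent fails to converge already on the bilinear objective f(x, y) = xy, whose unique min-max equilibrium is the origin (a standard illustration, not an example from the talk).

```python
import numpy as np

# Simultaneous gradient descent-ascent on f(x, y) = x * y: the iterates
# rotate and spiral outward instead of converging to the equilibrium (0, 0).
x, y, eta = 1.0, 1.0, 0.1
for _ in range(100):
    x, y = x - eta * y, y + eta * x   # descend in x, ascend in y
print(f"distance from equilibrium after 100 steps: {np.hypot(x, y):.2f} "
      f"(started at {np.hypot(1.0, 1.0):.2f})")
```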

Minimal complexity theory knowledge will be assumed in the talk. Joint work with Stratis Skoulakis and Manolis Zampetakis.