Aditya Aradhye
Mechanism Design with Spiteful Agents
We study a mechanism-design problem in which spiteful agents seek not only to maximize their own rewards but also, depending on their own payoff levels, to lower their opponents' rewards. To capture this spiteful behavior, we define a new incentive-compatibility notion called spite-free incentive compatibility. We completely characterize the mechanisms that are individually rational and spite-free incentive compatible: a mechanism satisfies both properties if and only if it is a threshold mechanism, a concept similar to posted-price mechanisms. Leveraging these findings, we partially extend our analysis to a setting with multiple items and copies. Overall, these results illuminate the challenges of auctioning items in the natural presence of other-regarding preferences.
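The threshold mechanisms in the characterization can be illustrated with a minimal single-item sketch; the function name and exact form below are illustrative, not taken from the paper:

```python
# A minimal sketch of a threshold (posted-price-style) mechanism for a
# single item: the designer posts a threshold t; an agent whose report
# is at least t wins the item and pays exactly t, independently of the
# report itself. Illustrative only, not the paper's formal definition.
def threshold_mechanism(t, report):
    if report >= t:
        return {"allocated": True, "payment": t}
    return {"allocated": False, "payment": 0.0}

win = threshold_mechanism(5.0, 7.0)    # report above the threshold
lose = threshold_mechanism(5.0, 3.0)   # report below the threshold
```

Since the payment never depends on the report beyond which side of the threshold it falls on, no misreport can improve an agent's outcome, which hints at why such mechanisms can survive the stronger spite-free incentive-compatibility requirement.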
Julián Chitiva
Continuous Social Networks
We develop an extension of the classical model of DeGroot (1974) to a continuum of agents who interact according to a DiKernel. We show that, under some regularity assumptions, the continuous model is the limit of the discrete one. We provide several applications of this result. First, we establish a canonical way to reduce the dimensionality of matrices by comparing matrices of different dimensions in the space of DiKernels. Then, we develop a model of Lobby Competition in which two lobbies compete to bias the opinion of a continuum of agents. We give sufficient conditions for the existence of a Nash Equilibrium. Furthermore, we establish the conditions under which a Nash Equilibrium of the game induces an epsilon-Nash Equilibrium of the discretisation of the game. Finally, we put forward some elements for the characterisation of equilibrium strategies.
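For reference, the discrete DeGroot model that the talk generalises can be sketched in a few lines; the trust matrix below is an arbitrary illustrative example:

```python
# Discrete DeGroot (1974) updating: each agent's next opinion is a
# weighted average of current opinions, with weights from a
# row-stochastic trust matrix W (rows sum to 1).
def degroot_step(W, x):
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

def degroot(W, x0, steps=200):
    x = list(x0)
    for _ in range(steps):
        x = degroot_step(W, x)
    return x

W = [[0.6, 0.3, 0.1],   # illustrative 3-agent trust weights
     [0.2, 0.5, 0.3],
     [0.1, 0.4, 0.5]]
x = degroot(W, [1.0, 0.0, 0.5])
# For a strongly connected, aperiodic W, opinions converge to consensus.
```

The continuum model replaces the matrix W with a DiKernel and the vector of opinions with a function of the agent index, with the discrete dynamics above recovered as a discretisation.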
Felipe Garrido Lucero
DU-Shapley: A Shapley Value Proxy for Efficient Dataset Valuation
We consider the dataset valuation problem, that is, the problem of quantifying the incremental gain, with respect to some relevant pre-defined utility of a machine learning task, of aggregating an individual dataset with others. The Shapley value is a natural tool for dataset valuation due to its formal axiomatic justification, and it can be combined with Monte Carlo integration to overcome the computational intractability of exact evaluation. Such generic approximation methods, however, remain expensive in some cases. In this paper, we exploit the structure of the dataset valuation problem to devise more efficient Shapley value estimators. We propose a novel approximation, referred to as discrete uniform Shapley, which is expressed as an expectation under a discrete uniform distribution with support of reasonable size. We justify the relevance of the proposed framework via asymptotic and non-asymptotic theoretical guarantees and illustrate its benefits through an extensive set of numerical experiments.
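The generic Monte Carlo baseline that structured proxies such as DU-Shapley aim to speed up is permutation sampling; a minimal sketch with a toy stand-in utility (diminishing returns in total dataset size) is:

```python
import random

# Monte Carlo (permutation-sampling) estimator of Shapley values.
# `utility` maps a set of dataset indices to a real-valued task utility;
# the concrete utility below is a toy stand-in, not a real ML metric.
def mc_shapley(n, utility, num_perms=2000, seed=0):
    rng = random.Random(seed)
    phi = [0.0] * n
    order = list(range(n))
    for _ in range(num_perms):
        rng.shuffle(order)
        coalition, prev = set(), 0.0
        for i in order:
            coalition.add(i)
            cur = utility(coalition)
            phi[i] += cur - prev   # marginal contribution of dataset i
            prev = cur
    return [v / num_perms for v in phi]

sizes = [10, 20, 30]               # toy dataset sizes
util = lambda S: sum(sizes[i] for i in S) ** 0.5
shapley_values = mc_shapley(3, util)
```

Each permutation requires n utility evaluations, which is exactly what makes this baseline expensive when evaluating the utility means retraining a model; the telescoping sum also makes the estimator satisfy the efficiency axiom exactly.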
Atulya Jain
Calibrated Forecasting and Persuasion
How should an expert send forecasts to maximize her utility subject to passing a calibration test? We consider a dynamic game where an expert sends probabilistic forecasts to a decision maker. The decision maker uses a calibration test based on past outcomes to verify the expert’s forecasts. We characterize the optimal forecasting strategy by reducing the dynamic game to a static persuasion problem. A distribution of forecasts is implementable by a calibrated strategy if and only if it is a mean-preserving contraction of the distribution of conditionals (honest forecasts). We characterize the value of information by comparing what an informed and uninformed expert can attain. Moreover, we consider a decision maker who uses regret minimization, instead of the calibration test, to take actions. We show that the expert can always achieve the same payoff against a regret minimizer as under the calibration test, and in some instances, she can achieve strictly more. (joint work with Vianney Perchet)
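The style of test the decision maker runs can be sketched as a binned calibration check; the bin count and the i.i.d. example are illustrative choices, not the paper's exact test:

```python
import random

# Binned calibration error: group rounds by forecast, then compare each
# bin's empirical outcome frequency to its average forecast, weighting
# bins by how often they occur. Small error = (approximately) calibrated.
def calibration_error(forecasts, outcomes, bins=10):
    totals = [0] * bins
    hits = [0] * bins
    fsum = [0.0] * bins
    for f, o in zip(forecasts, outcomes):
        b = min(int(f * bins), bins - 1)
        totals[b] += 1
        hits[b] += o
        fsum[b] += f
    n = len(forecasts)
    return sum(
        (totals[b] / n) * abs(hits[b] / totals[b] - fsum[b] / totals[b])
        for b in range(bins) if totals[b]
    )

rng = random.Random(0)
outcomes = [1 if rng.random() < 0.3 else 0 for _ in range(10000)]
honest = [0.3] * 10000          # the honest (conditional) forecast
err = calibration_error(honest, outcomes)   # small, up to sampling noise
```

Honest forecasts pass such a test by construction; the point of the talk is which other (garbled) forecast distributions an expert can pass with, namely the mean-preserving contractions of the honest ones.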
Bar Light
Budget Pacing in Repeated Auctions: Regret and Efficiency without Convergence
We study the aggregate welfare and individual regret guarantees of dynamic pacing algorithms in the context of repeated auctions with budgets. Such algorithms are commonly used as bidding agents in Internet advertising platforms, adaptively learning to shade bids by a tunable linear multiplier in order to match a specified budget. We show that when agents simultaneously apply a natural form of gradient-based pacing, the liquid welfare obtained over the course of the learning dynamics is at least half the optimal expected liquid welfare obtainable by any allocation rule. Crucially, this result holds without requiring convergence of the dynamics, allowing us to circumvent known complexity-theoretic obstacles to finding equilibria. This result is also robust to the correlation structure between agent valuations and holds for any core auction, a broad class of auctions that includes first-price, second-price, and generalized second-price auctions as special cases. For individual guarantees, we further show that such pacing algorithms enjoy dynamic regret bounds for individual value maximization, with respect to the sequence of budget-pacing bids, for any auction satisfying a monotone bang-for-buck property. To complement our theoretical findings, we provide semi-synthetic numerical simulations based on auction data from the Bing Advertising platform.
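The flavor of multiplicative, gradient-style pacing can be sketched for a single bidder in a repeated second-price auction; the update rule, step size, and auction format below are illustrative, not the exact dynamics or auction class analysed in the talk:

```python
# One bidder paces its spend toward budget / T per round by shading its
# value with a multiplier mu, updated by a clipped gradient step on the
# budget constraint. Illustrative sketch only.
def paced_spend(values, prices, budget, eta=0.05, mu_max=10.0):
    T = len(values)
    target = budget / T           # per-round spend target
    mu, spend = 0.0, 0.0
    for v, p in zip(values, prices):
        bid = v / (1.0 + mu)      # shade the value by the multiplier
        cost = p if bid >= p else 0.0   # second-price payment on a win
        spend += cost
        # Raise mu when spending above target, lower it when below.
        mu = min(max(mu + eta * (cost - target), 0.0), mu_max)
    return spend

T = 100
spend = paced_spend([1.0] * T, [0.5] * T, budget=10.0)
# Unpaced bidding would win every round and spend 50; pacing spends less.
```

Note that the multiplier need not converge; the welfare guarantee in the talk holds along the trajectory of the dynamics themselves.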
David Lurie
Asymptotic Value and Decidability in Ergodic Blind Stochastic Games
We study blind stochastic games, where the asymptotic value is generally not guaranteed to exist. To address this issue, we introduce the subclass of ergodic blind stochastic games, for which we prove that the asymptotic value exists. We develop an algorithm to approximate the asymptotic value by constructing a finite-state stochastic game through the aggregation of similar belief states. Furthermore, we show that this class is large enough that no algorithm can exist to compute the value exactly. We also consider stochastic games with asymmetric information and prove that our approximation scheme works under the same ergodic assumption, although the exact value computation remains infeasible in this context. Finally, we discuss the challenges in extending our method to Partially Observable Stochastic Games. (Joint work with Krishnendu Chatterjee, Raimundo Saona, and Bruno Ziliotto.)
Simon Mauras
Mechanism Design with Interdependent Values
The interdependent values model has gained a lot of recent attention in mechanism design, as it provides a tractable yet challenging theoretical framework to study strategic settings beyond the standard assumption of independent private values. Interdependence is well studied in economics, where the existence of efficient (optimal) mechanisms requires strong assumptions. In response, works in computer science have tackled the resulting impossibilities through the lens of approximation. I plan to present recent advances taking an algorithmic approach, relaxing assumptions and achieving approximately optimal welfare.
Matteo Quattropani
Existence and Learning of Equilibria in Two-Player Random Games
Since the pioneering work of Nash, mathematical game theory has mostly focused on existence results concerning several notions of equilibrium, attempting to categorize games into classes for which such equilibria do or do not exist. Less interest has been directed toward the “average-case approach,” which shifts the focus to the frequency with which equilibria exist and their “typical properties.” In this spirit, random games—where the payoffs are sampled at random—constitute a natural starting point for analysis. In this talk, I will focus on normal form two-player games, where each player has K strategies, and I will be interested in the asymptotic regime in which K grows. In this context, I will recall both old and more recent results about the existence and learning of pure Nash equilibria in such random games, with particular emphasis on games that admit an ordinal potential structure.
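One of the classical facts recalled in the talk admits a quick empirical illustration: in a K x K game with i.i.d. continuous payoffs, the number of pure Nash equilibria is approximately Poisson(1), so at least one exists with probability close to 1 - 1/e ≈ 0.632. The sizes and trial counts below are arbitrary:

```python
import random

# A cell (i, j) is a pure Nash equilibrium iff A[i][j] is a best response
# in its column (row player) and B[i][j] in its row (column player).
def has_pure_nash(A, B):
    K = len(A)
    col_max = [max(A[i][j] for i in range(K)) for j in range(K)]
    row_max = [max(B[i][j] for j in range(K)) for i in range(K)]
    return any(
        A[i][j] == col_max[j] and B[i][j] == row_max[i]
        for i in range(K) for j in range(K)
    )

def existence_frequency(K=10, trials=2000, seed=0):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        A = [[rng.random() for _ in range(K)] for _ in range(K)]
        B = [[rng.random() for _ in range(K)] for _ in range(K)]
        hits += has_pure_nash(A, B)
    return hits / trials

freq = existence_frequency()    # empirically close to 1 - 1/e
```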
Rebecca Reiffenhäuser
Single-Sample Prophet Inequalities
Many applications require assigning resources in an online fashion: decisions on incoming bids have to be made immediately, and before seeing all n bids in the sequence. A fundamental hardness inherent to this setting means that no online strategy can perform better than one that simply picks a random winner for each good, so additional assumptions are needed. Famously, in prophet inequalities, one assumes access to all n (independent) distributions from which the participants' bids will be drawn, achieving competitive guarantees of up to half the expected offline optimum. Recently, considerable interest has been drawn to the fact that full prior distributional knowledge is in general not actually necessary to circumvent the impossibility. Instead, one can often achieve similar (or even equally good) approximations with access to just a few samples from each distribution. This data-driven approach poses a clear advantage in practical settings, where full distributions are rarely available (but one might, for example, know the bids the same participant placed in a few previous auctions). The talk presents selected results on the special case when the algorithm is guided by just one single sample, drawn from the product distribution, as prior information.
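A canonical single-sample rule for the one-item setting can be sketched as follows: observe one sample from each bidder's distribution, set the threshold to the maximum sample, and accept the first online bid that beats it. This style of rule is known to be 1/2-competitive against the expected offline maximum; the uniform distributions below are merely illustrative:

```python
import random

def single_sample_rule(samples, bids):
    threshold = max(samples)    # max of one sample per bidder
    for b in bids:
        if b > threshold:
            return b            # accept this bid and stop
    return 0.0                  # no bid cleared the threshold

rng = random.Random(1)
n, trials = 5, 20000
alg = opt = 0.0
for _ in range(trials):
    samples = [rng.random() for _ in range(n)]  # prior information
    bids = [rng.random() for _ in range(n)]     # realized online bids
    alg += single_sample_rule(samples, bids)
    opt += max(bids)                            # offline (prophet) optimum
alg /= trials
opt /= trials
```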
Raimundo Saona
POMDPs and Blind MDPs: (Dis)continuity of Values and Strategies
We consider the continuity properties of the classic dynamical stochastic model of partially-observable Markov decision processes (POMDPs) and the important subclass of blind Markov decision processes (blind MDPs). We study the standard notion of value-continuity and introduce two notions of strategy-continuity. Weak (respectively, strong) strategy-continuity requires that some (respectively, all) approximately optimal strategies remain approximately optimal under small perturbations of the dynamics. The classic model of fully-observable MDPs is known to satisfy value-continuity and weak strategy-continuity. We show that POMDPs satisfy neither value-continuity, nor weak strategy-continuity, nor strong strategy-continuity. In contrast, we show that blind MDPs satisfy all three continuity properties. Moreover, we show that fully-observable MDPs do not satisfy strong strategy-continuity. Finally, we show that the problem of checking whether a given POMDP satisfies some of the continuity properties is algorithmically impossible because the problem is undecidable.
Flore Sentenac
Balancing Optimism and Pessimism in Offline-to-Online Learning
We consider what we call the offline-to-online learning setting, focusing on stochastic finite-armed bandit problems. In offline-to-online learning problems, a learner starts with offline data collected from interaction with an unknown environment in a way that is not under the learner's control. Given this data, the learner then starts interacting with the environment, gradually improving its initial strategy as it collects more data so as to maximize the total reward it receives from the environment. The basic dilemma of the learner in this setting is as follows: if the problem were purely offline (the learner is given data that it can use to design a policy, which then remains fixed), the best strategy (in a number of senses) would be to use the Lower Confidence Bound (LCB) algorithm. Among other things, LCB can simultaneously compete with any policy that is sufficiently “covered” by the offline data. However, if the problem were purely online, the best strategy (in a number of senses) would be to use the Upper Confidence Bound (UCB) algorithm, initialized with the offline data. Indeed, as time goes by, the UCB algorithm will match the performance of the optimal policy at a speed that is nearly the best possible among all online algorithms. Initially, however, UCB will explore too much, and as such, its performance will be inferior to that of LCB at least for some time. This suggests that the learner should use LCB initially and then gradually change its strategy to resemble UCB. Just how this should be done (and why) is the subject of this article. In particular, our main result shows that our new algorithm performs nearly as well as the better of LCB and UCB at any point in time. The main idea underlying the design of the new algorithm is general, and we expect that our results can be generalized beyond the multi-armed bandit setting we consider here.
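The two index rules the dilemma contrasts can be sketched from offline counts and empirical means per arm; the bonus form is the standard sqrt(2 log t / n) and is illustrative, as is the example data:

```python
import math

# LCB acts on a pessimistic index (exploit what the data covers well);
# UCB acts on an optimistic one (explore what it does not).
def _bonus(counts, t):
    return [math.sqrt(2.0 * math.log(t) / max(n, 1)) for n in counts]

def lcb_arm(means, counts, t):
    scores = [m - b for m, b in zip(means, _bonus(counts, t))]
    return max(range(len(scores)), key=scores.__getitem__)

def ucb_arm(means, counts, t):
    scores = [m + b for m, b in zip(means, _bonus(counts, t))]
    return max(range(len(scores)), key=scores.__getitem__)

# Offline data: arm 0 is well covered and looks good; arm 1 is barely
# covered. LCB keeps playing arm 0, while UCB would first try arm 1.
means, counts, t = [0.7, 0.5], [500, 2], 502
```

The article's algorithm interpolates between these two behaviors over time, rather than committing to either index throughout.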
Laura Vargas Koch
Nash Flows over Time
In traffic planning, traffic simulations are widely used. However, good traffic simulations require good mathematical models as a foundation, and one such model is Nash flows over time. Unlike classical static models, the distinctive feature of Nash flows over time is that temporal changes can be captured: a flow moves through the network over time, and the situation in the network can change dynamically. In this presentation, we will introduce the model of Nash flows over time, discuss its close connection to the traffic simulation software MATSim, and see which properties of the equilibria are already understood and which questions are still open.