Marianne Akian
Title: Solving irreducible stochastic mean-payoff games and entropy games by relative Krasnoselskii-Mann iteration
Abstract: We analyse an algorithm for solving stochastic mean-payoff games that combines the ideas of relative value iteration and of Krasnoselskii--Mann damping. We derive parameterized complexity bounds for several classes of games, including turn-based or concurrent games satisfying irreducibility or ergodicity conditions. These bounds improve those of Chatterjee and Ibsen-Jensen (ICALP 2014), under the same ergodicity condition, and that of Allamigeon, Gaubert, Katz and Skomra (ICALP 2022) in the particular case of turn-based games. We also establish parameterized complexity bounds for entropy games, a class of matrix multiplication games introduced by Asarin, Cervelle, Degorre, Dima, Horn and Kozyakin (2016). We derive all these results by methods of variational analysis, establishing contraction properties of the relative Krasnoselskii--Mann iteration with respect to Hilbert's semi-norm.
This is joint work with Stéphane Gaubert, Ulysse Naepels, and Basile Terver.
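As a toy illustration (our sketch, not the authors' code: the two-state Shapley operator below is invented for the example), the relative Krasnoselskii--Mann iteration damps the fixed-point map and re-centres the iterate, so that the drift T(x) - x converges to the mean payoff:

```python
import numpy as np

# Toy sketch of a relative Krasnoselskii-Mann iteration (not the
# authors' code).  T is the Shapley operator of a made-up two-state
# turn-based mean-payoff game: state 0 belongs to Max, state 1 to Min.

def shapley_operator(x):
    return np.array([
        max(1.0 + x[1], 0.5 + x[0]),   # Max picks the better action
        min(0.0 + x[0], 2.0 + x[1]),   # Min picks the cheaper action
    ])

def relative_km(T, x0, alpha=0.5, iters=200):
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        y = (1 - alpha) * x + alpha * T(x)   # Krasnoselskii-Mann damping
        x = y - y[0]                         # re-centre: the "relative" step
    return x, T(x) - x                       # drift approximates the mean payoff

x, g = relative_km(shapley_operator, [0.0, 0.0])
```

On this toy instance both coordinates of the drift converge to the same constant, the mean payoff of the game, so the span of the drift (Hilbert's semi-norm) vanishes.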
Galit Ashkenazi-Golan
Title: Policy Gradient Methods in Repeated Games (Part Two)
Abstract: See Edward Plumb.
Maurizio D'Andrea
Title: Playing against no-regret players
Abstract: In increasingly many contexts, a human player has to interact with artificial players who make decisions by following decision-making algorithms. How should the human player play against these algorithms to maximize his utility? Does anything change if he faces one artificial player or several? The main goal of the paper is to answer these two questions. We consider n-player games in normal form repeated over time, where we call the human player the optimizer and the (n - 1) artificial players the learners. We assume that the learners play no-regret algorithms, and we consider the concept of Stackelberg equilibrium. In a 2-player game, the optimizer can always guarantee an expected average utility of at least the Stackelberg value per round. However, we show with counterexamples that this result does not hold if the optimizer has to face more than one learner. We therefore generalize the definition of Stackelberg equilibrium by introducing the concept of correlated Stackelberg equilibrium. Finally, in the main result, we prove that the optimizer can guarantee at least the correlated Stackelberg value per round. Moreover, using a version of the strong law of large numbers, we show that our result also holds almost surely for the optimizer's realized utility, not only for its expected utility.
Janos Flesch
Title: The Martin function and its applications to stochastic games (Part 1)
Abstract: We consider multi-player stochastic games with finite action spaces, countable state spaces, and general payoffs: the payoff of each player is a bounded and Borel-measurable function of the infinite play. In the two-player zero-sum setting, Martin (1998), see also Maitra and Sudderth (1998), showed a powerful result: to each history one can associate a certain auxiliary one-shot game, with the same action spaces as in the stochastic game, such that a player can play well in the stochastic game by playing well locally in the one-shot game at each history. These auxiliary one-shot games are induced by a single function assigning a real number to each history, which we term the Martin function.
In the first part of the talk, we show an extension of the Martin function to multi-player stochastic games, both to the maxmin values and to the minmax values of the players. Our extension also satisfies another desirable property: the Martin function never assigns numbers to histories that are much lower than the maxmin values or the minmax values, respectively.
In the second part of the talk, by applying this extension of the Martin function, we derive several existence results in multi-player stochastic games with general payoffs. In particular, time permitting, we show that for every ε>0:
(i) the minmax value is regular: for every Borel winning set W there is a closed subset C such that player i's minmax value when the winning set is C is at least her minmax value when the winning set is W, minus ε,
(ii) if each player has a Borel winning set, there is only one state, and the sum of the minmax values over the players exceeds the number of players minus 1, then an ε-equilibrium exists,
(iii) if the minmax values of the players are history-independent and there is only one state, then an ε-equilibrium exists,
(iv) there is a subgame in which an ε-equilibrium exists, and
(v) if there are only two players, the state space is finite, and the payoff functions are shift-invariant, then an ε-equilibrium exists.
Stephane Gaubert
Title: Game of escape rate
Abstract: We consider a new class of repeated zero-sum games in which the payoff of one player is the escape rate of a dynamical system evolving according to a nonexpansive nonlinear operator that depends on the actions of both players. Considering order-preserving finite-dimensional linear operators over the positive cone endowed with Hilbert's projective (hemi-)metric, we recover the matrix multiplication games introduced by Asarin et al., which generalize the joint spectral radius of sets of nonnegative matrices and arise in some population dynamics problems (growth maximization and minimization). We establish a two-player version of Mañé's lemma, characterizing the value of the game in terms of a nonlinear eigenproblem. This generalizes to the two-player case the characterization of joint spectral radii in terms of extremal norms. It also allows us to show the existence of optimal strategies for both players.
This is joint work with Marianne Akian and Loic Marchesini.
Hugo Gimbert
Title: Population Game
Abstract: TBA
Rasmus Ibsen-Jensen
Title: Algorithms for partial-information, multi-player stochastic mean-payoff games with bounded memory
Abstract: Partial-information, multi-player stochastic mean-payoff games are such that most general questions about them can be answered negatively: they do not have near-equilibria, and finding equilibria, when they exist, might not be possible in a finite amount of time (the problem is undecidable), the latter even in the one-player case. However, if we restrict players to play with some given, bounded amount of memory (but arbitrarily small probabilities), the question becomes feasible. Still, some such games, even when restricted to no memory, require doubly exponentially small numbers in the size of the game for epsilon-Nash equilibria, for sufficiently small epsilon > 0. We give an algorithm for the problem in the complexity class FNP^NP: in a nutshell, we show that we can encode any good strategy profile succinctly (with a number of digits polynomial in the size of the game) and still find the outcome of the strategy profile for the players (up to epsilon) in polynomial time. The result implies that one can find (real, i.e., without restricting the players to bounded memory) epsilon-NE in quitting games and stay-in-a-set games, and approximate the value in 2-player, zero-sum stochastic mean-payoff games, with the same complexity. For the former two, this is the first algorithm for these problems; for the latter, it is currently the most efficient one.
This is based on joint work with Patrick Totzke and Sougata Bose.
Rida Laraki
Title: Online Learning in Identical-Interest and Zero-Sum Discounted Stochastic Games
Abstract: This presentation combines two recent articles in which we extend a popular decentralized discrete-time learning procedure called fictitious play (Brown, 1951; Robinson, 1951) to discounted stochastic games (Shapley, 1953). We prove the convergence of our online learning algorithms to the set of stationary Nash equilibria in identical-interest and zero-sum discounted stochastic games.
Joint work with Luca Baudin.
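As background, the one-state (matrix-game) version of the procedure is easy to state: each player best-responds to the opponent's empirical distribution of past play. A minimal sketch (our toy, classic Brown-Robinson fictitious play in matching pennies rather than the stochastic-game extension of the talk; initial counts and tie-breaking are arbitrary choices):

```python
import numpy as np

# Classic fictitious play in zero-sum matching pennies (toy background,
# not the talk's algorithm).  Each player best-responds to the
# opponent's empirical frequencies; by Robinson's theorem, the
# frequencies converge to the mixed equilibrium (1/2, 1/2).

A = np.array([[1.0, -1.0], [-1.0, 1.0]])   # row maximizes, column minimizes

n1 = np.array([1.0, 0.0])   # counts of the row player's past actions
n2 = np.array([1.0, 0.0])   # counts of the column player's past actions
for _ in range(20000):
    br1 = np.argmax(A @ (n2 / n2.sum()))   # row's best response
    br2 = np.argmin((n1 / n1.sum()) @ A)   # column's best response
    n1[br1] += 1
    n2[br2] += 1

freq1, freq2 = n1 / n1.sum(), n2 / n2.sum()
```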
David Lurie
Title: Decidability Analysis in a New Class of Unobservable MDPs Satisfying Ergodic Properties
Abstract: We introduce Unobservable Markov Decision Processes (UMDPs), emphasizing their relationship with decision problems and decidability. We then explore a novel theoretical class of UMDPs that leverages the property of weak ergodicity. We conclude by identifying necessary conditions for UMDPs to exhibit this characteristic, bridging the gap between this theoretical result and its potential practical implementation.
Richard Mayr
Title: Strategy Complexity of the Transience Objective in 2-Player Stochastic Games
Abstract: We consider 2-player stochastic games on countably infinite graphs where the players have finite action sets. A play is transient if it does not visit any state infinitely often. The players, Max and Min, seek to maximize (resp. minimize) the probability of transient plays. In general, neither player has optimal strategies. However, since the objective is Borel, the game is determined and epsilon-optimal strategies exist. We study the strategy complexity, i.e., how much memory is required for epsilon-optimal strategies by either player. Max does not require any memory, while Min needs infinite memory (beyond a clock).
Abraham Neyman
Title: Stochastic games with limited memory space
Abstract: We study the size and type of memory resources that are needed for playing an $\varepsilon$-optimal strategy in two-person zero-sum stochastic games. It is shown that (a) uniform $\varepsilon$-optimal strategies that use $O(\log n)$ public memories in the first $n$ stages of the game and do not use the clock exist, and (b) any strategy in the Big Match that uses the clock and a finite public memory is worthless. We also review previous results and raise several open problems.
Joint work with Kristoffer Arnsfelt Hansen and Rasmus Ibsen-Jensen.
Ivan Novikov
Title: Zero-Sum Stochastic Games with Vanishing Stage Duration and Public Signals
Abstract: We consider the behaviour of λ-discounted zero-sum stochastic games as the discount factor λ tends to 0 (that is, the players are more and more patient), in the context of games with stage duration. In stochastic games with stage duration h, players act at times 0, h, 2h, ..., the payoff and the leaving probabilities are proportional to h. When h tends to 0, this approximates games in continuous time. The asymptotic behavior of the values (when both λ and h tend to 0) was already studied in the case of stochastic games with perfect observation of the state. We consider the same question for the case of stochastic games with imperfect observation of the state. In such games, players are given a public signal that depends only on the current state.
We present an example of a stochastic game with public signals, in which there is no limit value (as the discount factor λ goes to 0) if stage duration is 1, and there is a limit value for the same game when stage duration h and discount factor λ both tend to 0. Informally speaking, it means that the limit value in discrete time does not exist, but the limit value in continuous time (i.e., when h tends to 0) exists. Such a situation is impossible in the case of stochastic games with perfect observation of the state.
Miquel Oliu-Barton
Title: TBA
Abstract: TBA
Edward Plumb
Title: Policy Gradient Methods in Repeated Games (Part One)
Abstract: Policy gradient methods have emerged as powerful tools for optimising complex decision-making processes. These methods utilise gradient ascent to iteratively enhance policies based on observed rewards. While they frequently achieve local convergence, global convergence guarantees remain elusive outside of specific classes of games. We illustrate this phenomenon through examples, illuminating the challenges surrounding global optimisation.
Following that, we explore the utilisation of policy gradient methods in repeated games and define the set of policies that are learnable using these methods. Lastly, we provide a preview of our follow-up presentation, wherein we will present a Folk theorem result for learning in repeated games.
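A minimal sketch of the basic mechanism (our illustration, not the talk's examples: a single state, two actions, rewards chosen arbitrarily as r = (1, 0), softmax parameterization, exact gradients):

```python
import numpy as np

# Exact softmax policy gradient on a one-state, two-action problem
# (toy illustration, not the talk's examples).  Gradient ascent on the
# expected reward J(theta) = softmax(theta) . r pushes the policy
# toward the better action 0.

r = np.array([1.0, 0.0])

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

theta = np.zeros(2)
for _ in range(2000):
    p = softmax(theta)
    grad = p * (r - p @ r)   # exact gradient: (diag(p) - p p^T) r
    theta += 0.5 * grad      # gradient ascent step

p = softmax(theta)
```

Note that the gradient vanishes as the policy approaches a pure strategy, so progress slows near the boundary; this kind of local behaviour is one reason global guarantees are delicate outside special classes of games.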
Jérôme Renault
Title: Games played by exponential weights algorithms
Abstract: Exponential weights updates are probably the most widely used building block in machine learning algorithms for decision-making. We consider a repeated interaction where each player uses an exponential weights algorithm characterized by an initial mixed action and an update parameter. The mixed action profile $p^t$ played at stage $t$ then follows a homogeneous Markov chain. In a first part, we show that the probability under $p^t$ of playing a strict Nash equilibrium is a stochastic process converging almost surely to 0 or 1. In a second part, we show that any possible limit of $(p^t)$ has to be a Nash equilibrium with equalizing payoffs. In a third part, we show that in strong coordination games, where the payoff of a player is positive on the diagonal and 0 elsewhere, almost surely $(p^t)$ converges to one of the strict Nash equilibria. We conclude with open questions.
This is joint work with Maurizio D'Andrea and Fabien Gensbittel.
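A minimal sketch of the dynamics in the strong coordination case (our toy, not the paper's setting: both players use the same rate eta = 0.1, and we update on expected payoffs rather than on sampled actions, so the run is deterministic):

```python
import numpy as np

# Deterministic exponential-weights toy (our simplification, not the
# paper's stochastic process).  Both players play a 2x2 strong
# coordination game with payoff 1 on the diagonal and 0 elsewhere;
# each mixed action is proportional to exp(eta * cumulative payoff).

A = np.array([[1.0, 0.0], [0.0, 1.0]])   # common payoff matrix

def exp_weights(eta=0.1, iters=500, p0=(0.6, 0.4), q0=(0.55, 0.45)):
    p, q = np.array(p0), np.array(q0)
    Sp, Sq = np.zeros(2), np.zeros(2)    # cumulative payoffs per pure action
    for _ in range(iters):
        Sp += A @ q                      # payoff of each row against q
        Sq += A.T @ p                    # payoff of each column against p
        p = np.exp(eta * Sp); p /= p.sum()
        q = np.exp(eta * Sq); q /= q.sum()
    return p, q

p, q = exp_weights()
```

Starting from mixed actions slightly biased toward the first action, the profile locks into the corresponding strict Nash equilibrium.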
Raimundo Saona
Title: Marginal value in Stochastic Games or POMDPs
Abstract: Zero-sum stochastic games are parameterized by payoffs, transitions, and possibly a discount rate. We study how the main solution concepts, the discounted and undiscounted values, vary when these parameters are perturbed.
We focus on the marginal values, introduced by Mills in 1956 in the context of matrix games, that is, the directional derivatives of the value along any fixed perturbation.
Our contributions are formulas for:
(1) the marginal value of a discounted stochastic game;
(2) (under mild assumptions) the limit of the marginal value as the discount rate vanishes; and
(3) (under mild assumptions) the marginal values of an undiscounted stochastic game.
Note that the last two differ in general.
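For orientation, in the matrix-game case the marginal value has the classical closed form due to Mills (1956): if $X^*$ and $Y^*$ denote the sets of optimal strategies of the game with payoff matrix $A$, and the payoffs are perturbed in a direction $D$, then

```latex
\lim_{t \to 0^{+}} \frac{\operatorname{val}(A + tD) - \operatorname{val}(A)}{t}
  = \max_{x \in X^{*}} \min_{y \in Y^{*}} x^{\top} D y .
```

The formulas of the talk can be read as extensions of this matrix-game expression to the discounted and undiscounted values of stochastic games.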
Sylvain Sorin
Title: Topics on stochastic games with vanishing stage duration
Abstract: We consider a collection of stochastic games corresponding to finer and finer time discretizations of a continuous time process.
Eilon Solan
Title: The Martin function and its applications to stochastic games (Part 2)
Abstract: See Janos Flesch.
Nicolas Vieille
Title: A stochastic game motivated by an optimization problem
Abstract: TBA
Guillaume Vigeral
Title: Stochastic games with intermittent observation
Abstract: We consider asymptotics of finite zero-sum stochastic games as players grow more and more patient. Bewley and Kohlberg established in 1976 that the discounted value converges when players fully observe the state; in 2016, Ziliotto constructed an example in which the state is not observed at all and the value oscillates. We investigate the middle ground, in which the state is publicly observed at some stages, chosen either deterministically or randomly.
Joint work with Bruno Ziliotto.
Yijun Wan
Title: Blackwell Approachability Applied to Absorbing Games
Abstract: Blackwell approachability, introduced in the 1950s, is a generalization of von Neumann's minmax theorem to vector payoffs. An absorbing game is a two-player zero-sum repeated game with some entries such that, once such an entry is played, with positive probability all future payoffs are equal to the payoff of that entry. We first extend Blackwell approachability to vector spaces with time-dependent bilinear forms. Then we show how this framework can be applied to construct an epsilon-optimal strategy in absorbing games. This is joint work with Joon Kwon and Bruno Ziliotto.