Principal-Agent Decision-Making Processes
Tutorial @ The 20th Conference on Web and Internet Economics (WINE 2024)
Lecture Theatre G.03, University of Edinburgh
9:30am - 12:00pm GMT, Monday December 2, 2024
The principal-agent problem is a well-established model in microeconomics that addresses challenges arising from misaligned interests and asymmetric information in economic interactions. Over time, the model has evolved along with the increasing complexity of modern decision-making systems. There is growing interest in developing unified theories and computational techniques for these problems, with a focus on incentive structures, information asymmetry, and sophisticated real-world applications involving unknown or dynamic environments. This tutorial will provide both foundational and cutting-edge insights into the principal-agent problem, covering basic models, computational methods for optimization and learning, and practical applications in the areas of contract design, information design, and multi-agent reinforcement learning.
Outline
Basic Models and Solution Concepts
We begin with several toy examples of the principal-agent problem, based on real-world applications in contract design [15, 20, 10], information design [18, 19, 7], and Stackelberg games [23, 22, 9]. Through these examples, we provide the audience with basic intuition about and understanding of the principal-agent problem. We then introduce a model that formalises the aforementioned problems, as well as a broader range of applications (e.g., [1, 4, 8, 26]), under a unified framework. The model is based on key works in the literature [21, 12]. We will also discuss extensions of the model to sequential settings, including Markov persuasion processes [14, 25, 17, 5, 2], stochastic principal-agent games [13], and sequential contract design [6, 24, 16]. Commonly used strategy types and equilibrium concepts for sequential games will be introduced. Real-world applications in digital platforms and online marketplaces will be highlighted during the discussions.
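To make the contract-design flavour of these examples concrete, the following sketch solves a tiny hidden-action instance in the spirit of the contract-design references above: for each agent action, a linear program finds the cheapest limited-liability contract incentivizing that action, and the principal keeps the best option. The two-action, two-outcome instance data and the use of numpy/scipy are illustrative assumptions only, not material from the cited works.

    # Toy hidden-action contract design: one LP per incentivized action.
    import numpy as np
    from scipy.optimize import linprog

    F = np.array([[0.9, 0.1],      # outcome distribution of action 0 (low effort)
                  [0.4, 0.6]])     # outcome distribution of action 1 (high effort)
    cost = np.array([0.0, 0.2])    # agent's cost for each action
    reward = np.array([0.0, 1.0])  # principal's reward for each outcome

    best = None
    for a in range(len(cost)):
        # Minimize the expected payment under action a over contracts t >= 0.
        c = F[a]
        A_ub, b_ub = [], []
        for a2 in range(len(cost)):
            if a2 == a:
                continue
            # Incentive compatibility: the agent weakly prefers a to a2.
            A_ub.append(F[a2] - F[a])
            b_ub.append(cost[a2] - cost[a])
        # Individual rationality: expected payment covers the cost of a.
        A_ub.append(-F[a])
        b_ub.append(-cost[a])
        res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                      bounds=[(0, None)] * F.shape[1], method="highs")
        if res.success:
            utility = F[a] @ reward - res.fun   # principal's expected utility
            if best is None or utility > best[0]:
                best = (utility, a, res.x)

    print("incentivized action:", best[1])
    print("contract:", np.round(best[2], 3), "principal utility:", round(best[0], 3))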
Equilibrium Computation
We introduce the basic computational and algorithmic results for the principal-agent problem under the unified framework. We will zoom in on the computational trade-offs between different design choices with respect to various factors, such as agent behavior models, optimization horizon, strategy types, and solution concepts. We present the hardness results that some of these design choices lead to, as well as tractable cases for which we introduce efficient algorithms for computing equilibria. Techniques such as dynamic programming and value polytope construction will be introduced, in connection with recent work in the literature [14, 5, 16, 24, 13].
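As a minimal illustration of equilibrium computation, the sketch below implements the classical multiple-LP approach for computing an optimal strategy to commit to in a one-shot two-player game [9], rather than the sequential techniques discussed in this part: one LP per candidate agent response, keeping the response that yields the principal the highest value. The payoff matrices are made-up toy data, and numpy/scipy are assumed to be available.

    # Optimal commitment (Stackelberg) in a 2x2 normal-form game via multiple LPs.
    import numpy as np
    from scipy.optimize import linprog

    A = np.array([[1.0, 3.0],   # principal's payoffs (rows: principal actions)
                  [0.0, 2.0]])
    B = np.array([[1.0, 0.0],   # agent's payoffs (columns: agent actions)
                  [0.5, 1.0]])
    n, m = A.shape

    best = None
    for j in range(m):
        # Maximize the principal's expected payoff x @ A[:, j] ...
        c = -A[:, j]                   # linprog minimizes, so negate
        # ... subject to j being a best response: x @ B[:, j] >= x @ B[:, j'] for all j'.
        A_ub = np.array([B[:, j2] - B[:, j] for j2 in range(m) if j2 != j])
        b_ub = np.zeros(len(A_ub))
        A_eq, b_eq = np.ones((1, n)), np.array([1.0])   # x is a probability distribution
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                      bounds=[(0, None)] * n, method="highs")
        if res.success and (best is None or -res.fun > best[0]):
            best = (-res.fun, j, res.x)

    print("commitment:", np.round(best[2], 3))
    print("agent's response:", best[1], "principal's value:", round(best[0], 3))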
Principal-Agent Reinforcement Learning
The final part of the tutorial will move from the offline to the online setting. We consider the online reinforcement learning setting where the environment is not initially known (or is only partially known) and the principal needs to learn about it while interacting with the agent. We will introduce a formalization of the setting and highlight the unique challenges it presents [3, 11]. In particular: how can incentive compatibility be ensured under uncertainty? How should the principal balance exploration and exploitation in the presence of misaligned incentives between the principal and the agent? We will present sample-efficient learning algorithms proposed in the recent literature [25, 5, 24, 13].
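The toy simulation below illustrates, under strong simplifying assumptions, the exploration-exploitation tension in this setting: the principal runs a standard UCB rule over a small hand-picked menu of contracts, while a simulated agent best-responds to whichever contract is posted using a model the principal cannot see. It is only a sketch of the problem structure, not one of the sample-efficient algorithms of [25, 5, 24, 13]; all numbers are made up.

    # Online principal-agent toy: UCB over a fixed menu of contracts.
    import numpy as np

    rng = np.random.default_rng(0)
    F = np.array([[0.9, 0.1], [0.4, 0.6]])   # true outcome distributions (unknown to the principal)
    cost = np.array([0.0, 0.2])              # agent's action costs (unknown to the principal)
    reward = np.array([0.0, 1.0])            # principal's reward per outcome
    menu = [np.array([0.0, 0.0]), np.array([0.0, 0.3]), np.array([0.0, 0.5])]

    def agent_best_response(t):
        # The agent picks the action maximizing expected payment minus cost.
        return int(np.argmax(F @ t - cost))

    T = 5000
    counts, means = np.zeros(len(menu)), np.zeros(len(menu))
    for step in range(1, T + 1):
        # UCB index; contracts that were never played get priority.
        bonus = np.sqrt(2 * np.log(step) / np.maximum(counts, 1))
        ucb = np.where(counts > 0, means + bonus, np.inf)
        k = int(np.argmax(ucb))
        a = agent_best_response(menu[k])
        outcome = rng.choice(len(reward), p=F[a])      # realized outcome
        utility = reward[outcome] - menu[k][outcome]   # principal's realized utility
        counts[k] += 1
        means[k] += (utility - means[k]) / counts[k]   # running average

    print("plays per contract:", counts.astype(int))
    print("estimated utilities:", np.round(means, 3))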
The Organizers
References
[1] Moshe Babaioff, Robert Kleinberg, and Renato Paes Leme. 2012. Optimal mechanisms for selling information. In Proceedings of the 13th ACM Conference on Electronic Commerce. 92–109.
[2] Francesco Bacchiocchi, Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi, and Nicola Gatti. 2024. Markov persuasion processes: Learning to persuade from scratch. arXiv preprint arXiv:2402.03077 (2024).
[3] Maria-Florina Balcan, Avrim Blum, Nika Haghtalab, and Ariel D Procaccia. 2015. Commitment without regrets: Online learning in Stackelberg security games. In Proceedings of the Sixteenth ACM Conference on Economics and Computation. 61–78.
[4] Dirk Bergemann, Alessandro Bonatti, and Alex Smolin. 2018. The design and price of information. American Economic Review 108, 1 (2018), 1–48.
[5] Martino Bernasconi, Matteo Castiglioni, Alberto Marchesi, and Mirco Mutti. 2024. Persuading farsighted receivers in MDPs: The power of honesty. Advances in Neural Information Processing Systems 36 (2024).
[6] Gianluca Brero, Alon Eden, Darshan Chakrabarti, Matthias Gerstgrasser, Vincent Li, and David C Parkes. 2022. Stackelberg POMDP: A Reinforcement Learning Approach for Economic Design. arXiv preprint arXiv:2210.03852 (2022).
[7] Matteo Castiglioni, Alberto Marchesi, and Nicola Gatti. 2022. Bayesian Persuasion Meets Mechanism Design: Going Beyond Intractability with Type Reporting. In Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems. 226–234.
[8] Yiling Chen, Haifeng Xu, and Shuran Zheng. 2020. Selling information through consulting. In Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM, 2412–2431.
[9] Vincent Conitzer and Tuomas Sandholm. 2006. Computing the optimal strategy to commit to. In Proceedings of the 7th ACM Conference on Electronic Commerce. 82–90.
[10] Paul Dutting, Tim Roughgarden, and Inbal Talgam-Cohen. 2021. The complexity of contracts. SIAM J. Comput. 50, 1 (2021), 211–254.
[11] Jiarui Gan, Minbiao Han, Jibang Wu, and Haifeng Xu. 2023. Robust Stackelberg Equilibria. In Proceedings of the 24th ACM Conference on Economics and Computation. 735–735.
[12] Jiarui Gan, Minbiao Han, Jibang Wu, and Haifeng Xu. 2024. Generalized Principal-Agency: Contracts, Information, Games and Beyond. arXiv:2209.01146 [cs.GT] https://arxiv.org/abs/2209.01146
[13] Jiarui Gan, Rupak Majumdar, Debmalya Mandal, and Goran Radanovic. 2024. Stochastic Principal-Agent Problems: Efficient Computation and Learning. arXiv:2306.03832 [cs.GT] https://arxiv.org/abs/2306.03832
[14] Jiarui Gan, Rupak Majumdar, Goran Radanovic, and Adish Singla. 2022. Bayesian persuasion in sequential decision-making. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 5025–5033.
[15] Sanford J Grossman and Oliver D Hart. 1992. An analysis of the principal-agent problem. In Foundations of Insurance Economics. Springer, 302–340.
[16] Dima Ivanov, Paul Dutting, Inbal Talgam-Cohen, Tonghan Wang, and David C Parkes. 2024. Principal-Agent Reinforcement Learning. arXiv preprint arXiv:2407.18074 (2024).
[17] Krishnamurthy Iyer, Haifeng Xu, and You Zu. 2023. Markov Persuasion Processes with Endogenous Agent Beliefs. arXiv preprint arXiv:2307.03181 (2023).
[18] Emir Kamenica and Matthew Gentzkow. 2011. Bayesian persuasion. American Economic Review 101, 6 (2011), 2590–2615.
[19] Anton Kolotilin, Tymofiy Mylovanov, Andriy Zapechelnyuk, and Ming Li. 2017. Persuasion of a privately informed receiver. Econometrica 85, 6 (2017), 1949–1964.
[20] Jean-Jacques Laffont and David Martimort. 2009. The Theory of Incentives. Princeton University Press.
[21] Roger B Myerson. 1982. Optimal coordination mechanisms in generalized principal–agent problems. Journal of Mathematical Economics 10, 1 (1982), 67–81.
[22] Praveen Paruchuri, Jonathan P Pearce, Janusz Marecki, Milind Tambe, Fernando Ordonez, and Sarit Kraus. 2008. Playing games for security: An efficient exact algorithm for solving Bayesian Stackelberg games. In Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 2. 895–902.
[23] Heinrich von Stackelberg. 1934. Marktform und Gleichgewicht [Market Structure and Equilibrium].
[24] Jibang Wu, Siyu Chen, Mengdi Wang, Huazheng Wang, and Haifeng Xu. 2024. Contractual Reinforcement Learning: Pulling Arms with Invisible Hands. arXiv preprint arXiv:2407.01458 (2024).
[25] Jibang Wu, Zixuan Zhang, Zhe Feng, Zhaoran Wang, Zhuoran Yang, Michael I Jordan, and Haifeng Xu. 2022. Markov persuasion processes and reinforcement learning. In ACM Conference on Economics and Computation.
[26] Kai Hao Yang. 2022. Selling consumer data for profit: Optimal market-segmentation design and its consequences. American Economic Review 112, 4 (2022), 1364–1393.