Data plays a fundamental role in machine learning, yet its collection raises critical questions about user incentives, privacy, and fair compensation. Users often face a trade-off between giving up their data to access services and maintaining their privacy. At the same time, determining the value of data and designing appropriate compensation mechanisms remain open challenges. This tutorial reviews recent work that addresses these issues.
The tutorial will run for two hours, with a 15-minute break in the middle. It will cover the following four main topics:
I. A Crash Course on Privacy
An overview of key privacy definitions, laying the groundwork for understanding data as an economic good.
II. A Design Approach to Data Acquisition
An exploration of mechanisms and incentives for data collection, including strategies for eliciting data from privacy-sensitive users.
III. User and Societal Welfare in Data Markets
An analysis of how data markets impact individual and societal welfare, with a focus on efficiency and the challenges posed by data externalities.
IV. Emerging Topics and Future Directions
A forward-looking discussion on evolving issues such as data ownership, copyright, data governance, and the intersection of AI and data policy.
Daron Acemoglu, Ali Makhdoumi, Azarakhsh Malekian, and Asuman Ozdaglar. "Too Much Data: Prices and Inefficiencies in Data Markets," 2022, American Economic Journal: Microeconomics 14 (4): 218–56.
Alireza Fallah, Ali Makhdoumi, Azarakhsh Malekian, and Asuman Ozdaglar. "Optimal and Differentially Private Data Acquisition: Central and Local Mechanisms," 2023, Operations Research 72 (3): 1105–1123.
Alireza Fallah, Ali Makhdoumi, Azarakhsh Malekian, and Asuman Ozdaglar. "Bridging Central and Local Differential Privacy in Data Acquisition Mechanisms," 2022, NeurIPS.
Alireza Fallah, Michael I. Jordan, Ali Makhdoumi, and Azarakhsh Malekian. "On Three-Layer Data Markets," 2024, arXiv.
Dirk Bergemann, Alessandro Bonatti, and Tan Gan. "The economics of social data," 2022, The RAND Journal of Economics 53: 263–296.
Dirk Bergemann and Alessandro Bonatti. "Data, Competition, and Digital Platforms," 2024, American Economic Review 114 (8): 2553–95.
Arpita Ghosh and Aaron Roth. "Selling privacy at auction," 2011, 12th ACM Conference on Electronic Commerce (EC '11).
Rachel Cummings, Katrina Ligett, Aaron Roth, Zhiwei Steven Wu, and Juba Ziani. "Accuracy for sale: Aggregating data with a variance constraint," 2015, Conference on Innovations in Theoretical Computer Science, pp. 317–324.
Hadi Elzayn, Emmanouil Pountourakis, Vasilis Gkatzelis, and Juba Ziani. "Optimal Data Acquisition with Privacy-Aware Agents," 2023, IEEE Conference on Secure and Trustworthy Machine Learning (SaTML).
Shota Ichihashi. "Competing data intermediaries," 2021, The RAND Journal of Economics 52: 515–537.
Daron Acemoglu, Alireza Fallah, Ali Makhdoumi, Azarakhsh Malekian, and Asuman Ozdaglar. "How good are privacy guarantees? Platform architecture and violation of user privacy," 2023, National Bureau of Economic Research.
Daron Acemoglu, Ali Makhdoumi, Azarakhsh Malekian, and Asuman Ozdaglar. "When Big Data Enables Behavioral Manipulation," 2025, American Economic Review: Insights 7 (1): 19–38.