[updated] Human Learning about AI (with Bnaya Dreyfuss) - Extended abstract at EC'25
Abstract: We study Human Projection (HP): people's tendency to evaluate AI using the same assessment frameworks they use for humans, treating features such as task difficulty and the reasonableness of mistakes as diagnostic of overall ability. We formalize HP and its consequences for equilibrium adoption, and test its predictions experimentally. First, people project human difficulty onto AI: they over-estimate performance on human-easy tasks, under-estimate it on human-hard ones, and over-update after easy failures and hard successes, leading to systematic misspecification when AI performance is jagged rather than human-ordered. Second, HP interprets observed performance through a single ability index, which induces all-or-nothing adoption even when AI outperforms humans only on some tasks; experimentally stripping AI of human-like cues weakens cross-task generalization and reduces over-adoption. Finally, a field experiment with a parenting-advice chatbot shows that equally unhelpful mistakes that are less humanly reasonable cause larger drops in trust and future engagement. Anthropomorphic AI design can amplify HP, misaligning beliefs with actual capabilities and distorting adoption.
Media Coverage:
Harvard Horizons "TED Talk"-style presentation
[new] Strategic Slanting (supersedes "Signaling Universalism")
Abstract: How do social image concerns affect displayed group preferences? I provide experimental evidence that decision makers (DMs) engage in strategic slanting: they distort their public behavior towards their perception of their audience's preferences to induce audiences to act prosocially towards them. DMs play a universalism game, dividing money between an in-group and an out-group member, in which I manipulate the existence and identity (in-group or out-group) of an audience with whom the DM expects downstream interactions. Across three types of interactions (prisoner's dilemma, dictator game, and no game), DMs act significantly more universalist when facing out-group audiences and in-group audiences perceived as universalist, and more communal when facing in-group audiences perceived as communal. Shocking the perceived strength of the audience's group identity, I find that social cues inform DMs of the direction of their slanting, which they perform only when they believe it can affect their audience's prosociality. Findings are consistent with a model in which DMs strategically display preference alignment to audiences who are altruistic towards like-minded people. Slanting-driven alignment is highly effective in raising audience prosociality and allows DMs to achieve cooperation levels with the out-group on par with those with the in-group, suggesting that social image concerns can be strategically leveraged to improve collaborative outcomes across social groups.
AI Sycophancy and Human Decision-Making (with Sinan Aral, Harang Ju, and Rui Zuo)
AI-Assisted Learning (with Yiling Chen, Jeff Jiang, and Gali Noti)
Signaling and Optimal Break-Taking (with Katja Michlbauer and Lexi Schubert)
Gradoz, J., & Raux, R. (2021). Trolling in the Deep: Managing Transgressive Content on Online Platforms as a Commons. In Erwin Dekker and Pavel Kuchar (eds.), Governing Markets as Knowledge Commons. Cambridge: Cambridge University Press, 217-237.