Research

A Deep Learning Approach to Heterogeneous Consumer Aesthetics in Retail Fashion (Working Paper)

In some markets, the visual appearance of a product matters a great deal. This paper investigates consumer transactions from a major fashion retailer, focusing on consumer aesthetics. Pre-trained multimodal models convert images and text descriptions into high-dimensional embeddings. The value of these embeddings is verified both empirically and by their ability to segment the product space. A discrete choice model is used to decompose the distinct drivers of consumer choice: price, visual aesthetics, descriptive details, and seasonal variations. Consumers are allowed to differ in their preferences over these factors, both through observed variation in demographics and through unobserved types. Estimation and inference employ automatic differentiation and GPUs, making the approach scalable and portable. The model reveals significant differences in price sensitivity and aesthetic preferences across consumers. The model is validated by its ability to predict the relative success of new designs and purchase patterns.
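
As a rough illustration of how a discrete choice model with unobserved consumer types can combine price and embedding-based tastes, the sketch below computes latent-class logit choice probabilities. It is a minimal sketch, not the paper's specification: the linear utility form, the absence of an outside option, and the names `logit_probs`, `choice_probs`, and `types` are all assumptions made for illustration.

```python
import math

def logit_probs(utilities):
    """Softmax over utilities, shifted by the max for numerical stability."""
    m = max(utilities)
    weights = [math.exp(u - m) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def choice_probs(prices, embeddings, types):
    """Choice probabilities averaged over latent consumer types.

    prices:     list of product prices
    embeddings: list of product embedding vectors (e.g. from an image model)
    types:      list of (share, price_coef, taste_vector) tuples; shares sum to 1
    All of these inputs are hypothetical and chosen for illustration.
    """
    mixed = [0.0] * len(prices)
    for share, price_coef, taste in types:
        # Assumed utility: price disutility plus a linear taste for embedding features.
        utilities = [
            -price_coef * p + sum(t * e for t, e in zip(taste, emb))
            for p, emb in zip(prices, embeddings)
        ]
        for j, prob in enumerate(logit_probs(utilities)):
            mixed[j] += share * prob
    return mixed

# Two products, two latent types: one price-sensitive, one driven by the second feature.
probs = choice_probs(
    prices=[10.0, 20.0],
    embeddings=[[1.0, 0.0], [0.0, 1.0]],
    types=[(0.6, 0.2, [0.5, 0.5]), (0.4, 0.01, [0.0, 3.0])],
)
print(probs)
```

In a setup like this, the type shares and coefficients would be estimated from transactions, which is where automatic differentiation becomes useful.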

The Effectiveness of Advertising in Second-Hand Retail Fashion (Working Paper)

In this paper, we study the effects of the advertising funnel on user purchase behaviour in a very large online marketplace for second-hand retail fashion. We leverage a new proprietary dataset covering 1 billion dollars' worth of transactions, 1 million products and over 100,000 users. Since most users buy only once and most sellers sell only once, traditional causal inference methods like panel data analysis cannot be applied. We define a multi-day shopping session and match user-product pairs by creating user, seller and product embeddings. Then we estimate the impact of clicks on purchases and revenues using causal machine learning methods. Using the same approach, we also study the impact of ad-rank on user propensity to click. We find significant heterogeneity across users and sellers and conduct sensitivity analyses to confirm the validity of our results.

Who is more Bayesian: Humans or ChatGPT? (with John Rust, Chengjun Zhang, Tianshi Mu and Aaron Zhong)

We compare the performance of human and artificially intelligent (AI) decision makers in simple binary classification tasks where the optimal decision rule is given by Bayes' Rule. We reanalyze choices of human subjects gathered from laboratory experiments conducted by El-Gamal and Grether and by Holt and Smith. We confirm that while overall Bayes' Rule represents the single best model for predicting human choices, subjects are heterogeneous and a significant share of them make suboptimal choices that reflect judgment biases described by Kahneman and Tversky. These biases include the "representativeness heuristic" (excessive weight on the evidence from the sample relative to the prior) and "conservatism" (excessive weight on the prior relative to the sample). We then compare these results with the performance of AI subjects drawn from recent versions of large language models (LLMs), including several versions of ChatGPT. These general-purpose generative AI chatbots are not specifically trained to excel in narrow decision-making tasks but are instead trained as "language predictors" using a large corpus of textual data from the web. We show that ChatGPT is also subject to biases that result in suboptimal decisions. However, we document a rapid evolution in the performance of ChatGPT, transitioning from sub-human performance in early versions (ChatGPT 3.5) to superhuman and nearly perfect Bayesian classifications in the latest versions (ChatGPT 4.0).
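
To make the two biases concrete, the following sketch computes the posterior for a binary state from a sample of draws and lets the weights on the sample evidence and the prior deviate from one. It is a minimal sketch under stated assumptions: the weighted log-odds form is one common way to parameterize such biases, and the names `w_sample` and `w_prior` as well as the numerical values are illustrative, not taken from the paper or the underlying experiments.

```python
import math

def log_likelihood_ratio(k, n, p_a, p_b):
    """Log likelihood ratio of state A vs. state B after k successes in n draws."""
    return k * math.log(p_a / p_b) + (n - k) * math.log((1 - p_a) / (1 - p_b))

def subjective_posterior(k, n, prior_a, p_a, p_b, w_sample=1.0, w_prior=1.0):
    """Posterior probability of state A under weighted log odds.

    w_sample = w_prior = 1 reproduces Bayes' rule exactly.
    w_sample > w_prior overweights the sample (representativeness);
    w_prior > w_sample overweights the prior (conservatism).
    """
    log_odds = (w_sample * log_likelihood_ratio(k, n, p_a, p_b)
                + w_prior * math.log(prior_a / (1 - prior_a)))
    return 1.0 / (1.0 + math.exp(-log_odds))

def classify(posterior_a):
    """Choose the state with the higher subjective posterior."""
    return "A" if posterior_a > 0.5 else "B"

# Illustrative task: 4 successes in 6 draws, success rate 2/3 under A and 1/2 under B,
# prior probability of A equal to 1/3.
bayes = subjective_posterior(4, 6, 1 / 3, 2 / 3, 1 / 2)
biased = subjective_posterior(4, 6, 1 / 3, 2 / 3, 1 / 2, w_sample=3.0)
print(classify(bayes), classify(biased))
```

With these numbers the Bayesian posterior for A is about 0.41, so the optimal choice is B, while a decision maker who triples the weight on the sample chooses A.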

Structural Econometrics and Reinforcement Learning (Oxford Research Encyclopedia, with John Rust)

This survey article explores the synergies between structural econometrics and reinforcement learning. Structural econometrics interprets observed economic choices as optimal decisions under constraints, enabling counterfactual prediction of behavior under rule changes. Reinforcement learning offers a framework for learning optimal policies in complex multi-step problems through exploration and exploitation. We identify opportunities for cross-fertilization between these fields. Structural econometrics can leverage reinforcement learning algorithms to solve previously intractable high-dimensional economic models and games. Inverse reinforcement learning provides econometricians with new methods to recover agents' objective functions from observed behavior. Reinforcement learning, in particular bandits, can be enhanced by incorporating economic theory and structural assumptions, accelerating learning and improving sample complexity by orders of magnitude. We review methodological connections and demonstrate applications across finance, industrial organization, public policy, and marketing. Both fields create new tools for inference and decision-making while tackling the shared challenges of the curse of dimensionality, equilibrium multiplicity, and identification.

Algorithmic Collusion in Auctions: Evidence from Controlled Laboratory Experiments (WEAI Conference 2025)

Algorithms are increasingly being used to automate participation in online markets. Banchio and Skrzypacz (2022) demonstrate how exploration under identical valuations in first-price auctions may lead to spontaneous coupling into sub-competitive bidding. However, it is an open question whether these findings extend to affiliated values and optimal exploration, and which algorithmic details play a role in facilitating algorithmic collusion. This paper contributes to the literature by generating robust stylized facts to fill these gaps. I conduct a set of fully randomized experiments in a controlled laboratory setup and apply double machine learning to estimate granular conditional treatment effects of auction design on seller revenues. I find that first-price auctions lead to lower seller revenues and higher seller regret under identical values, affiliated values, and under both Q-learning and bandits. Such tacit collusion is more likely with fewer bidders, Boltzmann exploration, asynchronous updating, and longer episodes, while high reserve prices can offset it. This evidence suggests that programmatic auctions, e.g. the Google Ad Exchange, which depend on first-price auctions, might be susceptible to coordinated bid suppression and significant revenue losses.
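
For readers unfamiliar with the algorithmic ingredients, the sketch below shows stateless Q-learning bidders with Boltzmann (softmax) exploration in a repeated first-price auction with identical values. It is a minimal sketch, not the experimental design of the paper: the bid grid, learning rate, temperature, common value of 1, and the function names are all illustrative assumptions. Only the chosen bid's estimate is updated each round, which is the sense in which updating is usually called asynchronous.

```python
import math
import random

def boltzmann_probs(q_values, temperature):
    """Softmax over Q-values; lower temperature means greedier choices."""
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def first_price_payoffs(bids, value=1.0, rng=random):
    """Highest bid wins and pays its own bid; ties are broken at random."""
    high = max(bids)
    winner = rng.choice([i for i, b in enumerate(bids) if b == high])
    payoffs = [0.0] * len(bids)
    payoffs[winner] = value - high
    return payoffs, high

def simulate(n_bidders=2, n_bids=11, rounds=5000, alpha=0.1,
             temperature=0.05, seed=0):
    """Average seller revenue when all bidders learn by stateless Q-learning."""
    rng = random.Random(seed)
    grid = [i / (n_bids - 1) for i in range(n_bids)]  # bids in [0, 1]
    q = [[0.0] * n_bids for _ in range(n_bidders)]
    revenue = 0.0
    for _ in range(rounds):
        actions = [
            rng.choices(range(n_bids), weights=boltzmann_probs(q[i], temperature))[0]
            for i in range(n_bidders)
        ]
        payoffs, price = first_price_payoffs([grid[a] for a in actions], rng=rng)
        for i, a in enumerate(actions):
            # Update only the bid actually played, toward the realized payoff.
            q[i][a] += alpha * (payoffs[i] - q[i][a])
        revenue += price
    return revenue / rounds

print(simulate())
```

Comparing the average revenue from such a simulation with the competitive benchmark, where bids approach the common value, is one simple way to gauge bid suppression.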

Approximating Auction Equilibria with Reinforcement Learning (Working Paper)

Traditional methods for computing equilibria in auctions become computationally intractable as auction complexity increases, particularly in multi-item and dynamic auctions. This paper introduces a self-play-based reinforcement learning approach that employs advanced algorithms such as Proximal Policy Optimization to approximate Bayes-Nash equilibria. This framework allows for continuous action spaces, high-dimensional information states, and delayed payoffs. Through self-play, these algorithms can learn robust and near-optimal bidding strategies in auctions with known equilibria, including those with symmetric and asymmetric valuations, private and interdependent values, and multi-round auctions.

Reinforcement and Agentic Learning in Double Auctions (Working Paper)

In 1993, the Santa Fe Institute hosted a seminal tournament where simple "sniping" heuristics outperformed complex trading algorithms in a discrete double auction. Thirty years later, we revisit this environment to investigate whether modern Deep Reinforcement Learning (PPO) and Large Language Models (GPT-5) can solve the information aggregation problem without hard-coded rules. We also benchmark them against classic traders like Zero Intelligence Plus (ZIP) and Gjerstad-Dickhaut (GD). We faithfully replicate the original synchronized double auction mechanism and introduce this new generation of trading agents. Our results show that: (1) PPO agents autonomously rediscover the "sniping" strategy, exploiting legacy heuristics; (2) Multi-agent PPO markets maintain high allocative efficiency, avoiding the market collapse observed in heuristic self-play; and (3) Zero-shot LLMs exhibit high efficiency but are more conservative. These findings suggest that while gradient-based learning can master market timing, semantic reasoning introduces a new, potentially stabilizing, dynamic to automated markets.
