Liangzong Ma, Ta-Wei Huang, Eva Ascarza, and Ayelet Israeli (2025)
Major Revision at Marketing Science [Paper]
Abstract
Reinforcement learning (RL) offers potential for optimizing sequences of customer interactions by modeling the relationships between customer states, company actions, and long-term value. However, its practical implementation often faces significant challenges. First, while companies collect detailed customer characteristics to represent customer states, these data often contain noise or irrelevant information, obscuring the true customer states. Second, existing state construction techniques focus primarily on summarizing characteristics related to short-term value, rather than capturing the broader behaviors that drive long-term customer value. These limitations hinder RL's ability to effectively learn customer dynamics and maximize long-term value. To address these challenges, we introduce a novel Multi-Response State Representation (MRSR) Learning method to enhance existing RL methods. Unlike existing state construction methods, MRSR utilizes rich customer signals (such as recency, engagement, and spending) to construct low-dimensional state representations that effectively predict the behaviors driving long-term customer value. Using data from a free-to-play mobile game with dynamic difficulty adjustments, MRSR demonstrates significant improvements, increasing 30-day in-game currency spending by 37% compared to standard offline RL methods and by 24% compared to advanced state representation techniques. Policy interpretation further highlights MRSR's ability to identify distinct and relevant customer states, enabling precise and targeted interventions to boost long-term engagement and spending.