Contributions
REVEAL '22
A Contextual Bandit Problem with a Bounded (O(1)) Regret Policy [PDF]
Hyunwook Kang (Texas A&M University)*; P. R. Kumar (Texas A&M University)
Causal Adaptive Learning for Recommendations [PDF]
Maria Dimakopoulou (Spotify)*
Control Variate Diagnostics for Detecting Problems in Logged Bandit Feedback [PDF]
Ben London (Amazon)*; Thorsten Joachims (Cornell)
Extending Open Bandit Pipeline to Simulate Industry Challenges [PDF]
Bram van den Akker (Booking.com)*; Niklas Weber (Booking.com); Felipe Moraes (Booking.com); Dmitri Goldenberg (Booking.com)
Modelling User Preferences using a Partially Observed Markov Decision Problem for a Reinforcement Learning Sequence-Aware Recommender [PDF]
Aayush S Roy (UCD Dublin)*; Aonghus Lawlor (UCD); Neil Joseph Hurley (University College Dublin)
OFRL: Designing an Offline Reinforcement Learning and Policy Evaluation Platform from Practical Perspectives [PDF]
Haruka Kiyohara (Tokyo Institute of Technology)*; Kosuke Kawakami (negocia, Inc.)
Safe and Deployment Efficient Policy Learning for Exploring Novel Actions in Recommender Systems [PDF]
Haruka Kiyohara (Tokyo Institute of Technology)*; Yusuke Narita (Yale University); Kei Tateno (Sony Group Corporation); Takuma Udagawa (Sony Group Corporation)
Sales Channel Optimization via Simulations based on Observational Data with Delayed Rewards: A Case Study at LinkedIn [PDF]
Diana Negoescu (LinkedIn Corporation)*; Pasha Khosravi (University of California Los Angeles); Shadow Zhao (LinkedIn Corporation); Nanyu Chen (Gopuff); Parvez Ahammad (LinkedIn); Humberto Gonzalez (LinkedIn Corporation)
SkipAwareRec: A Sequential and Interactive Music Recommendation System [PDF]
Rui Ramos (INESC TEC); João Vinagre (LIAAD - INESC TEC)*
When to Target Customers? Retention Management using Dynamic Off-Policy Policy Learning [PDF]
Kosuke Uetake (Yale School of Management)*; Kohei Yata (Yale University); Ryosuke Okada (ZOZO Inc.); Ryuya Ko (University of Tokyo)
CONSEQUENCES '22
Adaptive Experimental Design and Counterfactual Inference [PDF]
Tanner Fiez (Amazon)*; Lalit Jain (University of Washington); Houssam Nassif (Amazon); Sergio Gamez (Amazon); Arick Chen (Amazon)
Are Neural Click Models Pointwise IPS Rankers? [PDF]
Philipp K Hager (University of Amsterdam)*; Maarten de Rijke (University of Amsterdam); Onno Zoeter (Booking)
Causal Evaluation of Item Fairness in Impression Delivery [PDF]
Winston Chou (Netflix)*; Nathan Kallus (Cornell University)
CLEAR: Causal Explanations from Attention in Neural Recommenders [PDF]
Shami Nisimov (Intel Labs); Raanan Y. Rohekar (Intel Labs)*; Yaniv Gurwicz (Intel Labs); Guy Koren (Intel Labs); Gal Novik (Intel Labs)
Improving Accuracy of Off-Policy Evaluation via Policy Adaptive Estimator Selection [PDF]
Takuma Udagawa (Sony Group Corporation)*; Haruka Kiyohara (Tokyo Institute of Technology); Yusuke Narita (Yale University); Kei Tateno (Sony Group Corporation)
Leveraging Context-dependent Click Model for Off-Policy Evaluation of Ranking Policies [PDF]
Haruka Kiyohara (Tokyo Institute of Technology)*; Nobuyuki Shimizu (Yahoo Japan Corporation); Yasuo Yamamoto (Yahoo! Japan)
Off-policy evaluation for learning-to-rank via interpolating the item-position model and the position-based model [PDF]
Alexander Buchholz (Amazon)*; Ben London (Amazon); Giuseppe Di Benedetto (Amazon); Thorsten Joachims (Cornell)
Significant heterogeneous double machine learning for recommendation [PDF]
John S Moreland (Amazon)*; Zuqi Shang (Amazon)
The Bandwagon Effect: Not Just Another Bias [PDF]
Norman Knyazev (Radboud University)*; Harrie Oosterhuis (Radboud University)
VAE-IPS: A Deep Generative Recommendation Method for Unbiased Learning From Implicit Feedback [PDF]
Shashank Gupta (University of Amsterdam)*; Harrie Oosterhuis (Radboud University); Maarten de Rijke (University of Amsterdam)