A Contextual Bandit Problem with a Bounded (O(1)) Regret Policy [PDF]
Hyunwook Kang (Texas A&M University)*; P. R. Kumar (Texas A&M University)
Causal Adaptive Learning for Recommendations [PDF]
Maria Dimakopoulou (Spotify)*
Control Variate Diagnostics for Detecting Problems in Logged Bandit Feedback [PDF]
Ben London (Amazon)*; Thorsten Joachims (Cornell)
Extending Open Bandit Pipeline to Simulate Industry Challenges [PDF]
Bram van den Akker (Booking.com)*; Niklas Weber (Booking.com); Felipe Moraes (Booking.com); Dmitri Goldenberg (Booking.com)
Modelling User Preferences using a Partially Observed Markov Decision Problem for a Reinforcement Learning Sequence-Aware Recommender [PDF]
Aayush S Roy (UCD Dublin)*; Aonghus Lawlor (UCD); Neil Joseph Hurley (University College Dublin)
OFRL: Designing an Offline Reinforcement Learning and Policy Evaluation Platform from Practical Perspectives [PDF]
Haruka Kiyohara (Tokyo Institute of Technology)*; Kosuke Kawakami (negocia, Inc.)
Safe and Deployment Efficient Policy Learning for Exploring Novel Actions in Recommender Systems [PDF]
Haruka Kiyohara (Tokyo Institute of Technology)*; Yusuke Narita (Yale University); Kei Tateno (Sony Group Corporation); Takuma Udagawa (Sony Group Corporation)
Sales Channel Optimization via Simulations based on Observational Data with Delayed Rewards: A Case Study at LinkedIn [PDF]
Diana Negoescu (LinkedIn Corporation)*; Pasha Khosravi (University of California Los Angeles); Shadow Zhao (LinkedIn Corporation); Nanyu Chen (Gopuff); Parvez Ahammad (LinkedIn); Humberto Gonzalez (LinkedIn Corporation)
SkipAwareRec: A Sequential and Interactive Music Recommendation System [PDF]
Rui Ramos (INESC TEC); João Vinagre (LIAAD - INESC TEC)*
When to Target Customers? Retention Management using Dynamic Off-Policy Policy Learning [PDF]
Kosuke Uetake (Yale School of Management)*; Kohei Yata (Yale University); Ryosuke Okada (ZOZO Inc.); Ryuya Ko (University of Tokyo)
Adaptive Experimental Design and Counterfactual Inference [PDF]
Tanner Fiez (Amazon)*; Lalit Jain (University of Washington); Houssam Nassif (Amazon); Sergio Gamez (Amazon); Arick Chen (Amazon)
Are Neural Click Models Pointwise IPS Rankers? [PDF]
Philipp K Hager (University of Amsterdam)*; Maarten de Rijke (University of Amsterdam); Onno Zoeter (Booking.com)
Causal Evaluation of Item Fairness in Impression Delivery [PDF]
Winston Chou (Netflix)*; Nathan Kallus (Cornell University)
CLEAR: Causal Explanations from Attention in Neural Recommenders [PDF]
Shami Nisimov (Intel Labs); Raanan Y. Rohekar (Intel Labs)*; Yaniv Gurwicz (Intel Labs); Guy Koren (Intel Labs); Gal Novik (Intel Labs)
Improving Accuracy of Off-Policy Evaluation via Policy Adaptive Estimator Selection [PDF]
Takuma Udagawa (Sony Group Corporation)*; Haruka Kiyohara (Tokyo Institute of Technology); Yusuke Narita (Yale University); Kei Tateno (Sony Group Corporation)
Leveraging Context-dependent Click Model for Off-Policy Evaluation of Ranking Policies [PDF]
Haruka Kiyohara (Tokyo Institute of Technology)*; Nobuyuki Shimizu (Yahoo Japan Corporation); Yasuo Yamamoto (Yahoo! Japan)
Off-policy evaluation for learning-to-rank via interpolating the item-position model and the position-based model [PDF]
Alexander Buchholz (Amazon)*; Ben London (Amazon); Giuseppe Di Benedetto (Amazon); Thorsten Joachims (Cornell)
Significant heterogeneous double machine learning for recommendation [PDF]
John S Moreland (Amazon)*; Zuqi Shang (Amazon)
The Bandwagon Effect: Not Just Another Bias [PDF]
Norman Knyazev (Radboud University)*; Harrie Oosterhuis (Radboud University)
VAE-IPS: A Deep Generative Recommendation Method for Unbiased Learning From Implicit Feedback [PDF]
Shashank Gupta (University of Amsterdam)*; Harrie Oosterhuis (Radboud University); Maarten de Rijke (University of Amsterdam)