Invited speakers

We are excited to introduce a fantastic line-up of invited speakers at REVEAL'20!

Inbar Naor, Algorithms Team Lead at Taboola

Challenges in Evaluating Exploration Effectiveness in Recommender Systems

Exploration is an important pillar of many interactive systems. In recommender systems in particular, exploration makes it possible not only to serve well-established items, but also to try new items that might have unfulfilled potential.

Measuring the effectiveness of our exploration mechanisms is a challenging task: while many theoretical results rely on regret bounds or on measuring regret via simulations, defining and measuring regret in real-world scenarios is not trivial. Measuring the gain from a given exploration method is harder still, as there are many objectives and constraints we may want to consider.
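
To make "measuring regret via simulations" concrete, here is a minimal sketch (not from the talk; all names and numbers are illustrative) of cumulative regret for an epsilon-greedy recommender in a synthetic bandit simulation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_rounds, eps = 10, 5000, 0.1
true_ctr = rng.uniform(0.01, 0.10, n_items)   # ground truth, known only in simulation
best_ctr = true_ctr.max()

clicks = np.zeros(n_items)
shows = np.zeros(n_items)
regret = 0.0

for t in range(n_rounds):
    if rng.random() < eps:                    # explore: pick a random item
        item = int(rng.integers(n_items))
    else:                                     # exploit: highest empirical CTR so far
        item = int(np.argmax(clicks / np.maximum(shows, 1)))
    reward = rng.random() < true_ctr[item]    # simulated click
    clicks[item] += reward
    shows[item] += 1
    regret += best_ctr - true_ctr[item]       # computable only because true CTRs are known

print(f"cumulative regret after {n_rounds} rounds: {regret:.1f}")
```

The key line is the regret update inside the loop: it compares against the true click-through rates, which exist only in simulation. In production those quantities are unknown, which is exactly the difficulty the talk addresses.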

In this talk we will focus on the challenges in defining exploration metrics, present several approaches to doing so, discuss the key properties we may want from such metrics, and examine methods to validate them.

Inbar Naor is an Algorithms Team Lead at Taboola, where she works on content and ad recommendations. Her main interests are in machine learning for recommender systems. Most recently, she has focused on exploration and on estimating uncertainty in the domain of display ads.

Amit Sharma, Senior Researcher at Microsoft

Causal Inference for Recommender Systems

What is the impact of a recommender system? In a typical three-way interaction between users, items, and the platform, a recommender system can have differing impacts on the three stakeholders, and there can be multiple metrics based on utility, diversity, and fairness. One way to measure impact is through randomized A/B tests, but experiments are costly and can only be applied to short-term outcomes. In this talk I will describe a unifying framework based on causality that can be used to answer such questions. Using the example of a recommender system's effect on increasing sales for a platform, I will discuss the four steps that form the basis of a causal analysis: modeling the causal mechanism, identifying the correct estimand, estimation, and finally checking the robustness of the obtained estimates. Utilizing independence assumptions common in click-log data, this process led to a new method for estimating the impact of recommendations, called the split-door causal criterion. In the latter half of the talk, I will show how the four steps can be used to address other questions about a recommender system, such as selection bias, missing data, and fairness.
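
As a concrete illustration of the four steps, the open-source DoWhy library (which Amit co-leads; see the bio below) structures a causal analysis in exactly this model-identify-estimate-refute sequence. The sketch below is illustrative only, with a hypothetical dataset and variable names, and is not the analysis from the talk:

```python
import numpy as np
import pandas as pd
from dowhy import CausalModel

# Hypothetical click-log data: did showing a recommendation cause a sale?
rng = np.random.default_rng(0)
n = 5000
activity = rng.normal(size=n)                               # confounder: user activity level
recommended = ((activity + rng.normal(size=n)) > 0).astype(int)  # treatment depends on activity
sale = 0.3 * recommended + 0.5 * activity + rng.normal(size=n)
df = pd.DataFrame({"recommended": recommended, "activity": activity, "sale": sale})

# Step 1: model the causal mechanism as a graph
model = CausalModel(
    data=df, treatment="recommended", outcome="sale",
    graph="digraph { activity -> recommended; activity -> sale; recommended -> sale; }",
)
# Step 2: identify the estimand implied by the graph
estimand = model.identify_effect()
# Step 3: estimate it (here via backdoor adjustment)
estimate = model.estimate_effect(estimand, method_name="backdoor.linear_regression")
# Step 4: check robustness with a refutation test
refutation = model.refute_estimate(estimand, estimate, method_name="placebo_treatment_refuter")
print(estimate.value)
print(refutation)
```

A placebo refuter replaces the treatment with noise and checks that the estimated effect collapses toward zero, which is one simple instance of the robustness-checking step.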

Amit Sharma is a Senior Researcher at Microsoft Research India. His work focuses on understanding the causal mechanisms that shape people’s activities as they interact with algorithmic systems such as recommender systems and online social platforms, especially in healthcare. Methodologically, his work bridges causal inference techniques with data mining and machine learning, with the goal of making machine learning models generalize better, be explainable, and avoid hidden biases. To this end, Amit has co-led the development of the open-source DoWhy library for causal inference. His work has received many awards, including a Best Paper Honorable Mention Award at the 2016 ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW), the 2012 Yahoo! Key Scientific Challenges Award, and the 2009 Honda Young Engineer and Scientist Award. Amit received his Ph.D. in computer science from Cornell University and his B.Tech. in Computer Science and Engineering from the Indian Institute of Technology (IIT) Kharagpur.

Minmin Chen, Staff Research Scientist at Google

New Applications of Bandits and Reinforcement Learning Algorithms for Recommender Systems

In recent years, there has been a surge of interest in applying reinforcement learning as an alternative to supervised learning for recommender systems. In this talk, I will cover a couple of new applications in which reinforcement learning techniques assist supervised-learning-based recommender systems in making more holistic decisions that optimize long-term user experience on the platform. I will also go over our recent work on building recommender agents that optimize the utilities of entities on the platform beyond content consumers, whose behaviors are likewise influenced by the recommender policy.

Minmin Chen is a Staff Research Scientist at Google Brain, where she leads a team working on reinforcement learning for recommender systems. Before that, she was a research scientist at Criteo, working on computational models for online advertising, and at Amazon, working on the Amazon Go project. She received her PhD from Washington University in St. Louis. She publishes at top machine learning and recommendation conferences such as NeurIPS, ICML, ICLR, RecSys, and WSDM, and regularly serves as an area chair for NeurIPS, ICML, and ICLR.

Houssam Nassif, Senior Machine Learning Scientist at Amazon

Solving Inverse Reinforcement Learning, Bootstrapping Bandits, and Adaptive Recommendation

This talk discusses three ways we have leveraged reward signals to inform recommendations. In Deep PQR: Solving Inverse Reinforcement Learning using Anchor Actions (ICML’20), we use deep energy-based policies to recover the true reward function in an inverse reinforcement learning setting. We uniquely identify the reward function by assuming the existence of an anchor action with known reward, for example a do-nothing action with zero reward. In Decoupling Learning Rates Using Empirical Bayes (under review, arXiv), we devise an empirical Bayes formulation that extracts an unbiased prior in hindsight from an experiment’s early reward signals. We apply this empirical prior to warm-start bandit recommendations and speed up convergence. In Seeker: Real-Time Interactive Search (KDD’19), we introduce a recommender system that adaptively refines search rankings in real time through user interactions in the form of likes and dislikes. We extend Boltzmann exploration to adapt to the interactively changing embedding space and to factor in the uncertainty of the reward estimates.
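
For readers unfamiliar with the last technique: vanilla Boltzmann exploration samples items with probability proportional to the exponentiated reward estimates. The sketch below shows only this textbook form, with illustrative names; the talk's extension additionally adapts to the changing embedding space and to the uncertainty of the estimates.

```python
import numpy as np

def boltzmann_sample(reward_estimates, temperature=1.0, rng=None):
    """Sample an item index with probability proportional to exp(estimate / temperature)."""
    if rng is None:
        rng = np.random.default_rng()
    logits = np.asarray(reward_estimates, dtype=float) / temperature
    logits -= logits.max()          # shift for numerical stability; probabilities are unchanged
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Higher temperature -> more exploration; lower -> near-greedy exploitation.
estimates = [0.12, 0.08, 0.10]
print(boltzmann_sample(estimates, temperature=0.05))
```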

Houssam Nassif started his career as a wet-lab biologist before switching to computer science, earning his PhD in artificial intelligence from the University of Wisconsin-Madison. His early research spans biomedical informatics, statistical relational learning, and uplift modeling. Since joining Amazon in 2013, Houssam has been passionate about adaptive experimentation. He established and leads Amazon’s adaptive testing framework, researching, deploying, and evangelizing bandits, with forays into reinforcement learning, causality, and diversity. He has helped launch 27 business products across Amazon, Google, and Cisco, which together generated $1.5 billion in incremental yearly revenue. Houssam has published over 25 peer-reviewed papers in leading ML and biomedical informatics conferences and journals, and organized AISTATS’15. His work has been recognized with four paper awards, including from RecSys’16 and KDD’17.