The practice of building high-performance personalized interactive systems is founded on an artful combination of diverse machine learning methodologies, including collaborative filtering, content-based recommendation, contextual bandits, off-policy estimation, click models, attribution and A/B testing. While combining these methods has been spectacularly successful, each method finds justification using its own stylized protocols, and little attention is paid to the overarching principles behind building reward optimizing recommender systems. This tutorial will focus on inferential principles, including causal inference, and relate these principles to current best practice in machine learning. The inferential principles covered will include Bayesian decision theory, coherence, and the likelihood and conditionality principles, as well as causal principles such as ignorability, the do-calculus, the Rubin causal model, randomization and A/B testing. These foundations will then be used to investigate the differences between recommender systems best practices and approaches directly informed by inferential and causal principles; this section will challenge both best practice and the applicability of academic approaches. Particular attention will be given to the pervasive problem of self-confounding, including how engineering and machine learning best practice often results in production systems that unnecessarily suffer from confounding.
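As a rough illustration of why confounding matters for logged recommendation data, the following minimal sketch simulates a toy system in which the logging policy chooses actions based on context. It is not taken from the tutorial material: the environment, the reward probabilities, the logging policy and all function names are hypothetical assumptions chosen for illustration. Under these assumptions, a naive per-action average of logged rewards ranks the actions in the wrong order, while an inverse propensity score (IPS) estimate recovers the correct ranking.

```python
import random

# Hypothetical toy environment; every number here is an illustrative assumption.
# P(reward = 1 | context x, action a): action 1 is better in both contexts.
REWARD_PROB = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.5, (1, 1): 0.6}
# P(a = 1 | x) under the logging policy: it favours action 1 in the
# low-reward context, which confounds action with context.
LOGGING_PROB_A1 = {0: 0.9, 1: 0.1}


def simulate_logs(n, seed=0):
    """Generate n logged tuples (context, action, reward, propensity)."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        p1 = LOGGING_PROB_A1[x]
        a = 1 if rng.random() < p1 else 0
        propensity = p1 if a == 1 else 1.0 - p1  # P(logged action | x)
        r = 1 if rng.random() < REWARD_PROB[(x, a)] else 0
        logs.append((x, a, r, propensity))
    return logs


def naive_value(logs, action):
    """Average logged reward among rows where `action` was taken (confounded)."""
    rewards = [r for (_, a, r, _) in logs if a == action]
    return sum(rewards) / len(rewards)


def ips_value(logs, action):
    """IPS estimate of the value of always playing `action`."""
    total = sum(r / p for (_, a, r, p) in logs if a == action)
    return total / len(logs)


def true_value(action):
    """Ground-truth value of always playing `action` (contexts are 50/50)."""
    return 0.5 * REWARD_PROB[(0, action)] + 0.5 * REWARD_PROB[(1, action)]


if __name__ == "__main__":
    logs = simulate_logs(200_000)
    for action in (0, 1):
        print(
            f"action={action} "
            f"naive={naive_value(logs, action):.3f} "
            f"ips={ips_value(logs, action):.3f} "
            f"true={true_value(action):.3f}"
        )
```

The IPS correction works here only because the logging propensities are known and non-zero for every action in every context; in a production system where they are not recorded, the same correction is not directly available.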
Tutorial Structure
Inference
• Probability Theory and Statistics, Bayesian Decision Theory, Models, Decisions and Decision Rules
• Coherence and Exchangeability, Cox Axioms
• The Likelihood Principle, The Conditionality Principle
• Generative AI, LLMs and World Models
• Decisions, Decision Rules, Models and Estimators
• Machine Learning Examples
• Cross-validation, Simulation, Benchmarks, Anything Goes
Causality
• Causal Theories, Do Calculus, Rubin Causal Model
• Ignorability, Simpson’s Paradox, SUTVA, Confounding
• Causal Inference is Inference
• Multiple Opportunities to Personalize
• Propensity Scores and Balancing Scores
• Confounding and Self Confounding in Real Systems
Applying the Principles to Interactive Systems
• A/B Testing, ignorability in interactive systems, models, decisions and decision rules in real systems, likelihood for real systems, attribution heuristics, simulation, reinforcement learning
• Confounding is a Pervasive Problem in Recommender Systems
• Single Opportunity to Personalize (contextual bandits)
• Non-reward Signals, Combining reward and non-reward
• Contextual Bandit when the utility is not a sum over users
• Click Models
• Propensity Scores for De-Confounding and Off Policy Estimation
• Avoiding Self-Confounding with Propensity Scores, The Back Door Rule and REINFORCE
Bio
David Rohde is the research lead of the Performance Science Team at Criteo (currently on sabbatical). This tutorial will draw heavily on Reward Optimizing Recommender Systems, a book David is currently writing (in preparation). David’s research focuses on causality, Bayesian inference and recommender systems. He has frequently presented on causal inference, Bayesian inference and their relation to recommender systems.