KDD 2023 Tutorial

KDD 2023 Tutorial: Practical design of performant recommender systems using large-scale linear programming-based global inference

Using linear programs to generate effective recommendations


Abstract

Several key problems in web-scale recommender systems, such as optimal matching and allocation, can be formulated as large-scale linear programs (LPs). These LPs take predictions from ML and AI models, such as the probability of a click, view, or like, as inputs and optimize the recommendations made to users. In recent years, there has been an explosion in the research and development of large-scale recommender systems, but effectively optimizing business objectives using the output of those systems remains a key challenge. Although LPs can help optimize such business objectives, and algorithms for solving LPs have existed since the 1950s, generic LP solvers cannot handle the scale of these problems.

At LinkedIn, we have developed algorithms that can solve LPs of various forms with trillions of variables and packaged them in a Spark-based library called “DuaLip”. DuaLip is a novel distributed solver that solves a perturbed version of the LP at scale by applying gradient-based algorithms to the smooth dual of the perturbed LP, with computational guarantees. DuaLip has been deployed in production at LinkedIn and powers several very large-scale recommender systems. It is open source, easy to use, extensible in terms of features and algorithms, and easy to integrate into other ML pipelines.
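To make the idea of "gradient-based algorithms on the smooth dual of the perturbed LP" concrete, here is a minimal single-machine sketch in plain Python. It does not use DuaLip's actual interface. The function names, the box constraint 0 <= x <= 1, the quadratic perturbation (gamma/2)*||x||^2, the fixed step size, and the toy data are all assumptions made for illustration. The sketch solves min c.x + (gamma/2)*||x||^2 subject to A x <= b and 0 <= x <= 1 by projected gradient ascent on the dual variables of the coupling constraints A x <= b.

```python
def primal_from_dual(c, A, lam, gamma):
    """Minimize the Lagrangian over the box 0 <= x <= 1.

    The Lagrangian is separable and quadratic in each x_j, so the minimizer
    is the unconstrained solution clipped to [0, 1].
    """
    x = []
    for j in range(len(c)):
        reduced_cost = c[j] + sum(A[k][j] * lam[k] for k in range(len(A)))
        x.append(min(1.0, max(0.0, -reduced_cost / gamma)))
    return x


def solve_perturbed_lp(c, A, b, gamma=0.05, iters=500):
    """Projected gradient ascent on the smooth dual (illustrative only)."""
    m, n = len(A), len(c)
    # The dual gradient is Lipschitz with constant ||A||_2^2 / gamma.
    # Bound ||A||_2^2 by (max abs row sum) * (max abs column sum).
    row_norm = max(sum(abs(v) for v in row) for row in A)
    col_norm = max(sum(abs(A[k][j]) for k in range(m)) for j in range(n))
    step = gamma / (row_norm * col_norm)

    lam = [0.0] * m  # one dual variable per coupling constraint
    for _ in range(iters):
        x = primal_from_dual(c, A, lam, gamma)
        # Gradient of the dual objective is the constraint residual A x - b.
        grad = [sum(A[k][j] * x[j] for j in range(n)) - b[k] for k in range(m)]
        # Ascent step, then projection onto lam >= 0.
        lam = [max(0.0, lam[k] + step * grad[k]) for k in range(m)]
    return primal_from_dual(c, A, lam, gamma), lam


# Toy matching problem (made-up numbers): 2 users, 2 items,
# x = [u0-i0, u0-i1, u1-i0, u1-i1], p = predicted click probabilities.
p = [0.9, 0.6, 0.8, 0.1]
c = [-v for v in p]  # maximize p.x  <=>  minimize -p.x
A = [
    [1, 0, 1, 0],  # item 0 is shown to at most one user
    [0, 1, 0, 1],  # item 1 is shown to at most one user
    [1, 1, 0, 0],  # user 0 sees at most one item
    [0, 0, 1, 1],  # user 1 sees at most one item
]
b = [1, 1, 1, 1]

x, lam = solve_perturbed_lp(c, A, b)
print([round(v, 3) for v in x], round(sum(pj * xj for pj, xj in zip(p, x)), 3))
```

On this toy instance the iterates should settle at the matching x = [0, 1, 1, 0], with expected clicks 0.6 + 0.8 = 1.4. Greedily giving user 0 their top item first would yield only 0.9 + 0.1 = 1.0, which illustrates why global inference over all users and items can beat per-user ranking. A solver at the scale described above would distribute the per-variable primal step and the gradient aggregation across machines; this sketch keeps everything in one loop for readability.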

In this first-of-its-kind tutorial, we will motivate the use of LPs to improve the performance of recommender systems, cover the theory behind key LP algorithms, and introduce DuaLip, a highly performant and scalable Spark-based library that solves extreme-scale LPs for a wide variety of web-scale recommender system problems. We will describe some practical successes of large-scale LPs in industry and conduct a hands-on exercise to demonstrate how easily DuaLip can be used in recommender applications. Finally, we will discuss future work, including several open research and development directions.