Call for Papers

Submission deadline: August 1st, 2018 (extended from July 28th, 2018)

CMT submission site: https://cmt3.research.microsoft.com/OFFLINEEVAL2018

One of the main goals of offline metrics for recommender systems is to predict the future online performance of the same system, as measured by various user-based utility metrics, such as the time spent by a user on the website, the number of media items consumed, the number of attributed sales, or the reported user satisfaction. In practice, however, practitioners often observe significant discrepancies between the offline and online results of a new algorithm, and therefore tend to rely mostly on online methods such as A/B testing to evaluate their algorithms.

To address this gap, we welcome contributions that advance the current state of knowledge on offline recommendation metrics that correlate well with the final online performance of the recommender system, as well as new recommendation algorithms that directly optimize for online metrics.

We invite submissions of 2-8 pages to be presented as talks or posters. The reviews will be single-blind.

Potential contributions include (but are not limited to):

  • Framing the problem: what are we trying to solve exactly? This includes work on the theoretical foundations of the recommendation task and on the corresponding offline metrics:
    • Recommendation as a counterfactual inference problem. This encompasses all work on new offline metrics and new optimization criteria for recommendation using ideas from causal inference, such as:
      • Learning from data that is Missing Not at Random (MNAR)
      • Counterfactual Risk Minimization (CRM) and Batch Learning from Bandit Feedback (BLBF); a minimal sketch of the inverse propensity scoring (IPS) estimator at the core of this line of work appears after this list
      • Deconvolving recommendation-led from organic feedback in logged data
      • Causal inference using domain adaptation
    • Recommendation as a reinforcement learning problem. This is aimed at all work that frames recommendation as a reinforcement learning task and borrows ideas from RL for evaluating recommendation policies.
      • The use of simulation in recommender systems evaluation.
  • Studies on the correlation between offline and online metrics for recommendation.
  • More realistic offline metrics
    • Offline metrics for slate recommendation evaluation
    • Offline metrics for logged data with sequential recommendation exposure and delayed feedback
  • Datasets and toolkits
    • New exploration schemes for collecting more informative offline datasets.
    • Open datasets. Recommendation dataset releases that help bridge the gap between offline and online metrics.
    • Toolkits. New software for evaluation metrics, reproducible research, and baseline algorithms.
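
To make the counterfactual topic above concrete, the following is a minimal sketch of the clipped inverse propensity scoring (IPS) estimator that underlies much of the CRM/BLBF literature: it estimates, from logged bandit feedback alone, the average reward a new target policy would collect online. The function name, the toy numbers, and the clipping threshold below are illustrative assumptions, not part of any submission requirement.

    import numpy as np

    def ips_estimate(rewards, logging_probs, target_probs, clip=10.0):
        """Clipped IPS estimate of a target policy's expected online reward.

        rewards       : observed reward for each logged recommendation
        logging_probs : probability the logging policy assigned to each logged action
        target_probs  : probability the target policy assigns to the same action
        clip          : cap on the importance weights (trades variance for bias)
        """
        # Reweight each logged interaction by how much more (or less) likely
        # the target policy is to take the logged action than the logging policy was.
        weights = np.minimum(target_probs / logging_probs, clip)
        return float(np.mean(weights * rewards))

    # Toy logs: three recommendations with binary click feedback (hypothetical numbers).
    rewards = np.array([1.0, 0.0, 1.0])
    logging_probs = np.array([0.5, 0.2, 0.1])
    target_probs = np.array([0.8, 0.1, 0.3])
    print(ips_estimate(rewards, logging_probs, target_probs))  # ~1.53

In expectation, the importance weights make the offline estimate match the reward the target policy would have obtained online; clipping the weights is one standard way to reduce the estimator's variance at the cost of a small bias.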