Idea

Further Extensions of Multi-player Online Learning

  • Multi-player bandit XX
    Multi-player linear bandit / RL / contextual bandit?

  • Fundamental problems
    Distributed algorithm
    Restricted communication
    Mutual influence

Multi-player MAB in Other Applications

  • Multi-player MAB in sharing economy
    Movement cost
    Random supply of customers

  • Multi-player MAB in Edge computing
    Mobility of users

  • Multi-player MAB in crowdsourcing
    Workers on an arm need to collaborate
    Complementary skills?

  • Multi-player MAB in Multi-path TCP

  • Multi-player MAB in distributed cloud systems (e.g., Alibaba)

  • Multi-player MAB in Recommender Systems

Sharing Economy Inspired Multi-player MAB

  • Cooperative Multi-player MAB
    Divide reward using the optimal reward as a value function
    Use the Shapley value to divide the reward
    Can be used to model the centralized multi-player MAB problem
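
A minimal sketch of the Shapley-value division, assuming a toy setting that is not part of these notes: three players, three arms with known mean rewards, and a hypothetical access restriction per player. The value of a coalition is taken to be the optimal total reward its members can collect on distinct arms, matching the "optimal reward as a value function" idea above. All names (ARM_MEANS, ACCESS, optimal_reward, shapley_values) are illustrative.

```python
from itertools import permutations
from math import factorial

# Assumed toy instance (hypothetical numbers, for illustration only).
ARM_MEANS = [0.9, 0.6, 0.3]                         # mean reward of each arm
ACCESS = {"p1": {0, 1}, "p2": {0}, "p3": {1, 2}}    # arms each player may pull


def optimal_reward(coalition):
    """Value function: best total reward when coalition members occupy distinct arms."""
    members = sorted(coalition)

    def best(i, used):
        if i == len(members):
            return 0.0
        result = best(i + 1, used)  # option: leave this player idle
        for arm in ACCESS[members[i]]:
            if arm not in used:
                result = max(result, ARM_MEANS[arm] + best(i + 1, used | {arm}))
        return result

    return best(0, frozenset())


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all join orders."""
    shares = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            shares[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: s / n_orders for p, s in shares.items()}


players = ["p1", "p2", "p3"]
shares = shapley_values(players, optimal_reward)
print(shares)
```

The shares sum to the value of the grand coalition, so the full optimal reward is divided with nothing left over. Enumerating all orderings is only practical for a handful of players; larger instances would need sampling-based approximations.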

  • 2021.03.16: the assignment problem (PPT)
    Two important factors:
    (1) restriction of accessing arms; (2) switch or movement cost
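
To make the two factors concrete, here is a small brute-force sketch of the assignment problem under assumed inputs: known arm rewards, a per-player movement cost to each arm, and a per-player set of accessible arms. The function name best_assignment and all numbers are hypothetical, and exhaustive search is used only for clarity. It also assumes there are at least as many arms as players and that every player is assigned.

```python
from itertools import permutations


def best_assignment(reward, move_cost, allowed):
    """Search all one-to-one player-to-arm assignments.

    reward[j]       : expected reward of arm j
    move_cost[i][j] : cost for player i to move to arm j
    allowed[i]      : set of arms player i may access
    Returns (net_value, arms), where arms[i] is the arm of player i,
    or (None, None) if no feasible assignment exists.
    """
    n_players = len(move_cost)
    n_arms = len(reward)
    best_value, best_arms = None, None
    for arms in permutations(range(n_arms), n_players):
        # Factor (1): skip assignments that violate access restrictions.
        if any(arm not in allowed[i] for i, arm in enumerate(arms)):
            continue
        # Factor (2): subtract each player's movement cost from the arm reward.
        net = sum(reward[arm] - move_cost[i][arm] for i, arm in enumerate(arms))
        if best_value is None or net > best_value:
            best_value, best_arms = net, arms
    return best_value, best_arms


reward = [1.0, 0.8, 0.5]
move_cost = [[0.0, 0.3, 0.6],
             [0.4, 0.0, 0.2]]
allowed = [{0, 1}, {1, 2}]
value, arms = best_assignment(reward, move_cost, allowed)
print(value, arms)
```

In this toy instance each player stays on the arm it can reach for free. For larger instances, a polynomial-time matching method such as the Hungarian algorithm would replace the enumeration.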

  • 2021.03.15: Discussion with John (PPT)
    Comment 1: switch or movement cost
    Comment 2: each player is restricted to a subset of arms
    Comment 3: different penalty factors for wasted players and wasted demand, respectively
    Comment 4: change of location after a delivery
    Comment 5: preferences between players and arms
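
One possible reading of the separate penalty factors from the third comment, sketched under assumptions that are not in the notes: an arm has an integer demand, each served unit of demand earns a fixed reward, and surplus players and unserved demand are penalized at different rates. The function name arm_payoff, its parameter names, and the default values are all hypothetical.

```python
def arm_payoff(n_players, demand, unit_reward=1.0,
               idle_player_penalty=0.5, unmet_demand_penalty=0.2):
    """Payoff of one arm with asymmetric penalties for the two kinds of waste."""
    served = min(n_players, demand)        # matched player-demand pairs
    idle = max(n_players - demand, 0)      # wasted players (no demand left)
    unmet = max(demand - n_players, 0)     # wasted demand (no player left)
    return (unit_reward * served
            - idle_player_penalty * idle
            - unmet_demand_penalty * unmet)


# Three players on an arm with demand 2: one player is wasted.
print(arm_payoff(3, 2))
# One player on an arm with demand 3: two units of demand are wasted.
print(arm_payoff(1, 3))
```

Keeping the two penalties separate lets the model express whether oversupplying players or leaving demand unserved is the costlier mistake.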

  • Multi-player MAB
    Key element 1: distributed learning of the model
    What information can assist distributed learning?
    What are the fundamental challenges in distributed learning?

    Key element 2: distributed coordination to the optimal solution
    What's the fundamental challenge in distributed coordination?
    What information can assist distributed coordination?

    Key element 3: incentive to induce collaboration or competition
    The challenge is that this significantly affects the information or signals available for distributed learning and distributed coordination.