Idea
More Extensions of Multi-player Online Learning
Multi-player bandit XX
Multi-player linear bandit / RL / contextual bandit?
Fundamental problems
Distributed algorithm
Restricted communication
Mutual influence
Multi-player MAB in Other Applications
Multi-player MAB in sharing economy
Movement cost
Random supply of customers
Multi-player MAB in Edge computing
Mobility of users
Multi-player MAB in crowdsourcing
Workers on an arm need to collaborate
Skill complement?
Multi-player MAB in Multi-path TCP
Multi-player MAB in a distributed cloud system such as Alibaba's
Multi-player MAB in Recommender Systems
Sharing Economy Inspired Multi-player MAB
Cooperative Multi-player MAB
Divide reward using the optimal reward as a value function
Use the Shapley value to divide the reward
Can be used to model the centralized multi-player MAB problem
2021.03.16: the assignment problem (PPT)
Two important factors: (1) restrictions on accessing arms; (2) switching or movement cost
2021.03.15: Discussion with John (PPT)
Comment 1: switching or movement cost
Comment 2: each player is restricted to a subset of arms
Comment 3: different penalty factors for wasting players and for wasting demands
Comment 4: change of location after delivering
Comment 5: preferences among players and arms
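The assignment-problem view with the two factors noted above (restricted access to arms, and switching or movement cost) can be sketched in a few lines. This is only an illustrative sketch with known rewards, not a learning algorithm: the function name `best_assignment`, the matrices, and the choice of net value as reward minus movement cost are all assumptions made for the example. It brute-forces every one-to-one assignment, so it is only suitable for tiny instances.

```python
from itertools import permutations


def best_assignment(reward, move_cost, allowed):
    """Brute-force the best one-to-one assignment of players to arms.

    reward[p][a]    : expected reward of player p on arm a (hypothetical numbers)
    move_cost[p][a] : cost for player p to move to arm a (hypothetical numbers)
    allowed[p]      : set of arms that player p is permitted to access
    Assumes the number of players does not exceed the number of arms.
    """
    n_players = len(reward)
    n_arms = len(reward[0])
    best_value, best_arms = float("-inf"), None
    # Each permutation gives player p the arm arms[p], with no arm shared.
    for arms in permutations(range(n_arms), n_players):
        # Skip assignments that violate a player's access restriction.
        if any(a not in allowed[p] for p, a in enumerate(arms)):
            continue
        # Net value: reward collected minus the cost of moving there.
        value = sum(reward[p][a] - move_cost[p][a] for p, a in enumerate(arms))
        if value > best_value:
            best_value, best_arms = value, arms
    return best_value, best_arms


# Two players, three arms; player 0 cannot reach arm 2.
reward = [[9, 5, 2], [8, 7, 1]]
move_cost = [[0, 1, 3], [3, 0, 2]]
allowed = [{0, 1}, {0, 1, 2}]
print(best_assignment(reward, move_cost, allowed))
```

Tightening the access sets changes the optimum: if player 0 may only use arm 1, the best remaining assignment sends player 1 to arm 0 even though it pays a movement cost there.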
Multi-player MAB
Key element 1: distributed learning of the model
What information can assist distributed learning?
What are the fundamental challenges in distributed learning?
Key element 2: distributed coordination to the optimal solution
What's the fundamental challenge in distributed coordination?
What information can assist distributed coordination?
Key element 3: incentive to induce collaboration or competition
The challenge is that this significantly affects the information or signal in distributed learning or distributed coordination.
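One concrete incentive mechanism already listed under Cooperative Multi-player MAB is to divide the reward by Shapley value, with the optimal reward of a coalition as the value function. A minimal sketch of that idea follows, assuming known rewards and a tiny instance. The names `optimal_reward` and `shapley_values` and the reward table are hypothetical, and the exact computation enumerates all player orderings, so it does not scale beyond a handful of players.

```python
from itertools import permutations


def optimal_reward(coalition, reward):
    """Value function: best total reward the coalition can collect
    when each member takes a distinct arm.

    reward[p] is the list of per-arm rewards of player p (hypothetical numbers).
    Assumes the coalition is no larger than the number of arms.
    """
    members = sorted(coalition)
    n_arms = len(next(iter(reward.values())))
    best = 0
    for arms in permutations(range(n_arms), len(members)):
        best = max(best, sum(reward[p][a] for p, a in zip(members, arms)))
    return best


def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal contribution
    over every order in which the players could join the coalition."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


# Two players, two arms; both players prefer arm 0.
reward = {"A": [6, 2], "B": [4, 0]}
shares = shapley_values(["A", "B"], lambda s: optimal_reward(s, reward))
print(shares)
```

In this example the shares add up to the optimal joint reward of the full group, so the division is budget balanced. Player B still receives a positive share even though the optimum moves A off its preferred arm to make room for B.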