Idea

Further extensions of multi-player online learning

Multi-player bandit XX
Multi-player linear bandit / RL / contextual bandit?
Fundamental problems
Distributed algorithm
Restricted communication
Mutual influence

Multi-player MAB in sharing economy
Movement cost
Random supply of customers
Multi-player MAB in edge computing
Mobility of users
Multi-player MAB in crowdsourcing
Workers on an arm need to collaborate
Skill complementarity?
Multi-player MAB in Multipath TCP
Multi-player MAB in distributed cloud systems such as Alibaba
Multi-player MAB in Recommender Systems

Cooperative Multi-player MAB
Divide reward using the optimal reward as a value function
Use the Shapley value to divide the reward
Can be used to model the centralized multi-player MAB problem
2021.03.16: the assignment problem (PPT)
Two important factors: (1) restrictions on which arms a player can access; (2) switching or movement cost
2021.03.15: Discussion with John (PPT)
Comment1: switching or movement cost
Comment2: each player is restricted to a subset of arms
Comment3: different penalty factors for wasted players and wasted demand, respectively
Comment4: change of location after delivering
Comment5: preference among players and arms
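Sketch of the reward-division idea above: take the optimal reward of a coalition as the value function v(S), then split v(all players) by Shapley value. This is only a toy illustration under assumptions that are not in the notes: the arm rewards, access sets, and movement costs below are made-up numbers, each arm serves at most one player, and v(S) is found by brute force, which is only feasible for a handful of players.

```python
from itertools import permutations
from math import factorial

# Hypothetical toy instance; every number here is an illustrative assumption.
MEAN_REWARD = {"a": 1.0, "b": 0.6, "c": 0.3}        # expected reward of each arm
ALLOWED = {0: {"a", "b"}, 1: {"a"}, 2: {"b", "c"}}  # arms each player may access
MOVE_COST = {(0, "a"): 0.1, (0, "b"): 0.0, (1, "a"): 0.2,
             (2, "b"): 0.1, (2, "c"): 0.0}          # cost for a player to reach an arm


def coalition_value(players):
    """v(S): best total net reward when players in S take distinct allowed arms."""
    players = sorted(players)

    def best(i, used):
        if i == len(players):
            return 0.0
        p = players[i]
        value = best(i + 1, used)  # option: player p stays idle
        for arm in ALLOWED[p] - used:
            gain = MEAN_REWARD[arm] - MOVE_COST[(p, arm)]
            value = max(value, gain + best(i + 1, used | {arm}))
        return value

    return best(0, frozenset())


def shapley_values(players):
    """Average each player's marginal contribution over all join orders."""
    players = list(players)
    shares = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition, prev = [], 0.0
        for p in order:
            coalition.append(p)
            cur = coalition_value(coalition)
            shares[p] += cur - prev
            prev = cur
    n_orders = factorial(len(players))
    return {p: s / n_orders for p, s in shares.items()}


if __name__ == "__main__":
    print(shapley_values([0, 1, 2]))
```

By construction the shares add up to the optimal centralized reward v(all players), so the same value function also describes the centralized assignment problem with access restrictions and movement costs.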

Multi-player MAB
Key element 1: distributed learning of the model
What information can assist distributed learning?
What are the fundamental challenges in distributed learning?

Key element 2: distributed coordination to the optimal solution
What's the fundamental challenge in distributed coordination?
What information can assist distributed coordination?

Key element 3: incentive to induce collaboration or competition
The challenge is that this significantly affects the information or signal available in distributed learning or distributed coordination.
