neurips2020-lio

Learning to Incentivize Other Learning Agents

The following are examples of learned behavior by LIO, actor-critic (AC) and inequity aversion (IA) agents in the 7x7 and 10x10 Cleanup maps.

7x7 Cleanup map

LIO

LIO agents find a division of labor: the purple agent specializes as a "river cleaner", while the blue agent becomes an "apple harvester".

Actor-Critic

Actor-critic (AC) agents sometimes perform cleaning, but both immediately compete for apples.

Inequity Aversion

Inequity aversion (IA) agents cooperate to some extent, but neither fully specializes as a "cleaner" or a "harvester".

10x10 Cleanup map

LIO

Actor-Critic

Inequity Aversion
