Competitive Policy Optimization
A novel policy gradient approach that exploits the game-theoretic nature of competitive games to derive policy updates
This page contains experimental videos comparing Gradient Descent Ascent (GDA) with Competitive Policy Gradient (CoPG), and Trust Region Gradient Descent Ascent (TRGDA) with Trust Region Competitive Policy Optimization (TRCoPO).
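For intuition about the kind of difference the videos illustrate, here is a minimal sketch on a scalar bilinear game f(x, y) = x * y, where x minimizes and y maximizes. It is not the policy-gradient code behind the experiments. `gda_step` is plain simultaneous gradient descent ascent, and `cgd_step` uses the closed-form competitive gradient descent update for this particular game, assumed here as a simple stand-in for the competitive update that CoPG carries over to policies. All function names, the step size, and the starting point are hypothetical choices.

```python
import math


def gda_step(x, y, lr):
    """Simultaneous gradient descent ascent on f(x, y) = x * y."""
    # df/dx = y (x descends), df/dy = x (y ascends)
    return x - lr * y, y + lr * x


def cgd_step(x, y, lr):
    """Competitive gradient descent on f(x, y) = x * y (assumed stand-in).

    For this game the mixed derivative d2f/dxdy is 1, so the update
    reduces to the scalar closed form below.
    """
    denom = 1.0 + lr * lr
    dx = -lr * (y + lr * x) / denom
    dy = lr * (x - lr * y) / denom
    return x + dx, y + dy


def run(step, x=1.0, y=1.0, lr=0.2, iters=100):
    """Apply `step` repeatedly and return the distance to the equilibrium (0, 0)."""
    for _ in range(iters):
        x, y = step(x, y, lr)
    return math.hypot(x, y)


gda_dist = run(gda_step)
cgd_dist = run(cgd_step)
print(f"GDA distance to equilibrium: {gda_dist:.3f}")
print(f"CGD distance to equilibrium: {cgd_dist:.3f}")
```

Under these assumptions, GDA spirals away from the equilibrium at (0, 0), while the competitive update contracts toward it.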
Experiment: Car Racing
GAIL Case Study
Experiment: Rock Paper Scissors
GDA vs GDA
CoPG vs CoPG
TRGDA vs TRGDA
TRCoPO vs TRCoPO
Experiment: Markov Soccer
GDA vs GDA
CoPG vs CoPG
TRGDA vs TRGDA
TRCoPO vs TRCoPO
Experiment: Matching Pennies
GDA vs GDA
CoPG vs CoPG
TRGDA vs TRGDA
TRCoPO vs TRCoPO