A. I am new to RL and MARL. There are so many topics in MARL research; which one should I choose?
B. I come from RL research. How can I quickly grasp the main differences between MARL and RL?
C. I have a new idea for MARL and have already implemented it. Which MARL algorithms should I compare against as baselines?
D. Multi-agent environments/tasks are highly diverse. Which one should I choose as the starting point or testing bed?
E. If my implementation of a baseline MARL algorithm cannot reach the performance/learning curve reported previously, should I report my own result instead of the original one?
F. In the rebuttal stage, reviewers suggest I validate my idea on other MARL tasks, but it is clearly impossible to obtain all the results, both baseline and new, in the limited time. What should I do?
A. Good enough in some respects, but:
B. Usually restricted to one family of algorithms (e.g., centralized critic) or one family of tasks (e.g., cooperative tasks such as SMAC).
C. The framework is easy to follow but hard to extend.
D. The sampling/training efficiency is not good enough.
E. There is no hardware-level optimization (distributed/parallel computing).
F. The implementation details of the same algorithm differ across benchmarks (see the config sketch after this list).
G. Consequently, the performance reported for the same algorithm differs across benchmarks and is not comparable.
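To make F and G concrete, consider a hypothetical illustration: two benchmarks may both ship an algorithm called MAPPO yet disagree on implementation details that are known to change learning behavior, such as value clipping, advantage normalization, and parameter sharing. Every config key below is an invented placeholder, not any real benchmark's schema.

```python
# Hypothetical configs: the "same" MAPPO in two benchmarks, with
# different implementation details. All keys are illustrative
# placeholders, not a real benchmark's settings.
benchmark_a_mappo = {
    "use_value_clip": True,       # PPO-style clipping on the value loss
    "normalize_advantage": True,  # per-batch advantage normalization
    "share_policy_params": True,  # one policy shared by all agents
    "gae_lambda": 0.95,
}
benchmark_b_mappo = {
    "use_value_clip": False,
    "normalize_advantage": False,
    "share_policy_params": False,  # a separate policy per agent
    "gae_lambda": 1.0,
}

# Same algorithm name, diverging learning dynamics: the curves
# reported by the two benchmarks cannot be compared directly.
diffs = {k: (benchmark_a_mappo[k], benchmark_b_mappo[k])
         for k in benchmark_a_mappo
         if benchmark_a_mappo[k] != benchmark_b_mappo[k]}
print(diffs)
```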
A. It unifies diverse multi-agent tasks under one framework.
B. It covers a large variety of MARL algorithms.
C. Its implementation details and final performance can be trusted.
D. A new environment, a new idea, or an extra function can be easily tested and added with minimal coding work (see the interface sketch after this list).
E. Training/sampling efficiency is optimized at both the software and the hardware level.
F. It provides trained models and learning curves to save your time.
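As a rough sketch of what "one framework" (A) and "minimal coding work" (D) can mean in practice, the snippet below defines a small dict-keyed multi-agent interface in the style of RLlib/PettingZoo and a toy cooperative task behind it. The names `MultiAgentEnv` and `TwoAgentToyEnv` are illustrative stand-ins, not this library's actual API.

```python
from abc import ABC, abstractmethod

class MultiAgentEnv(ABC):
    """Unified per-agent, dict-keyed interface (in the style of
    RLlib/PettingZoo): cooperative and competitive tasks alike sit
    behind it, so one algorithm implementation serves every task."""

    @abstractmethod
    def reset(self) -> dict:
        """Return initial observations keyed by agent id."""

    @abstractmethod
    def step(self, actions: dict) -> tuple:
        """Take {agent_id: action}; return (obs, rewards, dones),
        each a dict keyed by agent id."""

class TwoAgentToyEnv(MultiAgentEnv):
    """Toy cooperative task: both agents are rewarded when their
    actions match. Stands in for a real task such as an SMAC map."""

    def __init__(self, horizon: int = 10):
        self.horizon = horizon
        self.t = 0

    def reset(self) -> dict:
        self.t = 0
        return {"agent_0": 0.0, "agent_1": 0.0}

    def step(self, actions: dict) -> tuple:
        self.t += 1
        shared_reward = float(actions["agent_0"] == actions["agent_1"])
        obs = {aid: float(self.t) for aid in actions}
        rewards = {aid: shared_reward for aid in actions}  # cooperative: shared reward
        dones = {aid: self.t >= self.horizon for aid in actions}
        return obs, rewards, dones

# Any algorithm written against MultiAgentEnv runs on any task wrapped
# behind it; swapping the task is a one-line change.
env = TwoAgentToyEnv()
obs = env.reset()
obs, rewards, dones = env.step({"agent_0": 1, "agent_1": 1})
assert rewards == {"agent_0": 1.0, "agent_1": 1.0}
```

Under this convention, adding a new environment only requires implementing `reset` and `step`; every algorithm written against the shared interface can consume it unchanged.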