Hopper-v3

Hopper-v3 considers three objective functions: the forward speed (f1), the jumping height (f2), and the energy consumption (f3). The definitions of the state space and the action space of Hopper-v3 are as follows:

The non-dominated policies obtained by different algorithms under different preferences are shown below. We also provide video clips of the best policy found by PBMORL.

Non-dominated policies obtained by PBMORL versus PGMORL and RA with different preferences on each objective.

Non-dominated policies obtained by PBMORL versus MORL-Adaptation, META-MORL, MOMPO and MORAL (f1 is preferred).

Non-dominated policies obtained by PBMORL versus MORL-Adaptation, META-MORL, MOMPO and MORAL (f2 is preferred).

Non-dominated policies obtained by PBMORL versus MORL-Adaptation, META-MORL, MOMPO and MORAL (f3 is preferred).
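For readers unfamiliar with the term, the sketch below illustrates what "non-dominated" means in the comparisons above. It is a minimal illustration, not the evaluation code used for these results. It assumes each policy is summarized by a return vector (f1, f2, f3) in which every entry is to be maximized, so energy consumption is entered with a negative sign. The function names and the numeric values are hypothetical.

```python
def dominates(a, b):
    """Return True if vector `a` Pareto-dominates vector `b`.

    Assumption: every objective is to be maximized, so `a` dominates `b`
    when it is at least as good everywhere and strictly better somewhere.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def non_dominated(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (speed, height, -energy) returns for four policies.
candidates = [
    (3.0, 1.0, -2.0),
    (2.0, 2.0, -1.0),
    (1.0, 1.0, -3.0),  # worse than the first policy on every objective
    (2.5, 0.5, -2.5),  # also worse than the first policy on every objective
]

front = non_dominated(candidates)
print(front)
```

With these made-up values, only the first two policies survive: each is better than the other on at least one objective, so neither dominates the other.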
