Learning with Muscles: Benefits for Data-Efficiency and Robustness in Anthropomorphic Tasks

Isabell Wochner*, Pierre Schumacher*, Georg Martius, Dieter Büchler, Syn Schmitt, Daniel F.B. Haeufle

Humans outperform robots in robustness, versatility, and the learning of new tasks across a wide variety of movements. We hypothesize that highly nonlinear muscle dynamics play a large role in providing inherent stability, which is favorable to learning. While recent advances have been made in applying modern learning techniques to muscle-actuated systems, both in simulation and in robotics, no detailed analysis has so far been performed to show the benefits of muscles when learning from scratch. Our study closes this gap and showcases the potential of muscle actuators for core robotics challenges in terms of data-efficiency, hyperparameter sensitivity, and robustness.

Smooth Point-Reaching (OC)

results_arm26_muscles_mpc_M_3_N_30_bobyqa_smoothpointreaching-90-90_randoms15_controlres_0_01_2022_05_06_123204.mkv

Muscle-actuated motion

results_arm26_torques_mpc_M_3_N_30_bobyqa_smoothpointreaching-90-90_randoms15_controlres_0_01_2022_05_06_112336.mkv

Torque-actuated motion

A smooth point-reaching motion with the arm26 model is compared for muscle actuators and torque actuators. The green lines in the muscle-actuated motion represent the forces exerted by the muscles: the larger the line, the larger the muscle force.

Squatting (OC)

results_allminsimplelegs_muscles_mpc_M_3_N_20_bobyqa_squatjowa_randoms15_controlres_0_01_2022_04_22_175844.mkv

Muscle-actuated motion

results_allminsimplelegs_torques_mpc_M_3_N_20_bobyqa_squatjowa_randoms15_controlres_0_01_2022_04_22_175727.mkv

Torque-actuated motion

A squatting motion using a reduced FullBody model in 3D is shown. The muscle-actuated motion appears more stable towards the endpoint, which is most visible in the smaller swinging motion of the freely movable arms.

Hitting a ball with high velocity (OC)

results_arm26ball_muscles_cmaes_fastballvel_randoms16_controlres_0_15_sigma_0.2_2022_06_12_024332.mkv

Muscle-actuated motion

results_arm26ball_torques_cmaes_fastballvel_randoms16_controlres_0_15_sigma_0.2_2022_06_12_031521.mkv

Torque-actuated motion

The goal was to hit the falling ball with a high velocity in the z-direction (against gravity). In both cases, the controller learns to wait briefly at the beginning so that the arm can hit the ball in the right position at the optimal time.

High-Jumping (OC)

results_allminsimplelegs_muscles_cmaes_jumphigh_pandy_randoms15_controlres_0_3_sigma_0.2_2022_06_09_084014.mkv

Muscle-actuated motion

results_allminsimplelegs_torques_cmaes_jumphigh_pandy_randoms14_controlres_0_05_sigma_0.2_2022_06_09_000859.mkv

Torque-actuated motion

A high-jumping motion using a reduced FullBody model in 3D is shown. The muscle-actuated motion appears more physiological because the joint limits are less likely to be reached (see the pelvis motion in the torque-actuated case at the beginning and end of the movement).

Point-reaching with perturbation (MPC)

Overlay_arm26_muscles_addw1kglowerarm.mkv

Muscle-actuated motion

Overlay_arm26_torques_addw1kglowerarm.mp4

Torque-actuated motion

An unknown weight of 1 kg was added to the lower arm. The perturbed motion is shown in comparison to the unperturbed motion (overlaid transparently). In both cases the controller is still able to perform the motion, although more slowly than in the original motion. The deviation from the desired endpoint at the end of the movement is larger in the torque-actuated case than in the muscle-actuated case.

Squatting with perturbation (MPC)

results_allminsimplelegs_muscles_addperturb-50N_mpc_M_3_N_20_bobyqa_squatjowa_randoms15_controlres_0_01_2022_05_07_065345.mkv

Muscle-actuated motion

results_allminsimplelegs_torques_addperturb-50N_mpc_M_3_N_20_bobyqa_squatjowa_randoms15_controlres_0_01_2022_05_07_063629.mkv

Torque-actuated motion

An unknown perturbation force of -50 N was applied to the hip angle in the middle of the motion. In the muscle-actuated case, almost no effect is visible, whereas in the torque-actuated case the controller struggles to keep the model upright.

Precise point-reaching (RL)

reaching_muscle.mp4

Muscle-actuated motion

reaching_torque.mp4

Torque-actuated motion

Chaotic load point-reaching (RL)

reaching_muscle_ball.mp4

Muscle-actuated motion

reaching_torque_ball.mp4

Torque-actuated motion

Hopping (RL)

biped_muscle_hopping.mp4

Muscle-actuated motion

jumping_torque.mp4

Torque-actuated motion

OC: Optimal control

MPC: Model predictive control

RL: Reinforcement learning