I am currently a doctoral research associate at the Knowledge Technology Group at the University of Hamburg. My research focuses on developing data-efficient deep reinforcement learning algorithms for robot motor control by applying biological principles of self-organization and intrinsic motivation. I also work on meta-decision making, strategy selection and algorithms that integrate model-based and model-free control for robot skill learning. My research interests include:
- Neural Networks
- Reinforcement Learning
- Cognitive Robotics
Efficient Intrinsically Motivated Robotic Grasping with Learning-Adaptive Imagination in Latent Space
M. B. Hafez et al. (2019). ICDL-EpiRob.
Inspired by human mental simulation of motor behavior and its role in skill acquisition, we show that:
(1) The sample efficiency of learning vision-based robotic grasping can be greatly improved by performing experience imagination in a learned latent space and using the imagined data for training grasping policies.
(2) The proposed adaptive imagination, in which imagined rollouts are generated with probability proportional to the prediction reliability of the local world model, outperforms static imagination. The imagination depth is determined adaptively from spatially and temporally local information, namely the average prediction error in the traversed latent-space regions.
(3) Using intrinsic reward based on model learning progress leads to data that improves future predictions necessary for imagination.
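As a rough illustration of point (2), the sketch below shows one way an imagined rollout could be gated by model reliability and cut off when the local prediction error grows too large. It is not the implementation from the paper: the function names, the mapping from error to reliability, and the `max_error` cutoff are all assumptions made for this example.

```python
import random


def reliability(avg_error, scale=1.0):
    """Map a non-negative average prediction error to a value in (0, 1].

    The 1 / (1 + error) form is an assumption chosen for illustration;
    it only needs to decrease as the error grows.
    """
    return 1.0 / (1.0 + avg_error / scale)


def should_imagine(avg_error, rng=random):
    """Start an imagined rollout with probability equal to the reliability."""
    return rng.random() < reliability(avg_error)


def imagination_depth(region_errors, max_error):
    """Count how many consecutive latent regions keep their average
    prediction error at or below max_error (a hypothetical cutoff)."""
    depth = 0
    for err in region_errors:
        if err > max_error:
            break
        depth += 1
    return depth
```

For example, `imagination_depth([0.1, 0.2, 0.9, 0.1], max_error=0.5)` stops at the third region, so the rollout would be two steps deep.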
Curious Meta-Controller: Adaptive Alternation between Model-Based and Model-Free Control in Deep Reinforcement Learning
M. B. Hafez et al. (2019). IJCNN.
In this paper, we show that using curiosity feedback based on prediction learning progress to arbitrate between model-based and model-free decisions accelerates the learning of pixel-level control policies.
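The arbitration idea can be sketched as follows. This is a simplified, hypothetical rule and not the one from the paper: learning progress is estimated as the drop in mean prediction error between two consecutive windows, and the model-based controller is chosen only while that progress is positive. The class name, window size, and threshold are assumptions.

```python
from collections import deque


class LearningProgressTracker:
    """Estimates learning progress from a stream of prediction errors."""

    def __init__(self, window=5):
        self.window = window
        # Keep the two most recent windows of errors.
        self.errors = deque(maxlen=2 * window)

    def update(self, error):
        self.errors.append(error)

    def progress(self):
        """Mean error of the older window minus mean error of the recent one.

        Returns 0.0 until both windows are full.
        """
        if len(self.errors) < 2 * self.window:
            return 0.0
        errs = list(self.errors)
        older = sum(errs[:self.window]) / self.window
        recent = sum(errs[self.window:]) / self.window
        return older - recent


def choose_controller(progress, threshold=0.0):
    """Hypothetical rule: trust the model while its predictions are improving."""
    return "model_based" if progress > threshold else "model_free"
```

With `window=2` and errors `1.0, 1.0, 0.5, 0.5`, the tracker reports a progress of `0.5`, so `choose_controller` selects `"model_based"`.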
Deep Intrinsically Motivated Continuous Actor-Critic for Efficient Robotic Visuomotor Skill Learning
M. B. Hafez et al. (2019). J. Behav. Robot.
This work demonstrates that spatially and temporally local learning progress in a growing ensemble of local world models provides an effective intrinsic reward, enabling directed exploration for vision-based grasp learning on a developmental humanoid robot. The work also suggests that training a small actor network on low-dimensional feature representations, learned for self-reconstruction and reward prediction, leads to fast and stable learning.
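A minimal sketch of a growing set of local regions with a per-region learning-progress reward is given below. It is a toy stand-in under stated assumptions, not the method used in the paper: regions are simple prototype vectors, a new region is added when no prototype lies within `new_region_distance`, and each region keeps an exponentially smoothed prediction error. All names and parameters are hypothetical.

```python
import math


class LocalRegion:
    def __init__(self, center):
        self.center = list(center)
        self.avg_error = None  # smoothed prediction error; None until first update


class GrowingEnsemble:
    def __init__(self, new_region_distance=1.0, smoothing=0.5):
        self.regions = []
        self.new_region_distance = new_region_distance
        self.smoothing = smoothing

    def locate(self, feature):
        """Return the nearest region, adding a new one if none is close enough."""
        nearest = min(
            self.regions,
            key=lambda r: math.dist(r.center, feature),
            default=None,
        )
        if nearest is None or math.dist(nearest.center, feature) > self.new_region_distance:
            nearest = LocalRegion(feature)
            self.regions.append(nearest)
        return nearest

    def intrinsic_reward(self, feature, prediction_error):
        """Reward equals the decrease in the region's smoothed prediction error."""
        region = self.locate(feature)
        if region.avg_error is None:
            # First visit: no earlier error to compare against.
            region.avg_error = prediction_error
            return 0.0
        previous = region.avg_error
        s = self.smoothing
        region.avg_error = (1 - s) * previous + s * prediction_error
        return previous - region.avg_error
```

In this sketch the reward is positive only where the local prediction error is currently falling, which is what steers exploration toward regions that are still being learned.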