My work develops multi-modal deep generative models that learn multi-sensory fusion. Within artificial intelligence, this approach contributes to unsupervised, curiosity-driven learning of active sensing for a fleet of robots equipped with visual, depth, and proximity sensors.