Workshop on Multimodal Understanding and Learning for Embodied Applications