Language-driven Multimodal Intelligence Lab