From Vision to Language in Virtual Environments