Vision to Language Tasks

Action Recognition

Fine-Grained Action Recognition

Multi-Label Classification

Visual Representation