Workshop on Multi-Task Learning in Computer Vision
ICCV 2021


Despite recent progress in deep learning, most approaches still adopt a silo-like solution, training a separate neural network for each individual task. Many real-world problems, however, call for a multi-modal approach and, therefore, for multi-tasking models. In this full-day workshop, we aim to provide a well-rounded view of recent trends in multi-task learning, while also identifying the current challenges in the field. More specifically, we examine a variety of subtopics under the multi-task learning setup, including network architecture design, neural architecture search, optimization strategies, task transfer relationships, meta-learning, single-tasking of multiple tasks, and more.

With this workshop, we hope to bring together a diverse group of researchers who have worked on multi-task learning, and to draw broader attention to a topic that has so far been largely under-explored by the computer vision community.

Update: The recordings of our invited talks are now available on YouTube.


The workshop will be hosted as a virtual event on October 16th, 2021. A tentative schedule can be found here. The workshop will be accessible through the ICCV platform.

Accepted Papers

We accepted full papers of up to 8 pages. Authors will present their work in a live talk at the workshop.

The papers can be found in the ICCV workshop proceedings. A list of accepted papers is provided below.

The papers will be presented live during the workshop; all times below are in CET.

  • 15:10 Concurrent Discrimination and Alignment for Self-Supervised Feature Learning; Anjan Dutta (University of Exeter)*; Massimiliano Mancini (University of Tübingen); Zeynep Akata (University of Tübingen)

  • 15:15 Multi-Modal RGB-D Scene Recognition Across Domains; Andrea Ferreri (Politecnico di Torino); Silvia Bucci (Italian Institute of Technology)*; Tatiana Tommasi (Politecnico di Torino)

  • 15:20 In Defense of the Learning Without Forgetting for Task Incremental Learning; Guy Oren (Tel Aviv University)*; Lior Wolf (Tel-Aviv University)

  • 15:25 MILA: Multi-Task Learning from Videos via Efficient Inter-Frame Attention; Donghyun Kim (Boston University)*; Tian Lan (Amazon); Chuhang Zou (Amazon); Ning Xu (Kuaishou Technology); Bryan Plummer (Boston University); Stan Sclaroff (Boston University); Jayan Eledath (Amazon); Gerard Medioni (USC)

  • 15:30 ConvNets vs. Transformers: Whose Visual Representations are More Transferable?; Hong-Yu Zhou (The University of Hong Kong)*; Chixiang Lu (Huazhong University of Science and Technology); Sibei Yang (ShanghaiTech University); Yizhou Yu (The University of Hong Kong)

  • 15:35 UniNet: A Unified Scene Understanding Network and Exploring Multi-Task Relationships through the Lens of Adversarial Attacks; Naresh Kumar Gurulingan (NavInfo Europe); Elahe Arani (NavInfo Europe)*; Bahram Zonooz (NavInfo Europe)

  • 15:40 Audio-Visual Transformer Based Crowd Counting; Usman Sajid (University of Kansas)*; Xiangyu Chen (The University of Kansas); Hasan Sajid (National University of Sciences and Technology); Taejoon Kim (University of Kansas); Guanghui Wang (Ryerson University)

  • 15:45 Distribution-aware Multitask Learning for Visual Relations; Alakh Desai (UCSD); Tz-Ying Wu (UCSD)*; Subarna Tripathi (Intel Labs); Nuno Vasconcelos (UCSD, USA)


Topics

  • Multi-task learning architectures and optimization

  • Neural architecture search for multi-task learning

  • Semi- and weakly-supervised multi-task learning

  • Active learning for multi-task learning

  • Multi-domain learning

  • Auxiliary task learning

  • Domain adaptation

  • Multi-task learning for robustness

  • Lifelong learning

  • Applications


  • Thierry Deruyttere, KU Leuven

  • Matthias De Lange, KU Leuven

  • Dusan Grujicic, KU Leuven

  • Anthony Hu, University of Cambridge

  • Eugene Lee, National Chiao Tung University

  • Shikun Liu, Imperial College London

  • Nicola Marinello, KU Leuven

  • Anton Obukhov, ETH Zurich

  • Vaishakh Patil, ETH Zurich

  • Arun Balajee Vasudevan, ETH Zurich

  • Eli Verwimp, KU Leuven

  • Xinshuo Weng, Carnegie Mellon University

  • Zhenyu Zhang, Nanjing University of Science and Technology

Invited Speakers

Rich Caruana
(Microsoft Research)

Chelsea Finn
(Stanford University)

Judy Hoffman
(Georgia Tech)

Iasonas Kokkinos
(University College London)

Andrew Rabinovich
(Headroom Inc.)

Raquel Urtasun
(Waabi Innovation Inc. &
University of Toronto)

Dengxin Dai
(MPI for Informatics & ETH Zurich)

Luc Van Gool
(KU Leuven & ETH Zurich)


We would like to acknowledge support from Toyota via the TRACE project and from MACCHINA (KU Leuven, C14/18/065). This initiative is also sponsored by the Flemish Government under the Flemish AI program. Finally, we thank the people who helped us with the paper review process (see call for papers).