Contrastive Learning: A Heterogeneous Perspective

SDM23 Tutorial

Abstract


In many real-world applications, data are usually collected from multiple sources and are often characterized by multiple labels, thus exhibiting the coexistence of multiple types of heterogeneity. At the same time, many state-of-the-art techniques heavily rely on abundant label information, leading to sub-optimal performance in scenarios with insufficient labels. To address this issue, many researchers have paid great attention to contrastive learning, which achieves prominent performance by utilizing rich unlabeled data. However, contrastive learning has potential drawbacks that may deteriorate model performance in some scenarios, such as the class collision problem and the bias introduced by the choice of data augmentation methods. Many recent works have been proposed to tackle these challenges and further improve the performance of contrastive learning-based methods. This tutorial aims to provide a concise review of state-of-the-art techniques for contrastive learning in the heterogeneous setting, e.g., the multi-view, multi-label, and multi-task settings. In particular, we first present a general overview of contrastive learning from a heterogeneous perspective. Then, we focus on multi-view contrastive learning and multi-label contrastive learning. Finally, we conclude the tutorial by discussing the potential challenges and shedding light on future directions of heterogeneous contrastive learning.


Tutorial Outline

Introduction (15 Minutes)

In this part, we start with the definitions of heterogeneous learning and self-supervised learning, specifically multi-view learning, multi-label learning, and contrastive learning. Then, we motivate the combination of contrastive learning with heterogeneous learning and discuss the potential challenges. Next, we give an overview of the tutorial, which is divided into two parts: contrastive learning in the multi-view setting, and contrastive learning in the multi-label or multi-task setting.

Part I: Contrastive Learning Meets Multi-view Learning (45 Minutes)

In this part, we start with unsupervised multi-view contrastive learning methods, which aim to maximize the similarity between two views of the same sample. In particular, we cover two types of unsupervised multi-view contrastive learning methods, i.e., unsupervised contrastive learning with data augmentation and unsupervised contrastive learning with multiple views. Then, we move to supervised contrastive learning methods that take label information into consideration. In particular, we discuss the challenges that unsupervised methods face and how supervised methods address them.

  • Unsupervised multi-view contrastive learning: In this part, we start with unsupervised multi-view contrastive learning methods based on data augmentation (e.g., MoCo, SimCLR), which apply different augmentations to generate two views of the same sample and learn compact representations; a minimal loss sketch is given after this list. Then, we move to unsupervised contrastive learning methods that deal with the existence of multiple views.

  • Supervised multi-view contrastive learning: In this part, we first introduce the class collision issue in contrastive learning, and then discuss how supervised contrastive learning methods address it by utilizing label information; a supervised loss sketch is also given after this list. In particular, this section covers both weakly supervised and fully supervised contrastive learning methods.
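
The following is a minimal sketch, assuming a PyTorch setup, of the SimCLR-style NT-Xent objective referenced above: the embeddings of two augmented views of the same sample are treated as positives, and all other samples in the batch serve as negatives. The function name nt_xent and the temperature value are illustrative choices, not part of any specific library.

    # Minimal NT-Xent (SimCLR-style) contrastive loss; a sketch, not a reference implementation.
    import torch
    import torch.nn.functional as F

    def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
        """z1, z2: (n, d) embeddings of two augmented views of the same n samples."""
        n = z1.size(0)
        z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2n, d), unit-norm rows
        sim = z @ z.t() / temperature                         # pairwise cosine similarities
        sim.fill_diagonal_(float('-inf'))                     # exclude self-similarity
        # The positive for row i is the other view of the same sample: i + n (mod 2n).
        targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
        return F.cross_entropy(sim, targets)

    # Usage: z1, z2 = encoder(aug1(x)), encoder(aug2(x)); loss = nt_xent(z1, z2)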
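
Below is a hedged sketch of a supervised contrastive (SupCon-style) loss, in which samples sharing a class label are treated as positives; this is the mechanism by which label information mitigates class collision. The details (masking, averaging over positives) follow one common formulation and are assumptions rather than a canonical definition.

    # Supervised contrastive (SupCon-style) loss sketch: same-label samples are positives.
    import torch
    import torch.nn.functional as F

    def sup_con(z: torch.Tensor, labels: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
        """z: (n, d) embeddings; labels: (n,) integer class labels."""
        z = F.normalize(z, dim=1)
        sim = z @ z.t() / temperature
        self_mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
        pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
        # Log-softmax over all other samples, then average over each anchor's positives.
        log_prob = sim - torch.logsumexp(sim.masked_fill(self_mask, float('-inf')), dim=1, keepdim=True)
        pos_count = pos_mask.sum(dim=1)
        loss = -(log_prob * pos_mask).sum(dim=1) / pos_count.clamp(min=1)
        return loss[pos_count > 0].mean()   # ignore anchors with no positive in the batch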

Part II: Contrastive Learning Meets Multi-label Learning (45 Minutes)

In this part, we present recent advances in contrastive learning in the context of multi-label learning and multi-task learning. Different from multi-view contrastive learning, which explores the similarity of samples at the feature level, contrastive multi-label and multi-task learning pay attention to label information and explore the relatedness of different labels or tasks. Contrastive transfer learning-based methods adopt the idea of traditional contrastive learning by pulling samples with similar semantic information closer together in the latent space.

  • Contrastive multi-label learning: In this part, we introduce contrastive learning-based methods in the context of multi-label learning. Specifically, we focus on how contrastive multi-label learning methods explore the relatedness of different labels; an illustrative loss sketch is given after this list.

  • Contrastive multi-task learning: In this part, we focus on contrastive multi-task learning methods, especially transfer learning. Transfer learning aims to learn a latent representation of samples in the source domain and transfer it to the target domain. In particular, we discuss how contrastive learning is applied to extract knowledge from one task and transfer it to another. This topic covers contrastive pre-training models and transfer learning models; a pre-training-and-transfer sketch also follows this list.
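
To make the idea of pulling samples with similar label vectors closer concrete, here is an illustrative sketch (our own simplification, not a specific published method) that weights each pair of samples by the Jaccard similarity of their multi-hot label vectors, so that pairs sharing more labels contribute more strongly as positives.

    # Illustrative multi-label contrastive loss: pair weights are the Jaccard similarity
    # of the multi-hot label vectors (an assumption for exposition, not a named method).
    import torch
    import torch.nn.functional as F

    def multilabel_contrastive(z: torch.Tensor, y: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
        """z: (n, d) embeddings; y: (n, L) multi-hot label matrix."""
        z = F.normalize(z, dim=1)
        yf = y.float()
        sim = z @ z.t() / temperature
        inter = yf @ yf.t()                                    # shared labels per pair
        union = yf.sum(1, keepdim=True) + yf.sum(1) - inter
        weight = inter / union.clamp(min=1.0)                  # Jaccard similarity in [0, 1]
        self_mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
        weight = weight.masked_fill(self_mask, 0.0)
        log_prob = sim - torch.logsumexp(sim.masked_fill(self_mask, float('-inf')), dim=1, keepdim=True)
        return -((weight * log_prob).sum(1) / weight.sum(1).clamp(min=1e-8)).mean()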
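
The sketch below illustrates the contrastive pre-training and transfer recipe discussed above, with toy dimensions assumed for exposition: an encoder is first pre-trained with a contrastive objective (e.g., the nt_xent sketch in Part I) on source data, then frozen, and a small task head is trained on the labeled target task (a linear probe).

    # Contrastive pre-training followed by transfer to a target task; dimensions are toy values.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    encoder = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 64))

    # Stage 1 (source domain): optimize a contrastive loss over two augmented views,
    # e.g., loss = nt_xent(encoder(aug1(x)), encoder(aug2(x))), using unlabeled data.

    # Stage 2 (target task): freeze the pre-trained encoder and fit a linear probe.
    for p in encoder.parameters():
        p.requires_grad = False
    head = nn.Linear(64, 10)                     # 10 target classes, purely illustrative
    optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
    # Per batch: loss = F.cross_entropy(head(encoder(x_target)), y_target),
    # followed by loss.backward() and optimizer.step().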


Future Direction (15 Minutes)

In this part, we conclude the discussion of existing works and share our thoughts regarding the future directions of heterogeneous contrastive learning as follows.

  • Complementary information in multi-view learning: Current multi-view contrastive learning methods aim only to extract the information shared across views. However, in multi-view learning, the complementarity assumption is also important, as one view may contain information complementary to another view. How to extract complementary information in the context of contrastive learning is a challenging research question.

  • Complex scenarios in contrastive multi-label learning: Existing contrastive multi-label learning methods naively pull samples with similar label vectors closer together. How to model the correlation of different labels in more complicated scenarios, such as the extreme multi-label scenario and the imbalanced multi-label scenario, remains a great challenge.

  • Dual heterogeneity: Most heterogeneous contrastive learning methods target only a single type of heterogeneity, while many real-world datasets exhibit dual heterogeneity, e.g., the co-existence of view heterogeneity and task heterogeneity. Thus, how to model dual heterogeneity via contrastive learning remains an open question.

Presenters

Lecheng Zheng

Lecheng Zheng is currently a Ph.D. candidate in the Department of Computer Science at the University of Illinois at Urbana-Champaign. His research interests include heterogeneous learning, self-supervised learning, and graph mining. He has published several papers at top-tier conferences (e.g., AAAI, KDD, WWW, CIKM, SDM) and one journal paper. He has served on the program committees of many major conferences, including Knowledge Discovery and Data Mining (KDD), the Web Conference (WWW), the International Joint Conference on Artificial Intelligence (IJCAI), the Association for the Advancement of Artificial Intelligence (AAAI), the International Conference on Information and Knowledge Management (CIKM), and the SIAM International Conference on Data Mining (SDM). For more information, please refer to his homepage.

Jingrui He

Jingrui He is currently an Associate Professor in the School of Information Sciences at the University of Illinois at Urbana-Champaign. She received her Ph.D. from Carnegie Mellon University in 2010. Her research focuses on heterogeneous machine learning, rare category analysis, active learning, and semi-supervised learning, with applications in social network analysis, healthcare, and manufacturing processes. Dr. He is the recipient of the 2016 NSF CAREER Award, a three-time recipient of the IBM Faculty Award (2018, 2015, and 2014), and was selected for the IJCAI 2017 Early Career Spotlight. Dr. He has more than 100 publications at major conferences (e.g., IJCAI, AAAI, KDD, ICML, NeurIPS) and in journals (e.g., TKDE, TKDD, DMKD), and is the author of two books. Her papers have received the Distinguished Paper Award at FAccT 2022, as well as Best of the Conference awards at ICDM 2016, ICDM 2010, and SDM 2010. Dr. He has served on the senior program committee/program committee of KDD, IJCAI, AAAI, ICML, etc. She has multiple years of course teaching experience as an instructor and has offered several tutorials at major conferences (e.g., KDD, AAAI, IJCAI, SDM, IEEE BigData) in the past few years. For more information, please refer to her homepage.