Heterophilic Graph Learning

Handbook     Slides     Code Demo     Video     Feedback Form 

Speakers                                                                                  Code Assistant

Chenqing Hua

McGill University, Mila

Qincheng Lu

McGill University

Jiaqi Zhu

DRW Holdings, LLC 

Advisory Board

Guy Wolf

Associate Professor

Université de Montréal; Mila 

Xiao-Wen Chang

Associate Professor

McGill University


Introduction

The success of graph models, especially on node-level tasks, is commonly believed to be rooted in the homophily principle. Homophily is a concept in sociology and evolutionary biology, stating the tendency that individuals with similar characteristics are easier to communicate and bond with each other. In graph representation learning, it describes the networks where nodes with the same labels or similar attributes are more likely to be connected.  Such relational inductive bias is thought to be a major contributor to the superiority of GNNs over traditional Neural Networks (NNs) on various tasks.

On the other hand, the lack of homophily, i.e. heterophily, is considered as the main cause of the inferiority of GNNs on heterophilic graphs, and such performance degradation is widespread. People have revisited and re-evaluated lots of existing graph models, including graph transformer and its variants, in the heterophily scenario across various kinds of graphs, e.g. heterogeneous graphs, temporal graphs and hypergraphs. Moreover, heterophily problem is broadly relevant to numerous graph-related applications (shown in Figure 1-6), e.g. graph fraud/anomaly detection, point cloud segmentation, urban computing, recommender systems, scene graph generation, drug discovery,  brain network analysis, link prediction, privacy, graph adversarial attacks and robustness, etc. In the past few years, considerable efforts have been devoted to studying and addressing the heterophily issue.

In this tutorial, we will systematically summarize the latest progress on heterophilic graph learning and deliver them in reader-friendly way. The contents include (1) categorization of benchmark datasets and evaluation of homophily metrics on synthetic graphs, (2) meticulous classification of the most updated supervised and unsupervised learning methods, (3) theoretical analysis on homophily/heterophily, (4) heterophily-related applications and (5) challenges and future directions for heterophilic graph learning. 

Fraud detection in Uber driver-rider network: Red nodes represent fraud users, blue nodes represent legitimate users.

Figure 1: Fraud detection with heterophily relations in Uber driver-rider network. Red nodes represent fraud users, blue nodes represent legitimate users. (Source)

Figure 2: Homophily distribution of molecules in QM9. The  homophilic bond  links atoms of the same type, while a heterophilic bond links different types of atoms. (Source)

Figure 3: Spatial heterophily in urban graph. (Source)

Figure 4: Local heterophilic structures for the segmentation of point cloud. The red circle is an enlarged display of the segmentation boundary. (Source)

Figure 5: Scene Graph Generation. (Source)

Figure 6: In brain network, both heterophily and homophily coexist, which appear in (a) and (c).(Source)

Example

As shown in Figure 6, for the indiscernible boundary nodes, homophilic structure provides extra useful information into their aggregated features over the original node features. Such relational inductive bias is thought to be a major contributor to the superiority of GNNs over traditional NNs on various tasks. On the other hand, in Figure 7, the boundary nodes have more heterophilic neighbors than homophilic ones. Since heterophilic edges connect nodes of different classes, they can lead to mixed and indistinguishable node embeddings, making the classification task more difficult for GNNs. 

Figure 6: Message passing on homophilic graph

 (indistinguishable nodes -->message passing--> distinguishable nodes)

Figure 7: Message passing on heterophilic graph

(distinguishable nodes -->message passing--> indistinguishable nodes)

Goal

The objective of this tutorial is to: (1) let people be aware of heterophily problem and understand that not all graph structures are useful or beneficial for learning; (2) help our audiences to review the latest heterophily-specific graph models quickly and comprehensively; (3) summarize real-world applications and assist our audiences on their future research about related topics.

Schedule (90min)


Introduction and Background Knowledge (Sitao Luan, 5min)

Homophily Metrics (Sitao Luan, code demo by Qincheng Lu and Jiaqi Zhu, 10min)

Benchmark (Qincheng Lu, code demo by Qincheng Lu and Jiaqi Zhu, 15min)

Models for Heterophily (Sitao Luan, 15min)

Theoretical Studies (Sitao Luan, 15min)

Homophily/Heterophily Related Applications (Sitao Luan, 15min)

Challenges and Future Directions (Chenqing Hua, 15min)

Targeted Audience

The beginner level of graph theory and graph representation learning knowledge is needed for our targeted audience.

Citations

@article{luan2024heterophily,

title={Are Heterophily-Specific GNNs and Homophily Metrics Really Effective? Evaluation Pitfalls and New Benchmarks},

author={Luan, Sitao and Lu, Qincheng and Hua, Chenqing and Wang, Xinyu and Zhu, Jiaqi and Chang, Xiao-Wen and Wolf, Guy and Tang, Jian},

journal={arXiv preprint arXiv:2409.05755},

year={2024}}

@article{luan2024heterophilic,

title={The heterophilic graph learning handbook: Benchmarks, models, theoretical analysis, applications and challenges},

author={Luan, Sitao and Hua, Chenqing and Lu, Qincheng and Ma, Liheng and Wu, Lirong and Wang, Xinyu and Xu, Minkai and    Chang, Xiao-Wen and Precup, Doina and Ying, Rex and others},

journal={arXiv preprint arXiv:2407.09618},

year={2024}}

Contact Information

Speakers

Sitao Luan: sitao.luan@mail.mcgill.ca

Chenqing Hua: chenqing.hua@mail.mcgill.ca

Qincheng Lu: qincheng.lu@mail.mcgill.ca


Code Assistant

Jiaqi Zhu: jiaqi.zhu@mail.mcgill.ca