Heterophilic Graph Learning

Handbook Slides Code Demo Video Feedback Form

Speakers Code Assistant

Sitao Luan

Mila

Chenqing Hua

McGill University, Mila

Qincheng Lu

McGill University

Jiaqi Zhu

DRW Holdings, LLC

Advisory Board

Guy Wolf

Associate Professor

Université de Montréal; Mila

Xiao-Wen Chang

Associate Professor

McGill University

Introduction

The success of graph models, especially on node-level tasks, is commonly believed to be rooted in the homophily principle. Homophily is a concept in sociology and evolutionary biology, stating the tendency that individuals with similar characteristics are easier to communicate and bond with each other. In graph representation learning, it describes the networks where nodes with the same labels or similar attributes are more likely to be connected. Such relational inductive bias is thought to be a major contributor to the superiority of GNNs over traditional Neural Networks (NNs) on various tasks.

On the other hand, the lack of homophily, i.e. heterophily, is considered as the main cause of the inferiority of GNNs on heterophilic graphs, and such performance degradation is widespread. People have revisited and re-evaluated lots of existing graph models, including graph transformer and its variants, in the heterophily scenario across various kinds of graphs, e.g. heterogeneous graphs, temporal graphs and hypergraphs. Moreover, heterophily problem is broadly relevant to numerous graph-related applications (shown in Figure 1-6), e.g. graph fraud/anomaly detection, point cloud segmentation, urban computing, recommender systems, scene graph generation, drug discovery, brain network analysis, link prediction, privacy, graph adversarial attacks and robustness, etc. In the past few years, considerable efforts have been devoted to studying and addressing the heterophily issue.

In this tutorial, we will systematically summarize the latest progress on heterophilic graph learning and deliver them in reader-friendly way. The contents include (1) categorization of benchmark datasets and evaluation of homophily metrics on synthetic graphs, (2) meticulous classification of the most updated supervised and unsupervised learning methods, (3) theoretical analysis on homophily/heterophily, (4) heterophily-related applications and (5) challenges and future directions for heterophilic graph learning.

Fraud detection in Uber driver-rider network: Red nodes represent fraud users, blue nodes represent legitimate users.

Figure 1: Fraud detection with heterophily relations in Uber driver-rider network. Red nodes represent fraud users, blue nodes represent legitimate users. (Source)

Figure 2: Homophily distribution of molecules in QM9. The homophilic bond links atoms of the same type, while a heterophilic bond links different types of atoms. (Source)

Figure 3: Spatial heterophily in urban graph. (Source)

Figure 4: Local heterophilic structures for the segmentation of point cloud. The red circle is an enlarged display of the segmentation boundary. (Source)

Figure 5: Scene Graph Generation. (Source)

Figure 6: In brain network, both heterophily and homophily coexist, which appear in (a) and (c).(Source)

Example

As shown in Figure 6, for the indiscernible boundary nodes, homophilic structure provides extra useful information into their aggregated features over the original node features. Such relational inductive bias is thought to be a major contributor to the superiority of GNNs over traditional NNs on various tasks. On the other hand, in Figure 7, the boundary nodes have more heterophilic neighbors than homophilic ones. Since heterophilic edges connect nodes of different classes, they can lead to mixed and indistinguishable node embeddings, making the classification task more difficult for GNNs.

Figure 6: Message passing on homophilic graph

(indistinguishable nodes -->message passing--> distinguishable nodes)

Figure 7: Message passing on heterophilic graph

(distinguishable nodes -->message passing--> indistinguishable nodes)

Goal

The objective of this tutorial is to: (1) let people be aware of heterophily problem and understand that not all graph structures are useful or beneficial for learning; (2) help our audiences to review the latest heterophily-specific graph models quickly and comprehensively; (3) summarize real-world applications and assist our audiences on their future research about related topics.

Schedule (90min)

Introduction and Background Knowledge (Sitao Luan, 5min)

- Definition of homophily/heterophily
- Be careful of your graph: why do we have to consider heterophily in graph learning
- Homophily metrics

Homophily Metrics (Sitao Luan, code demo by Qincheng Lu and Jiaqi Zhu, 10min)

- Graph-Label consistency
- Similarity based metrics
- Neighborhood identifiability/informativeness
- Hypothesis testing based performance metrics
- Evaluation methods with synthetic graphs

Benchmark (Qincheng Lu, code demo by Qincheng Lu and Jiaqi Zhu, 15min)

- Benign, malignant and ambiguous heterophily datasets
- Evaluation of homophily metrics on synthetic graphs

Models for Heterophily (Sitao Luan, 15min)

- Supervised learning on homogeneous graphs

Theoretical Studies (Sitao Luan, 15min)

- Mid-homophily pitfall
- Homophily and distribution shifts
- Other interesting findings

Homophily/Heterophily Related Applications (Sitao Luan, 15min)

- Fraud/anomaly detection
- Graph Learning Tasks
- Computer vision
- Urban Computing
- Brain Network Analysis

Challenges and Future Directions (Chenqing Hua, 15min)

- Heterophily on heterogeneous GNNs
- Heterophily on temporal GNNs
- Heterophily on Hypergraphs
- Fairness
- LLM and foundation models
- Molecule design

Targeted Audience

The beginner level of graph theory and graph representation learning knowledge is needed for our targeted audience.

Citations

If you use the code, please cite the original sources:

@article{luan2024heterophily,

title={Are Heterophily-Specific GNNs and Homophily Metrics Really Effective? Evaluation Pitfalls and New Benchmarks},

author={Luan, Sitao and Lu, Qincheng and Hua, Chenqing and Wang, Xinyu and Zhu, Jiaqi and Chang, Xiao-Wen and Wolf, Guy and Tang, Jian},

journal={arXiv preprint arXiv:2409.05755},

year={2024}}

If you use the contents in the tutorial, please cite the original sources:

@article{luan2024heterophilic,

title={The heterophilic graph learning handbook: Benchmarks, models, theoretical analysis, applications and challenges},

author={Luan, Sitao and Hua, Chenqing and Lu, Qincheng and Ma, Liheng and Wu, Lirong and Wang, Xinyu and Xu, Minkai and Chang, Xiao-Wen and Precup, Doina and Ying, Rex and others},

journal={arXiv preprint arXiv:2407.09618},

year={2024}}

Related Repository

Awesome Resources on Graph Neural Networks With Heterophily

A Survey on Learning from Graphs with Heterophily: Progress and Future

Contact Information

Speakers

Sitao Luan: sitao.luan@mail.mcgill.ca

Chenqing Hua: chenqing.hua@mail.mcgill.ca

Qincheng Lu: qincheng.lu@mail.mcgill.ca

Code Assistant

Jiaqi Zhu: jiaqi.zhu@mail.mcgill.ca