ICDM 2022 Workshop

Machine Learning on Higher-Order Structured data (ML-HOS)

Orlando, Florida

Description and CFP


Modern complex datasets and systems often involve interactions among multiple objects or entities, which are naturally represented by higher-order structures such as tensors, simplicial complexes, and hypergraphs. Examples include datasets arising in online advertising, recommendation systems, knowledge graphs, gene regulation, cyber-physical systems, and pandemic spread, to name a few. Such datasets can be very large, yet very sparse and imbalanced at the local level. For instance, the logs of an online advertising system can include billions of click records. However, drilling down to individual users, we find that most users view only a few ads, while a small number of very active users click far more. In addition, higher-order structured data is often accompanied by valuable side information and rich metadata, such as timestamps recording when each interaction occurred. This leads to higher-order event sequences or time series.

The goal of this workshop is to foster discussion of novel ideas, models, algorithms, and practices for developing effective and efficient machine learning and data mining approaches for higher-order structured data, possibly integrating side information. The workshop especially welcomes research that addresses key challenges in dealing with higher-order data. For example, while deep learning has proven powerful in numerous applications, deep models are known to favor training on dense data and often overfit on datasets that are sparse and imbalanced, as higher-order structured datasets often are. As another example, there are often many ways to mathematically generalize standard graph models and methods to higher-order graphs, but it is often hard to know which generalized models and methods are most effective in practice.

Topics of interest more broadly include, but are not limited to:

(1) Scalable and efficient algorithms to deal with large-scale, complex higher-order relations.

(2) Interpretable methods for learning and mining from high-order structured data.

(3) Models and methods that account for the common properties of data, such as power law distributions.

(4) Robust machine learning/deep learning methods to address the challenges of data imbalance and local sparsity.

(5) Deep tensor decomposition and higher-order graph embeddings.

(6) Analysis and prediction from high-order interaction events and time series.

Important Dates

Paper Submission Deadline: Sep 17, 2022

Author Notification: Oct 05, 2022

Camera Ready: Oct 15, 2022

Workshop: Nov 28, 2022


Any questions?

Please contact mlhosworkshop@gmail.com.


ML-HOS Workshop Schedule

Event Room: Florida 5

(All times Eastern Time)

November 28, 2022


9:00-9:05

Opening Remarks

Workshop organizers


9:05-9:40

Keynote Presentation

Causality-Aware Higher-Order Graph Neural Networks for Time Series Data

Ingo Scholtes

9:45-9:55

Accepted Paper Presentation

Influence Maximization on Hypergraphs via Similarity-based Diffusion

Mehmet Aktas


9:55-10:05

Accepted Paper Presentation

Hybrid Oversampling Technique Based on Star Topology and Rejection Methodology for Classifying Imbalanced Data

Jaekwang Kim


10:05-10:20

Coffee break

10:20-10:55

Keynote Presentation

Mining of Real-world Hypergraphs: Patterns, Tools, and Generators

Kijung Shin


10:55-11:05

Accepted Paper Presentation

Exploiting Cross-Order Patterns and Link Prediction in Higher-Order Networks

Hao Tian


11:05-11:15

Accepted Paper Presentation

Dynamic Combination of Heterogeneous Models for Hierarchical Time Series

Jing Hu


11:20-11:55

Keynote Presentation

Hypergraph Co-Optimal Transport

Bei Wang

11:55-12:00

Closing Remarks

Workshop organizers


Keynote Talk Abstracts


Speaker: Ingo Scholtes

Title: Causality-Aware Higher-Order Graph Neural Networks for Time Series Data

Abstract: Graph Neural Networks (GNNs) have become a cornerstone for the application of deep learning to data on complex networks, i.e., relational data that capture interactions between nodes. However, we increasingly have access to time-resolved data that capture not only which nodes are connected to each other, but also when and in which temporal order those connections occur. A number of works have shown how the timing and ordering of links shape the causal topology of networked systems, i.e., which nodes can possibly influence each other over time. Moreover, higher-order network models have been developed that allow us to model patterns in the resulting causal topology. While those works have shed light on the question of how the time dimension of dynamic graphs influences node centralities, community structures, or the evolution of dynamical processes, we lack methods to incorporate those insights into state-of-the-art deep graph learning techniques.

Addressing this gap, we introduce De Bruijn Graph Neural Networks (DBGNNs), a novel time-aware graph neural network architecture for time-resolved data on dynamic graphs. Our approach accounts for temporal-topological patterns that unfold via causal walks, i.e., temporally ordered sequences of links by which nodes can influence each other over time. We develop a graph neural network architecture that utilizes De Bruijn graphs of multiple orders to implement a message passing scheme that follows non-Markovian dynamics, which enables us to learn patterns in the causal topology of dynamic graphs.
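As a rough illustration of the De Bruijn graphs mentioned in the abstract, the sketch below builds order-k edge counts from a handful of walks. It is not the speaker's implementation: the function name `de_bruijn_edges` and the toy walks are assumptions made for this example, and the walks are taken as given rather than extracted from timestamped links.

```python
from collections import Counter


def de_bruijn_edges(walks, order=2):
    """Count the edges of an order-k De Bruijn graph built from walks.

    Each higher-order node is a tuple of `order` consecutive nodes of a walk.
    An edge links two such tuples that overlap in `order - 1` nodes.
    """
    if order < 1:
        raise ValueError("order must be at least 1")
    edges = Counter()
    for walk in walks:
        # Slide a window of length `order` over the walk.
        windows = [tuple(walk[i:i + order]) for i in range(len(walk) - order + 1)]
        # Consecutive windows share order - 1 nodes, so they form an edge.
        for src, dst in zip(windows, windows[1:]):
            edges[(src, dst)] += 1
    return edges


# Hypothetical causal walks: temporally ordered node sequences.
walks = [["a", "c", "d"], ["b", "c", "e"], ["a", "c", "d"]]

first_order = de_bruijn_edges(walks, order=1)
second_order = de_bruijn_edges(walks, order=2)
```

In this toy data the first-order graph contains both a-c and c-e, which suggests that a could reach e through c. The second-order graph has no edge from (a, c) to (c, e), reflecting that no walk actually follows that order.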


Short Bio: Dr. Ingo Scholtes is a Professor for Machine Learning for Complex Networks at the Center for Artificial Intelligence and Data Science of Julius-Maximilians-Universität Würzburg, and is also an SNSF Professor in the Department of Informatics at the University of Zurich, where he heads the Data Analytics Group (DAG). In 2014 he was awarded a Junior Fellowship from the German Informatics Society. In 2018 he was awarded an SNSF Professorship from the Swiss National Science Foundation. Since 2021 he has held the Chair of Computer Science XV - Machine Learning for Complex Networks at Julius-Maximilians-Universität Würzburg. His research addresses open questions at the intersection of machine learning, network science, graph mining, and computational social science.


Speaker: Kijung Shin

Title: Mining of Real-world Hypergraphs: Patterns, Tools, and Generators

Abstract: Group interactions are prevalent in various complex systems (e.g., collaborations of researchers and group discussions on online Q&A sites), and they are commonly modeled as hypergraphs. Hyperedges, which compose a hypergraph, are non-empty subsets of any number of nodes, and thus each hyperedge naturally represents a group interaction among entities. The higher-order nature of hypergraphs brings about unique structural properties that have not been considered in ordinary pairwise graphs.


In this talk, I'll offer a comprehensive overview of a new research topic called hypergraph mining. First, I'll present recently revealed structural properties of real-world hypergraphs, including (a) static and dynamic patterns, (b) global and local patterns, and (c) connectivity and overlapping patterns. Together with the patterns, I'll introduce advanced data mining tools used for their discovery. Lastly, I'll describe simple yet realistic hypergraph generative models that provide an explanation of the structural properties.
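To make the objects in this abstract concrete, here is a minimal sketch of a hypergraph stored as a list of node sets, with three simple statistics of the kind the structural patterns build on. The data and the helper names (`node_degrees`, `size_distribution`, `pairwise_overlaps`) are assumptions for illustration only, not the speaker's tools or datasets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical group interactions: each hyperedge is a non-empty set of nodes.
hyperedges = [
    frozenset({"ann", "bob", "cho"}),
    frozenset({"bob", "dee"}),
    frozenset({"ann", "bob", "cho", "dee"}),
]


def node_degrees(edges):
    """Number of hyperedges that contain each node."""
    return Counter(node for edge in edges for node in edge)


def size_distribution(edges):
    """How many hyperedges there are of each size."""
    return Counter(len(edge) for edge in edges)


def pairwise_overlaps(edges):
    """Size of the intersection for every pair of overlapping hyperedges."""
    overlaps = {}
    for (i, a), (j, b) in combinations(enumerate(edges), 2):
        shared = len(a & b)
        if shared:
            overlaps[(i, j)] = shared
    return overlaps


degrees = node_degrees(hyperedges)
sizes = size_distribution(hyperedges)
overlaps = pairwise_overlaps(hyperedges)
```

Unlike edges in a pairwise graph, two hyperedges can share more than one node, which is why the overlap sizes carry information that an ordinary graph cannot express.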


Short Bio: Kijung Shin is an Ewon Endowed Assistant Professor (jointly affiliated) in the Kim Jaechul Graduate School of AI and the School of Electrical Engineering at KAIST. He received his Ph.D. in Computer Science from Carnegie Mellon University in 2019. He has published more than 50 refereed articles at major data mining venues, and he won the best research paper award at KDD 2016. His research interests span a wide range of topics in graph mining, with a focus on scalable algorithm design and empirical analysis of real-world hypergraphs.


Speaker: Bei Wang

Title: Hypergraph Co-Optimal Transport

Abstract: Hypergraphs capture multi-way relationships in data, and they have consequently seen a number of applications in higher-order network analysis, computer vision, geometry processing, and machine learning. We develop theoretical foundations for studying the space of hypergraphs using ingredients from optimal transport. By enriching a hypergraph with probability measures on its nodes and hyperedges, as well as relational information capturing local and global structures, we obtain a general and robust framework for studying the collection of all hypergraphs. First, we introduce a hypergraph distance based on the co-optimal transport framework of Redko et al. and study its theoretical properties. Second, we formalize common methods for transforming a hypergraph into a graph as maps between the space of hypergraphs and the space of graphs, and study their functorial properties and Lipschitz bounds. Finally, we demonstrate the versatility of our framework through various examples. This is joint work with Youjia Zhou, Samir Chowdhury, Tom Needham, and Ethan Semrad.


Short Bio: Dr. Bei Wang Phillips is an Associate Professor in the School of Computing and a faculty member in the Scientific Computing and Imaging (SCI) Institute, University of Utah. She obtained her Ph.D. in Computer Science from Duke University. Her research focuses on topological data analysis, data visualization, and computational topology. She works on combining topological, geometric, statistical, data mining, and machine learning techniques with visualization to study large and complex data for information exploration and scientific discovery. Some of her current research activities involve the analysis and visualization of high-dimensional point clouds, scalar fields, vector fields, tensor fields, networks, and multivariate ensembles. Dr. Phillips received a DOE Early Career Research Program (ECRP) award in 2020 and an NSF CAREER award in 2022.

The Speakers

Bei Wang

Associate Professor, School of Computing
Adjunct Associate Professor, Department of Mathematics
Faculty member, Scientific Computing and Imaging (SCI) Institute

University of Utah


Ingo Scholtes

Professor of Machine Learning for Complex Networks at the University of Würzburg

SNSF Professor at the University of Zurich


Kijung Shin


Assistant Professor


Kim Jaechul Graduate School of AI

KAIST



Workshop Organizers

Shandian Zhe

Assistant Professor

University of Utah

Nate Veldt

Assistant Professor

Texas A&M University

Bin Shen

Senior Staff Engineer & Manager

Celonis AI Group

Kuang-Chih Lee

Director, Data Science

Alibaba, Inc