Abstract
GNNs can be efficiently implemented within a message passing framework. Nodes receive messages from their 1-hop neighbors, where the messages are defined as parametrized functions of the node and edge features. This design forms the core of GNN models, which can be trained with supervision for node, link, or graph prediction tasks. GNNs face challenges related to scalability and effectiveness. Real-world graphs scale to trillions of edges, so modern graph machine learning systems need to provide scalable graph construction methods, fast training and inference of graph models, support for real-time inference requirements, and scalable storage of graphs. In this talk, we will introduce Deep Graph Library (DGL), one of the most complete and heavily used open source libraries for developing GNN models. We will dive deep into the design philosophy of DGL, how it advances GNN research, and the new features of the upcoming v1.0 stable release. Lastly, we will share how we use and extend DGL to deliver the impact of GNN technology across Amazon and AWS.
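The message passing idea described in this abstract can be illustrated with a minimal sketch in plain Python. It is not DGL's API: the function names, the scalar features, the linear message function with weights `w_node` and `w_edge`, the sum aggregation, and the ReLU update are all assumptions chosen to keep the example small.

```python
def message(h_src, e_feat, w_node, w_edge):
    # Parametrized message: here, a linear function of the source node
    # feature and the edge feature (an illustrative choice).
    return w_node * h_src + w_edge * e_feat


def message_passing_step(node_feats, edges, w_node=0.5, w_edge=1.0):
    # node_feats: dict mapping node id -> scalar feature
    # edges: list of (src, dst, edge_feature) triples; messages flow src -> dst
    aggregated = {v: 0.0 for v in node_feats}
    for src, dst, e_feat in edges:
        # Each node sums the messages arriving from its 1-hop in-neighbors.
        aggregated[dst] += message(node_feats[src], e_feat, w_node, w_edge)
    # Update: combine each node's own feature with its aggregated messages,
    # then apply a ReLU nonlinearity.
    return {v: max(0.0, node_feats[v] + aggregated[v]) for v in node_feats}


# Hypothetical toy graph with three nodes and three directed edges.
node_feats = {0: 1.0, 1: 2.0, 2: -3.0}
edges = [(0, 1, 0.5), (2, 1, 1.0), (1, 2, 0.0)]
updated = message_passing_step(node_feats, edges)
print(updated)
```

In a real model the weights would be learned, the features would be vectors, and several such steps would be stacked so that information propagates beyond 1-hop neighbors.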
Abstract
GNNs have attracted considerable academic interest in recent years and have shown promise for many real-world applications, from fraud and abuse detection to recommendations. Yet industry-wide adoption of GNN techniques for these problems has been lagging. As such, there is a strong need for tools and frameworks that help researchers develop GNNs for large-scale graph machine learning problems and help machine learning practitioners deploy these models for production use cases. The relatively slow adoption of GNNs in industry results from the unique set of challenges that must be solved to scale GNNs for industrial applications. We detail these challenges, including (i) scaling GNNs to giant graphs, including distributed training on billion-node graphs; (ii) scaling GNNs with rich and heterogeneous node-level features, including joint training of GNNs and large language models (LLMs); and (iii) scaling GNNs within a business-driven machine learning (ML) workflow for real-time inference and batch predictions with graph databases. We discuss how we tackle these challenges at Amazon using frameworks like DGL, Dist-DGL and Neptune ML that take away the heavy lifting necessary for productionizing GNNs. We dive into the challenges of developing and deploying GNN models for production at a company like Amazon and how AWS delivers Graph ML solutions to its customers.
Time: 4:00 pm CDT
Presenter: Zak Jost
Title: Diving deep in GNNs for Fraud Prediction
Abstract
This talk will dive deep into the problem of fraud detection and tell the story of bringing a GNN model to production in a real fraud detection system. In doing so, we will discuss why machine learning and graphs are useful for fraud systems and, more broadly, the many challenges faced when crossing the chasm between research and a functional production system.