🛠️ Projects (from first-author papers)
Early-Exit Graph Neural Networks
Link Prediction with Physics-Inspired Graph Neural Networks
GATSY: Music Artist Similarity with Graph Attention Networks
Graph Neural Re-Ranking via Corpus Graph
The Challenge: The "Depth Dilemma" Deep learning on graphs faces a difficult trade-off: shallow networks fail to capture long-range dependencies ("under-reaching"), while deep networks suffer from over-smoothing and over-squashing. Furthermore, fixed-depth architectures are inefficient; "easy" nodes simply do not require the same computational expense as "hard" nodes. Depth is a delicate hypeparameter.
The Solution: Dynamic Depth & Stable Dynamics I proposed EEGNN, the first end-to-end differentiable framework that allows nodes to halt inference the moment they are confident.
Smart Exits: I used a Gumbel-Softmax mechanism to let the network learn, per node, when to stop processing. Unlike existing early-exit models that rely on fixed heuristics, this allows the exit strategy to be learned end-to-end, directly from the task loss.
Stable Backbone (SAS-GNN): To ensure deep layers remain useful, I introduced the Symmetric-Anti-Symmetric (SAS) backbone. This ODE-inspired architecture uses symmetric weights to induce attraction/repulsion (preventing over-smoothing) and anti-symmetric weights to preserve energy (preventing over-squashing); a minimal sketch of both components follows below.
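Below is a minimal PyTorch sketch of how these two pieces fit together. The names (`SASLayer`, `EarlyExitGNN`, `exit_gate`) and the dense-adjacency formulation are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SASLayer(nn.Module):
    """One ODE-inspired update with symmetric and anti-symmetric weights (illustrative)."""

    def __init__(self, dim: int, step: float = 0.1):
        super().__init__()
        self.W_sym = nn.Parameter(torch.empty(dim, dim))   # drives attraction/repulsion
        self.W_anti = nn.Parameter(torch.empty(dim, dim))  # energy-preserving term
        nn.init.xavier_uniform_(self.W_sym)
        nn.init.xavier_uniform_(self.W_anti)
        self.step = step

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        sym = 0.5 * (self.W_sym + self.W_sym.T)     # symmetric by construction
        anti = 0.5 * (self.W_anti - self.W_anti.T)  # anti-symmetric by construction
        # Explicit Euler step: neighbours are mixed through the symmetric map,
        # while the anti-symmetric map acts on the node's own state.
        return x + self.step * torch.tanh(adj @ x @ sym + x @ anti)


class EarlyExitGNN(nn.Module):
    """Stack of SAS layers in which each node may halt via a Gumbel-Softmax gate."""

    def __init__(self, dim: int, num_layers: int, num_classes: int):
        super().__init__()
        self.layers = nn.ModuleList([SASLayer(dim) for _ in range(num_layers)])
        self.exit_gate = nn.Linear(dim, 2)            # logits for [continue, exit]
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, x: torch.Tensor, adj: torch.Tensor, tau: float = 1.0):
        active = torch.ones(x.size(0), 1, device=x.device)  # 1 = node still computing
        out = x
        for layer in self.layers:
            h = layer(out, adj)
            out = active * h + (1.0 - active) * out          # exited nodes stay frozen
            # Straight-through Gumbel-Softmax: hard 0/1 exit decision, soft gradients.
            exit_now = F.gumbel_softmax(self.exit_gate(out), tau=tau, hard=True)[:, 1:2]
            active = active * (1.0 - exit_now)
        return self.classifier(out)
```

Because the Gumbel-Softmax gate stays differentiable, the decision of when each node exits is trained jointly with the task loss rather than set by a hand-tuned threshold.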
The Impact
Efficiency: Drastically reduced inference latency and energy costs by adaptively processing easier inputs with fewer layers.
Performance: Matched or outperformed complex Attention-based and Asynchronous models (showing up to 4% improvement) on heterophilic and long-range benchmarks.
Scalability: Achieved these results with constant memory usage and up to 100× fewer parameters than state-of-the-art competitors like Co-GNN or Polynormer.
The Challenge: The Heterophily Gap Standard Graph Neural Networks rely on message passing, which tends to make connected nodes look similar (over-smoothing). This is a major issue for heterophilic graphs (e.g., dating apps or transaction networks, where connected nodes differ). While physics-inspired models exist for node classification, applying them to Link Prediction, where the underlying reason for a connection might be latent, remains an open challenge.
The Solution: Gradient Flows & A Novel Readout I developed GRAFF-LP, a framework that treats GNN layers as a discretization of a gradient flow, minimizing a learnable energy function.
Physics-Inspired Backbone: The model uses symmetric weights, ensuring interpretable behavior. I proved that GRAFF-LP dynamically induces attraction (for similar nodes) and repulsion (for dissimilar nodes).
Gradient-Based Readout: I introduced a novel readout function based on edge gradients rather than raw node features. Unlike the standard Hadamard product, this function directly quantifies the "potential" of a query link, allowing the model to explicitly learn to separate existing edges from non-existing ones based on their gradient dynamics (see the sketch below).
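A minimal PyTorch sketch of the idea, with illustrative names (`GRAFFStyleEncoder`, `gradient_readout`, `LinkScorer`) and a simplified dense-adjacency update rather than the paper's exact discretisation; it contrasts the standard Hadamard readout with a readout built from edge gradients.

```python
import torch
import torch.nn as nn


class GRAFFStyleEncoder(nn.Module):
    """Gradient-flow style GNN with one symmetric weight matrix shared across depth."""

    def __init__(self, dim: int, num_steps: int, step: float = 0.1):
        super().__init__()
        self.W = nn.Parameter(torch.empty(dim, dim))  # shared by every step: constant #params
        nn.init.xavier_uniform_(self.W)
        self.num_steps = num_steps
        self.step = step

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        W_sym = 0.5 * (self.W + self.W.T)  # symmetric => the update discretises a gradient flow
        for _ in range(self.num_steps):
            # Euler step: the learned spectrum of W_sym decides whether connected
            # nodes are attracted to or repelled from each other.
            x = x + self.step * (adj @ x @ W_sym - x @ W_sym)
        return x


def hadamard_readout(x: torch.Tensor, src: torch.Tensor, dst: torch.Tensor) -> torch.Tensor:
    """Standard readout: element-wise product of the two endpoint embeddings."""
    return x[src] * x[dst]


def gradient_readout(x: torch.Tensor, src: torch.Tensor, dst: torch.Tensor) -> torch.Tensor:
    """Edge-gradient readout: the feature difference along the edge, i.e. its 'potential'."""
    return x[dst] - x[src]


class LinkScorer(nn.Module):
    """Scores a query edge from its edge-gradient representation."""

    def __init__(self, dim: int):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        src, dst = edge_index  # edge_index has shape (2, num_query_edges)
        return self.mlp(gradient_readout(x, src, dst)).squeeze(-1)
```

Because the same symmetric matrix is reused at every step, adding depth does not add parameters, which is what keeps the model the lightest in its class.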
The Impact
State-of-the-Art: Outperformed baselines in 3 out of 4 heterophilic datasets (including Roman Empire and Amazon Ratings), consistently ranking in the top 2 and surpassing complex subgraph-based methods like ELPH and NCNC (improvements up to 12.4%).
New Metric: I proposed a new metric based on edge gradients to measure how well a model separates true links from false ones. This metric revealed that GRAFF-LP, thanks to its physics priors, learns to distinguish edges by separating their gradients, and that it optimizes this metric more effectively than any other model (improvements of up to 65.52%).
Universal Improvement: My proposed readout function improved the performance of other baseline models as well, proving it captures fundamental signal properties that standard readouts miss.
Efficiency: Achieved these results while remaining the lightest model in its class, maintaining constant parameter complexity regardless of depth via weight sharing.
The Challenge: Subjectivity & Data Scarcity Defining "similarity" between music artists is inherently subjective and difficult to quantify. Furthermore, existing GNN recommenders rely heavily on expensive, hand-crafted features (like audio statistics). When these features are unavailable, standard models struggle to find meaningful patterns, and their performance degrades sharply.
The Solution: Attention-Driven Structural Learning I introduced GATSY, a GNN framework designed to learn artist similarities by integrating cultural and musicological context. Unlike traditional models, GATSY shifts reliance away from raw audio features, focusing instead on the structural relationships between artists.
Attention Mechanism: By leveraging multi-head attention, GATSY learns to weigh neighbor importance dynamically. This allows it to navigate heterophilic connections (where similar artists have different labels/music genres) more effectively than standard message passing (a minimal sketch follows below).
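As an illustration, here is a compact PyTorch Geometric sketch of a multi-head attention encoder over the artist graph; the layer sizes, head counts, and the `top_k_similar` helper are assumptions for this example, not the exact configuration used in GATSY.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GATConv


class ArtistEncoder(nn.Module):
    """Two GAT layers: each attention head learns its own way of weighting neighbours."""

    def __init__(self, in_dim: int, hidden: int = 64, out_dim: int = 32, heads: int = 4):
        super().__init__()
        self.conv1 = GATConv(in_dim, hidden, heads=heads, concat=True)
        self.conv2 = GATConv(hidden * heads, out_dim, heads=1, concat=False)

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        x = F.elu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)


def top_k_similar(embeddings: torch.Tensor, artist_idx: int, k: int = 10) -> torch.Tensor:
    """Recommend the k artists whose embeddings are closest (cosine) to the query artist."""
    emb = F.normalize(embeddings, dim=-1)
    scores = emb @ emb[artist_idx]
    scores[artist_idx] = float("-inf")  # exclude the query artist itself
    return scores.topk(k).indices
```

The attention weights let the model down-weight uninformative neighbours, which is what reduces its reliance on hand-crafted audio features; the `top_k_similar` helper shows how the learned embeddings can directly back the recommender use case described below.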
The Impact
State-of-the-Art: GATSY outperformed GraphSAGE and other baselines (GCN, GIN) on both reduced and full datasets, achieving an nDCG of 0.5664 (vs. 0.5023 for the best baseline).
Robustness: While previous models lost up to ~75% of their performance when hand-crafted features were removed, GATSY dropped by only 11%, proving its ability to learn effectively from graph topology alone.
Data Enrichment: We augmented the existing OLGA dataset with new label information to better measure graph heterophily, contributing a richer benchmark to the research community.
Efficiency: Achieved these results while using 46% fewer parameters than competing architectures like GraphSAGE.
Recommender System: We showed how GATSY can be used as a recommender system and how new artists can be added to the graph without degrading recommendation quality.
The Challenge: The Isolation Problem in Ranking Modern neural re-rankers are powerful, but they typically suffer from a major limitation: they evaluate query-document pairs in isolation (univariate scoring). This means they neglect the broader context provided by the distribution of other documents in the list. By ignoring how documents relate to one another, these models miss crucial signals that could refine the true relevance of a candidate.
The Solution: Context-Aware Ranking via Graphs I proposed Graph Neural Re-Ranking (GNRR), a pipeline that transforms the re-ranking task into a graph learning problem.
Corpus Subgraphs: Instead of processing a flat list, GNRR constructs a query-induced corpus subgraph, modeling the relationships between candidate documents based on both lexical and semantic similarities.
Hybrid Scoring: The architecture combines traditional "individual" relevance signals (from TCT-ColBERT) with "local" structural signals extracted by a GNN. This allows the model to adjust a document's score based on the quality and relevance of its neighbors (see the sketch below).
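A minimal PyTorch Geometric sketch of the pipeline under simplifying assumptions: candidate documents are connected to their most similar candidates (plain cosine k-NN here, standing in for the lexical/semantic similarities), and a GNN signal is concatenated with the individual TCT-ColBERT-style score before the final scoring head. `build_corpus_subgraph` and `HybridReRanker` are illustrative names.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv


def build_corpus_subgraph(doc_emb: torch.Tensor, k: int = 5) -> torch.Tensor:
    """Connect each candidate document to its k most similar candidates."""
    emb = F.normalize(doc_emb, dim=-1)
    sim = emb @ emb.T
    sim.fill_diagonal_(float("-inf"))            # no self-loops
    nbrs = sim.topk(k, dim=-1).indices           # (num_docs, k)
    src = torch.arange(doc_emb.size(0)).repeat_interleave(k)
    return torch.stack([src, nbrs.reshape(-1)])  # edge_index of shape (2, num_docs * k)


class HybridReRanker(nn.Module):
    """Mixes each document's individual relevance score with a local GNN signal."""

    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.gnn = SAGEConv(dim, hidden)
        self.score_head = nn.Linear(hidden + 1, 1)  # +1 for the individual score

    def forward(self, doc_emb, individual_scores, edge_index):
        ctx = F.relu(self.gnn(doc_emb, edge_index))               # cross-document context
        feats = torch.cat([ctx, individual_scores.unsqueeze(-1)], dim=-1)
        return self.score_head(feats).squeeze(-1)                 # refined score per document
```

Swapping `SAGEConv` for `GCNConv` or `GATConv` changes a single line, which reflects the GNN-agnostic design noted below.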
The Impact
Performance Boost: Integration of the GNN module led to a 5.8% relative improvement in Average Precision on the TREC-DL19 benchmark compared to the strong TCT-ColBERT baseline.
Broad Effectiveness: Validated across multiple datasets (TREC-DL19, DL20, and DLHard), proving that capturing cross-document interactions consistently enhances ranking quality.
Versatility: The pipeline is agnostic to the specific GNN architecture, showing improvements with various backbones like GCN, GraphSAGE, and GAT.