Selected Projects

(VLSI Design & Computer-aided Design)

SHyPar: Multilevel Hypergraph Partitioning via Spectral Coarsening

SHyPar significantly improves the solution quality of state-of-the-art multilevel hypergraph partitioner on ISPD98 and Titan23 datasets (VLSI design benchmarks) with unit weights (left figure: ϵ = 2%, k = 2; right figure: ϵ = 1% k = 4)

Our recent work introduces SHyPar (TCAD'25) for partitioning large-scale hypergraphs by leveraging hyperedge effective resistances and flow-based community detection techniques. Inspired by the latest theoretical spectral hypergraph clustering frameworks, SHyPar aims to decompose large hypergraphs into multiple subgraphs with few inter-partition hyperedges (cut size). A key component of SHyPar is a flow-based local clustering scheme for hypergraph coarsening, which incorporates a max-flow-based algorithm to produce clusters with substantially improved conductance. Additionally, SHyPar utilizes an effective resistance-based rating function for merging strongly connected (coupled) nodes. Compared with existing state-of-the-art hypergraph partitioning methods, our extensive experimental results on real-world VLSI designs demonstrate that SHyPar can more effectively partition hypergraphs, achieving state-of-the-art solution quality.

GRASS: Practically-Efficient Algorithms for Spectral Graph Sparsification

Our recent work (DAC'16, DAC'18, TCAD'20) introduces GRASS for extracting ultra-sparse subgraphs that can well mimic the original graph by preserving the key Laplacian eigenvalues and eigenvectors. Our latest work SF-GRASS (ICCAD'20) introduces a more scalable solver-free spectral sparsification framework. The extracted subgraphs (running GRASS with a relative condition number K) will have the following desired properties:

Any effective-resistance distance on the subgraph will be no greater than K times the original resistance distance. So spectrally-similar subgraphs can well preserve the original resistance distances.
Any cut in the original graph will be no greater than K times the cut in the subgraph. So spectrally-similar subgraphs can well preserve the original cuts.
The sparsified graph Laplacian matrix approximately preserves the original set of eigenvalues and eigenvectors, which can be leveraged as a preconditioner (with a relative condition number K) for solving the original Laplacian matrix using iterative methods (ICCAD'17, TCAD'20, TCAD'21).

inGRASS: Incremental Spectral Sparsification of Graphs and Circuit Networks

Our recent work (DAC'24) introduces inGRASS, a novel algorithm for incremental spectral sparsification of large undirected graphs that undergo frequent updates. The proposed algorithm has a nearly linear time complexity for the setup phase and can update the spectral sparsifier in O(log N) time for each incremental edge insertion to the original graph with N nodes. A key ingredient of the setup phase of inGRASS is a multilevel resistance embedding framework for efficiently identifying critical edges and detecting redundant ones leveraging a low-resistance-diameter (LRD) decomposition scheme. The update phase of inGRASS exploits low-dimensional node embedding vectors for efficiently estimating the importance and uniqueness of each newly added edge. As demonstrated through extensive experiments, inGRASS achieves up to over 200X speedups while retaining comparable solution quality in incremental spectral sparsification of graphs obtained from various datasets, such as circuit simulations, finite element analysis, and social networks.

diGRASS: Directed Graph Spectral Sparsification via Laplacian Symmetrization

The diGRASS-estimated spectral sensitivities of off-subgraph edges (e1 to e19) indicate that for the directed graph (right) only including e19 into the initial subgraph (with green edges) will more significantly decrease the average (max) distances between nodes than e1, while for the undirected one (left) both edges (e1 and e19) are equally effective.

In our latest work (TKDD'24), we introduce an efficient spectral algorithm (diGRASS) for sparsifying real-world, large-scale directed graphs leveraging a spectrum-preserving Laplacian symmetrization approach. The proposed sparsification method has been evaluated using a variety of directed graphs obtained from real-world applications, showing promising results for solving directed graph Laplacians, spectral partitioning of directed graphs, and approximately computing (personalized) PageRank vectors.

HyperEF: Spectral Hypergraph Coarsening by Effective-Resistance Clustering

Comparison with hMetis for hypergraph decomposition

Our latest work (ICCAD'22) introduces a scalable algorithmic framework (HyperEF) for spectral coarsening (decomposition) of largescale hypergraphs by exploiting hyperedge effective resistances. Motivated by the latest theoretical framework for low-resistance diameter decomposition of simple graphs, HyperEF aims at decomposing large hypergraphs into multiple node clusters with only a few inter-cluster hyperedges. The key component in HyperEF is a nearly-linear time algorithm for estimating hyperedge effective resistances, which allows incorporating the latest diffusion-based non-linear quadratic operators defined on hypergraphs. HyperEF can more effectively coarsen (decompose) hypergraphs, achieving over 70× runtime speedups over hMetis and 20× speedups over HyperSF.

HyperSF: Spectral Coarsening (Reduction) of (Un)directed Graphs & Hypergraphs

Our latest work (ICCAD'21) introduces HyperSF, an efficient spectral method for coarsening hypergraphs. HyperSF leverages a recent strongly-local flow-based clustering algorithm for detecting the sets of hypergraph vertices that minimize ratio cut. To further improve the algorithm efficiency, we leverage spectral clustering of the bipartite graphs corresponding to the original hypergraphs. Our experimental results for a variety of hypergraphs extracted from real-world VLSI design benchmarks show that the proposed hypergraph coarsening algorithm can significantly improve the multi-way conductance of hypergraph clustering as well as runtime efficiency when compared with existing state-of-the-art algorithms. Our related works on spectral coarsening of regular undirected graphs can be found at (DAC'19, WSDM'21).

SGL: Learning Circuit Networks for Data-driven, Physics-informed EDA

The SGL-learned circuit networks can well preserve the spectral properties and effective resistances

Our latest work (DAC'21) introduces SGL, a highly-scalable spectral algorithm for learning large resistor networks from linear measurements, such as node voltages and currents. We show that given O(logN) pairs of voltage and current measurements, it is possible to recover N-node resistor networks for well preserving the original effective-resistance distances. The learned graphs are ultra sparse and will preserve the structural (spectral) properties of the original graph, which can be leveraged in downstream circuit modeling, simulation, and optimization tasks.

Multilevel Vectorless Power/Thermal Integrity Verification Frameworks

A data-driven framework for vectorless power/thermal integrity verification

Our recent work (DAC'17, DATE'20, TODAES'22) introduces a general vectorless integrity verification framework for computing the worst-case voltage drops or temperature (gradient) distributions across the entire chip under a set of local and global workload (power density) constraints. To address the computational challenges introduced by the large power grids or 3D mesh-structured thermal grids, we propose a novel spectral approach for highly-scalable vectorless verification of large chip designs leveraging a hierarchy of almost linear-sized spectral sparsifiers of input grids that can well retain effective resistances between nodes. As a result, the vectorless integrity verification solution obtained on coarse level problems can effectively help compute the solution of the original problem by leveraging our prior works (TVLSI'13, DAC'13a, DAC'10b, DAC'10c, TCAD'11b).

Graph Sparsification for Integrated Circuit Analysis in Time/Frequency Domains

Graph sparsification for scalable harmonic balance (HB) analysis of strongly-nonlinear RF & microwave circuits

In the past decades, harmonic balance (HB) has been widely used for computing steady-state solutions of nonlinear radio-frequency (RF) and microwave circuits. However, using HB for simulating strongly nonlinear post-layout RF circuits still remains a very challenging task. In our recent work (TCAD'15), we present a novel graph sparsification approach for automatically generating preconditioners that can be efficiently applied for simulating strongly nonlinear post-layout RF circuits. Our approach starts with sparsifying time-domain circuit networks that can be subsequently leveraged for sparsifying the entire HB Jacobian matrix. We show that the resultant sparsified Jacobian matrix can be used as a robust yet efficient preconditioner in HB analysis. Our experimental results show that when compared with the prior state-of-the-art direct solution method, the proposed solver can more efficiently handle moderate to strong nonlinearities during the HB analysis of RF circuits, achieving up to 20× speedups and 6× memory reductions. Graph sparsification has also been investigated for general SPICE simulations that involve various nonlinear devices (TCAD'15a, TCAD'15b, ICCAD'13, ICCAD'12, DAC'12).

Spectral hypergraph coarsening leveraging flow-based local clustering

EDA Algorithms for CPU-GPU Heterogeneous Parallel Computing Platforms

A GPU-based hybrid multigrid solver for large-scale on-chip power grid analysis

TinySPICE for massively parallel transistor-level circuit simulations on GPUs

To enable the paradigm shift of EDA to energy-efficient heterogeneous parallel manycore computing systems, my research group has been investigating heterogeneous parallel algorithms for scalable modeling, simulation, and verification of nanoscale ICs. By inventing/redesigning heterogeneous computing algorithms and data structures, as well as exploiting hardware and algorithm-specific performance modeling and optimization approaches, our recent work also complements ongoing research for multicore/GPU computer architecture, compiler design/optimization, and parallel library development, through heavy use of domain knowledge in EDA and IC designs. More specifically, a coherent set of compute-intensive design automation problems key to designing nanoscale microprocessors, 3D-ICs, analog, and mixed-signal circuits as well as radio-frequency (RF) and microwave ICs has been investigated:

Hybrid multigrid solver on GPUs for power grid analysis (TCAD'11a, TVLSI'11, DAC'10a, ICCAD'08)
Fast multipole method (FMM) on GPU for capacitance extraction (DAC'11)
SPICE and harmonic balance (HB) simulators on GPU (ICCAD'16, DAC'15, DAC'13b)
Fast thermal simulator on GPU for 3D-ICs (TVLSI'13a, ICCAD'11b)

Variation-aware Integrated Circuit Modeling and Analysis

Performance-oriented parameter dimensional reduction for statistical integrated circuit modeling & analysis

Our prior work introduced a performance-oriented parameter dimension reduction framework to reduce the modeling complexity associated with high parameter dimensionality. Our framework has a theoretically sound statistical basis, namely, reduced rank regression (RRR) and its various extensions that have been introduced for more practical VLSI circuit modeling tasks. For a variety of VLSI circuits including interconnects and CMOS digital circuits, it is shown that the proposed parameter reduction framework can provide more than one order of magnitude reduction in parameter dimensionality. Our work immediately led to dramatically reduced simulation cost in sampling-based performance analysis, and more importantly, highly efficient parameterized sub-circuit models that are instrumental in tackling the complexity of variation-tolerance VLSI system design. The related works can be found at (TCAD'09, TVLSI'09, TCAD'08, DAC'07a, DAC'07b, ICCAD'07b, ICCAD'07a, TVLSI'07, ICCAD'06b)

Page updated

Google Sites

Report abuse