Haifeng Qian (钱海峰)

Dr. Qian is a Manager and Senior Applied Scientist at AWS AI Labs. Prior to this, he was a Research Staff Member in the Foundations of AI Department and the Design Automation Department at the IBM T. J. Watson Research Center. He received his Ph.D. degree from the University of Minnesota and his Bachelor's degree from Tsinghua University.

He has expertise and achievements across domains: large language models (LLMs) for code generation; efficient inference of transformers; robustness, generalization, and security of neural networks; design automation of high-performance microprocessors; and numerical analysis with large sparse matrices. He has delivered large-scale, mission-critical software that has optimized generations of IBM microprocessors, and he leads a team of applied scientists that helps build the LLM-based AWS service CodeWhisperer.

Accomplishments

Selected Projects

AWS CodeWhisperer

Public Preview:  June 2022

Generally Available: April 2023

https://aws.amazon.com/codewhisperer

ReCode: Robustness Evaluation of Code Generation Models, ACL 2023

paper | code 

LLMs tend to be brittle: slight edits to a prompt can lead to very different generations. These robustness properties, critical for user experience when models are deployed in real-life applications, are not well understood. Most existing work has focused on classification, while robustness in generation tasks is an uncharted area. We propose ReCode, a comprehensive robustness evaluation benchmark for code generation models. We customize over 30 transformations specifically for code, covering docstrings, function and variable names, code syntax, and code format. They are carefully designed to be natural in real-life coding practice and to preserve the original semantic meaning, thus providing multifaceted assessments of a model's robustness. We define robustness metrics for code generation models based on worst-case behavior, taking advantage of the fact that executing the generated code serves as an objective evaluation.
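As a flavor of the kind of semantics-preserving perturbation involved, here is a minimal sketch of a variable-rename transformation using Python's `ast` module. This is an illustration only, not the ReCode implementation; the class name `RenameVar` and the chosen identifiers are hypothetical.

```python
import ast

class RenameVar(ast.NodeTransformer):
    """Rename one identifier throughout a snippet: a perturbation that
    changes the surface form of the prompt but not its semantics."""
    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Name(self, node):          # uses of the variable
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node):           # function parameters
        if node.arg == self.old:
            node.arg = self.new
        return node

src = "def add(x, y):\n    return x + y"
tree = RenameVar("x", "num0").visit(ast.parse(src))
print(ast.unparse(tree))  # the function still computes the same sum
```

A robustness benchmark then compares the model's completions on the original and perturbed prompts, executing both against the same tests.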

Neural Belief Reasoner, IJCAI 2020

paper | models | poster | video

NBR is an ambitious project, designed as a model for System 2 with many desirable properties: unsupervised learning, reasoning under uncertainty, generation, and compositional generalization. Its main limitation is computational cost. Indeed, on a synthetic unsupervised learning task, NBR reasons like a human.

The complexity for supervised learning is practical, and NBR advances the state of the art in adversarial robustness. The results suggest that robustness comes from reasoning, that is, from an ensemble of classifiers with special relations. This insight inspired a follow-up work that advances adversarial robustness even further.

L2-Nonexpansive Neural Networks, ICLR 2019

paper | models | poster | openreview

Adversarial examples prove that neural networks are ill-conditioned systems. What if we build neural networks with Lipschitz constant no more than 1?

L2NNN stands out among Lipschitz-constrained techniques because 1) it ensures the Lipschitz bound over the entire input space, not just at training data points; 2) its Lipschitz bound is strict, not an estimate obtained by, e.g., a few power iterations; and 3) it preserves the expressiveness of neural networks through new regularization and new nonexpansive nonlinearities.
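To make the "strict bound without power iteration" point concrete, here is a generic sketch of a provably nonexpansive linear map. It is not the L2NNN construction itself (which additionally relies on the new regularization and nonlinearities mentioned above); it simply scales a weight matrix by the Hölder bound ‖W‖₂ ≤ √(‖W‖₁‖W‖∞), which is strict and needs no iterative estimation.

```python
import numpy as np

def nonexpansive(W):
    """Scale W so its spectral norm is provably <= 1, using the strict
    Holder bound ||W||_2 <= sqrt(||W||_1 * ||W||_inf):
    max column abs-sum times max row abs-sum."""
    bound = np.sqrt(np.abs(W).sum(axis=0).max() * np.abs(W).sum(axis=1).max())
    return W / max(bound, 1.0)

rng = np.random.default_rng(0)
W = nonexpansive(rng.normal(size=(64, 32)))
print(np.linalg.norm(W, 2) <= 1.0)  # prints True
```

Composing such layers with 1-Lipschitz nonlinearities (e.g., ReLU) keeps the whole network's Lipschitz constant at most 1; the price, which L2NNN's design works to avoid, is that naive scaling like this sacrifices expressiveness.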

L2NNN advanced the state of the art in adversarial robustness at the time and became an essential component in follow-up works. Its OpenReview page unintentionally hosted a heated debate on the role of robustness in AI applications.

Besides robustness, L2NNN shows superb generalization: when 75% of the training labels are random noise, an L2NNN can still achieve 93.1% accuracy. It has a controllable trade-off between memorization (fitting the training labels) and generalization (learning generalizable features), a degree of control unheard-of in other neural networks.

Global Clocking Methodology: A Design Environment for Industry-leading High Frequency Global Clocks

Winner of IBM Corporate Award (highest technical honor in company)

paper 1 | paper 2

Clock networks are critical to the performance of microprocessors, as they directly impact operating frequency. High-frequency, low-skew clock networks that are robust to process, voltage, and temperature variations had always been custom or semicustom designs. Is it even possible for design automation software to match or exceed expert human designers on this mission-critical task?

The automated methodology delivers clock delivery networks that operate at up to 7 GHz, with below 5 picoseconds of skew within 500 μm Manhattan distance and below 10 picoseconds of skew across each clock grid. The software has been used to design all IBM microprocessors since 2009.

Power Optimization of Microprocessors

This suite trades off timing, dynamic power, and leakage power throughout the physical design flow: fast signoff-quality power analysis, gate sizing and threshold-voltage optimization both before and after routing, net-switching-aware placement, latch and clock-buffer sizing, and switching-aware logic restructuring. The software has been used to optimize generations of IBM microprocessors. Paper on one component.

Stochastic Preconditioning of Large Sparse Matrices

paper in SIAM Journal on Scientific Computing

early paper 1 (Best Paper Award) | early paper 2 

software download

Solving large sparse linear systems is the computational bottleneck in many applications, and the quality of the preconditioner is often the dominant factor in speed.

This is a radically new preconditioning approach that uses random walks to build an incomplete factorization. It dramatically outperforms other methods on diagonally dominant matrices, which arise from finite-difference methods, circuit simulation, VLSI placement optimization, power grid simulation, information retrieval (e.g., PageRank), and many other applications. It achieves condition numbers in the single digits, often 100x lower than competitors', so that, for example, Krylov-subspace iterative solvers need far fewer iterations to converge.
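The random-walk connection can be illustrated with a tiny Monte Carlo sketch. This shows only the underlying estimator, not the actual method, which harvests walk statistics into an incomplete factorization rather than solving entries directly. For a strictly diagonally dominant matrix with positive diagonal and nonpositive off-diagonals, each solution entry of Ax = b is the expected payoff of an absorbing random walk.

```python
import numpy as np

def walk_solve_entry(A, b, i, n_walks=20000, seed=0):
    """Monte Carlo estimate of x[i] for A x = b, where A is strictly
    diagonally dominant with positive diagonal and nonpositive
    off-diagonals. From node k, step to j with probability -A[k,j]/A[k,k],
    collect payoff b[k]/A[k,k] at each visited node, and stop (absorb)
    with the leftover probability."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    est = 0.0
    for _ in range(n_walks):
        k, total = i, 0.0
        while True:
            total += b[k] / A[k, k]
            probs = -A[k] / A[k, k]
            probs[k] = 0.0
            stay = probs.sum()           # probability the walk continues
            if rng.random() >= stay:     # absorbed with probability 1 - stay
                break
            k = rng.choice(n, p=probs / stay)
        est += total
    return est / n_walks

A = np.array([[4.0, -1.0, -1.0],
              [-1.0, 4.0, -1.0],
              [-1.0, -1.0, 4.0]])
b = np.array([2.0, 2.0, 2.0])
est = walk_solve_entry(A, b, 0)
print(est)  # exact solution of this system is x = [1, 1, 1]
```

The identity behind the estimator is simply row i of the system rearranged: x_i = b_i/a_ii + Σ_j (−a_ij/a_ii) x_j, which is exactly an expected-payoff recurrence for the walk.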

Techniques are presented that extend the theory to some non-diagonally-dominant matrices, though without convergence guarantees.

Fast Poisson Solvers for Thermal Analysis

paper 1 | paper 2 

Fast Poisson solvers (FPS) can solve finite-difference Poisson problems on 2-D domains with Dirichlet boundary conditions in O(NlogN) time. That's cute, but can we use it to solve real-life problems?
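For reference, here is a minimal FPS sketch, assuming SciPy's `scipy.fft` discrete-sine-transform routines, a 5-point-stencil discretization on a uniform grid, and zero Dirichlet boundaries. The DST-I diagonalizes the discrete Laplacian, so the solve reduces to two transforms and a pointwise division.

```python
import numpy as np
from scipy.fft import dstn, idstn

def poisson_dirichlet(f, h=1.0):
    """Solve the 5-point discrete Poisson equation -Lap(u) = f on an
    n-by-m interior grid with spacing h and zero Dirichlet boundaries,
    in O(N log N) time via the type-I discrete sine transform."""
    n, m = f.shape
    # Eigenvalues of the 1-D second-difference operator tridiag(-1, 2, -1)
    lam_i = 4 * np.sin(np.pi * np.arange(1, n + 1) / (2 * (n + 1))) ** 2
    lam_j = 4 * np.sin(np.pi * np.arange(1, m + 1) / (2 * (m + 1))) ** 2
    fhat = dstn(f, type=1)                      # to the sine-mode basis
    uhat = h * h * fhat / (lam_i[:, None] + lam_j[None, :])
    return idstn(uhat, type=1)                  # back to grid values
```

A quick sanity check is to apply the 5-point stencil to an arbitrary grid to manufacture f, then confirm the solver recovers the original grid.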

This project: 

Blink Proximity

paper 

Humans have intuitions for graph proximity. Given a graph, certain pairs of nodes are perceived to have stronger relation strength than others. In the picture on the left, most humans say that A has stronger relation to B2 than to B1, yet no previous proximity metric says so. Is it possible to design a metric that matches human intuition all the time?

Blink proximity is such a metric. It is tested on two link prediction tasks that reflect human behaviors: on Wikipedia, predict new inter-wikipage citation links added in a one-year period; on arXiv, predict new coauthorship relations formed in a two-year period. On both tasks, the Blink metric substantially outperforms competitors.