Email: <last><first> AT gmail DOT com
I build Waymo's machine learning infrastructure. Before transferring to Waymo, I wrote compilers for machine learning at Google:
- gpucc: an open-source, fully functional, high-performance CUDA compiler. It has powered numerous GPU applications at Google and is integrated into LLVM. I presented this work on behalf of my team at LLVM Dev 2015, CGO 2016, and GTC 2016. The slides are here.
- XLA: a domain-specific compiler for linear algebra. It presently lives under TensorFlow and accelerates TensorFlow computations.
I completed my PhD in Computer Science at Columbia University, where I worked with Professor Junfeng Yang on several projects related to software reliability and programming languages. During my years at Columbia, I interned at Microsoft Research Silicon Valley and Facebook, working on DryadLINQ and HHVM, respectively.
I used to be an active competitive programmer. I placed fifth in the China National Olympiad in Informatics (the Chinese equivalent of USACO) in high school. During college and graduate school, I competed in two ACM/ICPC World Finals, Topcoder, and Google Code Jam. I also coached Columbia's programming contest team.