Wenlei Bao

Email: wenlei.bao@gmail.com

Address: Bellevue, WA

I received my Ph.D. from the Department of Computer Science and Engineering at The Ohio State University, for my work in the HPC Research Lab. My advisor was Prof. P. Sadayappan, and I also worked closely with Dr. Sriram Krishnamoorthy and Dr. Louis-Noël Pouchet. My research interests include high-performance and parallel computing, compiler optimizations, and polyhedral compilation. I am currently working on AI infrastructure.

News

Joined Apple.
Joined the Microsoft AI Framework Team.
Defended in April 2018.
Attending POPL'18 in LA.
Attending HiPEAC'17 in Sweden.

Experience

Aug. 2020 to Present: Apple.
Jun. 2018 to Aug. 2020: Microsoft AI Framework, developing a compiler-based, high-performance AI inference engine, Bellevue, WA.
Jun. to Dec. 2017: Nvidia Internship, optimizing Convolutional Neural Networks (CNNs) on GPUs, Redmond, WA.
May to Jul. 2015: Pacific Northwest National Laboratory (PNNL) Internship, Program Verification, Richland, WA.
May to Aug. 2014: Pacific Northwest National Laboratory (PNNL) Internship, Energy Optimization, Richland, WA.

Publications

NGEMM: Optimizing GEMM for Deep Learning via Compiler-based Techniques. Wenlei Bao, Li-Wen Chang, Yang Chen, Ke Deng, Amit Agarwal, Emad Barsoum and Abe Taha. arXiv.org, 2019.
Accelerating Recurrent Neural Networks through Compiler Techniques and Quantization. Li-Wen Chang, Yang Chen, Wenlei Bao, Amit Agarwal, Eldar Akchurin, Ke Deng, Emad Barsoum. Workshop on Systems for ML at NIPS, 2018.
Analytical Modeling of Cache Behavior for Affine Programs. Wenlei Bao, Sriram Krishnamoorthy, Louis-Noël Pouchet, P. Sadayappan. The 45th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'18), January, 2018.
Efficient Cache Simulation for Affine Computations. Wenlei Bao, Prashant Rawat, Martin Kong, Sriram Krishnamoorthy, Louis-Noël Pouchet, P. Sadayappan. The 30th International Workshop on Languages and Compilers for Parallel Computing (LCPC'17), October, 2017.
Static and Dynamic Frequency Scaling on Multicore CPUs. Wenlei Bao, Changwan Hong, Sriram Krishnamoorthy, C.D. Sudheer, Louis-Noël Pouchet, Fabrice Rastello and P. Sadayappan. The ACM Transactions on Architecture and Code Optimization (TACO'17), January, 2017. (Original work, invited to HiPEAC'17).
Effective Padding of Multidimensional Arrays to Avoid Cache Conflict Misses. Changwan Hong, Wenlei Bao, Albert Cohen, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, J. Ramanujam and P. Sadayappan. The 37th Annual ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'16), June, 2016.
PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs. Wenlei Bao, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, and P. Sadayappan. The 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'16), January, 2016.
PWCET: Power-Aware Worst Case Execution Time Analysis. Wenlei Bao, Sanket Tavarageri, Fusun Ozguner, and P. Sadayappan. Workshop at the 43rd International Conference on Parallel Processing (ICPP'14), September, 2014.

Professional Services

Reviewer for ACM Transactions on Architecture and Code Optimization (TACO).
Reviewer for Journal of Parallel and Distributed Computing (JPDC).
Reviewer for ACM Transactions on Embedded Computing Systems (TECS).
Reviewer for IEEE International Conference on High Performance Computing (HiPC'18).