Experience

Work Experience

    • Software Development Engineer, Amazon, New York, NY, April 2021 ~ Present

        • Worked on Annotation Hub, a platform that provides computer vision data for ML training

        • Worked on the AnalyzeID project, enabling a human workforce to identify and label government IDs

    • Senior Staff Engineer, Xilinx Research Labs, San Jose, CA, April 2020 ~ April 2021

        • Led 3 engineers on Zynq RFSoC board development and bring-up

        • Ported Vitis AI ML models, software libraries, and hardware designs onto embedded platforms

        • Developed a Pybind11 C++ binding magic that lets users write C++ in Jupyter notebooks

    • Staff Design Engineer, Xilinx Research Labs, San Jose, CA, September 2015 ~ April 2020

        • Served as the top contributor to the PYNQ (Python Productivity for Zynq) open-source framework

        • Developed Bash scripts in a QEMU environment to build SD card images for embedded platforms

        • Built Python CFFI/Python-C/Pybind11/SWIG bindings for C/C++ targeting multiple architectures

    • Summer Research Intern, Hitachi Global Storage Technologies, San Jose, CA, May 2013 ~ August 2013

        • Implemented a DDR3 memory controller in Verilog for DRAM and MRAM on FPGA

        • Verified the design using Xilinx ChipScope and detected MRAM read/write bit errors

    • Summer Intern, Shanghai Jiao Tong University, Shanghai, China, July 2008 ~ August 2008

        • Designed and optimized a wireless Frequency Modulated (FM) transmitter and an FM receiver

Research Projects

    • High-performance Packet / Traffic Classification on FPGA, May 2011 ~ Present

        • Led a research team developing a 2-dimensional pipelined architecture using Verilog on FPGA

        • Utilized logic cells and LUT-based distributed RAM to construct modular processing elements

        • Achieved 2x the throughput of prior work while supporting dynamic updates

        • Estimated post-simulation power in Vivado for classification / lookup engines on Virtex 5, 6, and 7 FPGAs

        • Explored the design space across parameters such as clock rate and resource utilization

    • Multi-flow Regular Expression Matching (REM), March 2011 ~ April 2012

        • Implemented REM engines processing up to 128 packet flows concurrently at a 200 MHz clock rate

        • Parsed packet headers and Snort / Bro regular expression patterns in the backend

        • Generated VHDL/Verilog files for large RTL designs using automated Perl scripts

        • Developed parameterizable designs and conducted design-space exploration using Tcl scripts

        • Optimized the RTL design using the PlanAhead tool and register retiming to meet clock constraints

Education

    • University of Southern California, Ph.D., Computer Engineering, 2011 ~ 2015

    • University of Southern California, Master's, Electrical Engineering, 2009 ~ 2011

    • Shanghai Jiao Tong University, Bachelor's, Electrical Engineering, 2005 ~ 2009

Activities

    • Proceedings chair for ANCS 2017

    • Web chair for ANCS 2014 and 2015