Building and optimizing large-scale ML feature engineering and data pipelines for advertising ranking systems
Consistently recognized for high engineering velocity, strong code review culture, and cross-functional collaboration
Designed and implemented an end-to-end pipeline to extract, embed, and featurize search keywords from URLs, converting unstructured web data into ML features for ad ranking models
Re-engineered offline inference pipelines to dramatically reduce computational cost and improve GPU utilization
Built a user-to-user and advertiser-to-advertiser similarity expansion system using embedding-based nearest-neighbor search to expand ML training datasets
Trained an RQ-VAE model for entity deduplication and built an LLM-as-judge evaluation pipeline to assess pairwise deduplication precision and recall
Baidu is the largest Chinese search engine and one of the largest AI and Internet companies in the world
Developing a diverse range of AI pipelines and applications across various domains utilizing Large Language Models (LLMs)
Developed an AI-powered advertising pipeline for marketing with AIGC-based ad creatives
Designed an automated trend/topic crawler and extractor for newsjacking in live streams
Built an LLM-based recommendation system for product selection in a vector database using multiple agents
Received the 2024 Baidu LLM Mastermind award
Oversaw and worked on a breadth of projects as a machine learning engineer
Designed and built an AI-assistant system to generate dialogues and tell data stories using GPT
Built pipelines and dashboards to analyze behavioral data collected in virtual training environments using LLMs
Developed a suite of 360/dialogue descriptive metrics for VR experiences
Implemented novel telemetric algorithms and a machine learning pipeline for immersive analytics
Designed and created the Strivr KPI schema and visualization tools for the executive leadership team
Worked with customers to design and execute evaluation plans, built reporting pipelines, and helped guide analytics strategy across products
Provided the travel and tourism industry with greater visibility into the needs and wants of in-market travel consumers
Trained machine learning models for online advertising (conversion/click predictions)
Developed a system to monitor the impact of Adara advertisements
Built an incremental model that identifies prospects who are likely to make a purchase only if prompted by a marketing activity
Modeled and measured tourism (destination marketing organization) advertising effectiveness
Performed A/B testing and statistical analysis
Collaborated with product managers to define and implement key metrics in analytic products
Analyzed marketing campaign performance and delivery
Won the best project award at the ADARA Summer 2019 Hackathon as team leader
ECE312 – Processors: Hardware, Software, and Interfacing
ECE417 – Embedded Microprocessor System Design
ECE567 – Database Design & Management
ECE577 – Data Mining
Worked with a cluster of 20+ Unix servers to process large-scale data for insurance companies
Developed functions in PostgreSQL database to store and manipulate large insurance datasets
Designed and developed VBA and command line tools for efficient and automatic processing
Wrote, reviewed, and verified reports and documents for other teams and clients
Multimedia Computing and Network
Image Analysis and Computer Vision
Participated in the founding of the Information Technology Center
Engaged in the establishment of “Portal of Chinese Science and Technology Resource”
Designed and implemented the web log management system
Participated in the design and implementation of a TCP/IP-based local area network