About me
My name is Chi Zhang, and I hold a PhD in computer science from Brandeis University, where my advisor was Professor Olga Papaemmanouil. My PhD research explored how machine learning can be used to solve database management problems such as query scheduling, query optimization, workload management, and physical schema design. I currently work at Meta Platforms, Inc. as a performance and capacity engineer. My Google Scholar profile contains further information.
Education
08/2018 - 01/2023, Brandeis University, Ph.D. in Computer Science, Supervisor: Prof. Olga Papaemmanouil
08/2014 - 05/2016, Worcester Polytechnic Institute, Master of Science in Data Science
Publications
Making Data Clouds Smarter at Keebo: Automated Warehouse Optimization using Data Learning, Barzan Mozafari, Radu Burcuta, Alan Cabrera, Andrei Constantin, Derek Francis, David Grömling, Alekh Jindal, Maciej Konkolowicz, Valentin Marian-Spac, Yongjoo Park, Russell Razo Carranzo, Nicholas Richardson, Abhishek Roy, Aayushi Srivastava, Isha Tarte, Brian Westphal, Chi Zhang, In Proceedings of the 2023 International Conference on Management of Data (SIGMOD 2023)
Multi-agent Databases via Independent Learning, Chi Zhang, Olga Papaemmanouil, Josiah Hanna. (AIDB@VLDB 2022)
Buffer Pool Aware Query Scheduling via Deep Reinforcement Learning, Chi Zhang, Ryan Marcus, Anat Kleinman, Olga Papaemmanouil. The 2nd International Workshop on Applied AI for Database Systems and Applications (AIDB@VLDB 2020)
NashDB: Fragmentation, Replication, and Provisioning using Economic Methods (Demo), Ryan Marcus, Chi Zhang, Shuai Yu, Geoffrey Kao, Olga Papaemmanouil. In Proceedings of the 45th International Conference on Very Large Data Bases (VLDB 2019)
Neo: A Learned Query Optimizer, Ryan Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, Tim Kraska, Olga Papaemmanouil, Nesime Tatbul. In Proceedings of the 45th International Conference on Very Large Data Bases (VLDB 2019)
Learning from Demonstration for Join Ordering Enumeration Techniques, Chi Zhang, Ryan Marcus, Olga Papaemmanouil, New England Database Summit, 2019 (NEDB 2019)
AdaptiveQueue: multi-query scheduling via deep reinforcement learning, Chi Zhang, Olga Papaemmanouil. In preparation
Patent
Candidate Projection Enumeration based on Query Response Generation, Chi Zhang, Zhibo Peng, Yuanzhe Bei, Olga Papaemmanouil. US Patent App. 16/717,615
Professional Experience
Performance and Capacity Engineer, Meta Platforms, Inc., New York, New York 12/2023 - Current
Capacity Planning@Infrastructure
Research Scientist, Keebo Inc, Massachusetts 01/2023 - 11/2023
Designed and implemented machine learning algorithms for data warehouse optimization on cloud platforms, focusing on Snowflake.
Developed a validation mechanism for learning algorithms to enhance system performance, reduce costs, and accelerate query processing, leading to improved system reliability and user satisfaction.
Validated the methodology by writing and executing test plans, debugging, and testing scripts and tools.
Collaborated with a cross-functional team, applying expertise in machine learning and database systems to drive innovation in warehouse optimization.
Technologies used: Python, Java, SQL, Machine Learning, Deep Reinforcement Learning, Airflow, GCP, Linux.
PhD Researcher, Brandeis University, Waltham, Massachusetts 08/2018 - 01/2023
Published 5 papers at top-tier database conferences/journals.
Led research projects on machine learning for query optimization, scheduling, and workload management.
Collaborated with the research team to identify relevant challenges and determine the best methods of data collection.
Developed research proposals, identified research objectives and created research plans.
Senior Data Scientist, Trip.com, Shanghai, China 10/2017 - 08/2018
Utilized advanced statistical and data mining techniques to identify and address existing or potential challenges in hotel operations.
Worked with stakeholders to develop quarterly roadmaps based on impact, effort, and test coordination.
Led data analytics for Trip.com’s overseas hotels, focusing on enhancing market share and competitiveness in areas like pricing, user experience and hotel coverage.
Engaged closely with cross-functional teams, providing them with actionable insights through data analytics reports that contributed to better business decisions.
Data Scientist, Amazon, Santa Monica, California 06/2016 - 09/2017
Worked with data end to end.
Managed the complete data lifecycle, from raw data extraction and transformation (ETL) to data storage in platforms like S3, Postgres and Redshift.
Designed and built novel machine learning algorithms to predict and test performance for Ring doorbell devices, including battery longevity, security measures and user behaviors.
Developed intuitive visual dashboards using tools like d3 and matplotlib to monitor and display the status of Ring devices.
Collaboratively defined business questions and designed data mining processes to support various teams within Ring, such as product, sales, marketing, executive, and engineering.
Technologies used: Python, SQL, Machine Learning, Analytics, AWS, Airflow, Redshift, Tableau.
Presentations
The 48th International Conference on Very Large Data Bases, Sydney, Australia Aug. 2022
The 4th International Workshop on Applied AI for Database Systems and Applications, Sydney, Australia Aug. 2022
The 2nd International Workshop on Applied AI for Database Systems and Applications, Tokyo, Japan Aug. 2020
New England Database Summit. Massachusetts Institute of Technology, Cambridge, Massachusetts, USA Jan. 2020
The 45th International Conference on Very Large Data Bases, Los Angeles, California, USA Aug. 2019
The Vertica Research Talk, Cambridge, Massachusetts, USA Aug. 2019
New England Database Summit. Massachusetts Institute of Technology, Cambridge, Massachusetts, USA Jan. 2019
Services
Program Committee:
2023: KDD, ICDCS, IDEAS, BDS, ICCCBDA, SSDBM, PhD Workshop@ICDE, DEEM@SIGMOD, BigVis@EDBT, BDMS@DASFAA, EuroMLSys
2024: EuroMLSys
Journal Reviewer: IS, IEEE Access
Teaching Experience
COSI 12b: Advanced Programming Techniques in Java
COSI 105b: Software Engineering for Scalability
COSI 132b: Networked Information Systems
Contact
Email Address: chizhang at brandeis dot edu
Office Address:
Volen 110
Department of Computer Science
Brandeis University
415 South St, Waltham, MA 02453