Research Projects
VeloxML
NetsDB
An ongoing project to deploy and serve machine learning models from relational database systems.
Funding
NSF CAREER Award of 547,584 USD
IBM Global University Program Academic Award of 40,000 USD
Publications
[SoCC 2023] Hong Guan*, Saif Masood*, Mahidhar Dawrampudi*, Venkatesh Gunda*, Hong Min, Lei Yu, Soham Nag*, Jia Zou. A Comparison of End-to-End Decision Forest Inference Pipelines, 2023 ACM Symposium on Cloud Computing SoCC'23 [16 pages][To Appear]
[IEEE Big Data 2023] Ankita Sharma*, Xuanmao Li*, Hong Guan*, Guoxin Sun*, Liang Zhang, Lanjun Wang, Kesheng Wu, Lei Cao, Erkang Zhu, Alexander Sim, Teresa Wu, Jia Zou. Automatic Data Transformation Using Large Language Model: An Experimental Study on Building Energy Data. 2023 IEEE International Conference on Big Data. (Industrial and Government Track) Pre-Print CoRR abs/2309.01957 (2023) [PDF] [10 pages][To Appear] (The first four authors have equal contributions to the work)
[VLDB 2022] Lixi Zhou, Jiaqing Chen, Amitabh Das, Hong Min, Lei Yu, Ming Zhao, and Jia Zou. "Serving Deep Learning Models with Deduplication from Relational Databases." (To Appear in VLDB 2022). [14 pages][PDF]
Lixi Zhou, Arindam Jain, Zijie Wang, Amitabh Das, Yingzhen Yang, and Jia Zou, "Benchmark of DNN Model Search at Deployment Time." (To Appear in SSDBM 2022). [12 pages]
Powehi
An ongoing project to provide automatic data and query optimization for distributed machine learning applications.
Funding
ASU start-up funding
Publications
[SSDBM 2023] Lixi Zhou*, Lei Yu, Jia Zou, Hong Min. Privacy-Preserving Redaction of Diagnosis Data through Source Code Analysis. In Proceedings of the 35th International Conference on Scientific and Statistical Database Management, SSDBM 2023 [4 pages] [PDF]
Jia Zou, Amitabh Das*, Pratik Barhate*, Arun Iyengar, Binhang Yuan, Dimitrije Jankov, and Chris Jermaine. "Lachesis: Automated Generation of Persistent Partitionings for UDF-Centric Analytics." arXiv:2006.16529 [cs.DB] (VLDB 2021) [14 pages]
Binhang Yuan, Dimitrije Jankov, Jia Zou, Yuxin Tang, Daniel Bourgeois, and Chris Jermaine. "Tensor Relational Algebra for Machine Learning System Design." arXiv:2009.00524 [cs.DB] (VLDB 2021) [13 pages]
Jia Zou, "Using Deep Learning Models to Replace Large Materialized Views in Relational Database", CIDR 2021 (Abstract) [1 page]
Zijie Wang*, Lixi Zhou*, Jia Zou. "Integration of Fast-Evolving Data Sources Using A Deep Learning Approach." SFDI 2020, workshop co-located with VLDB 2020 [14 pages]
Jia Zou, Ming Zhao, Juwei Shi and Chen Wang. "WATSON: A Workflow-based Data Storage Optimizer for Analytics." MSST 2020 [14 pages]
Past Research Projects
An open-source, high-performance, high-productivity distributed object-oriented database for large-scale analytics on big data. This is my work at Rice University, supervised by Prof. Chris Jermaine.
Funding Source: DARPA MUSE program, award No. FA8750-14-2-0270
Publications:
1. Jia Zou, Arun Iyengar, Chris Jermaine. Architecture of a distributed storage that combines file system, memory and computation in a single layer. The VLDB Journal (2020). [25 pages]
2. Jia Zou, Arun Iyengar, Chris Jermaine. Pangea: Monolithic Distributed Storage for Data Analytics. 45th International Conference on Very Large Data Bases. (VLDB'19). [14 pages]
3. Jia Zou, R Matthew Barnett, Tania Lorido-Botran, Shangyu Luo, Carlos Monroy, Sourav Sikdar, Kia Teymourian, Binhang Yuan, Chris Jermaine. PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development. Proceedings of the 2018 International Conference on Management of Data. (SIGMOD'18) [16 pages]
EasyNoSQL (Apr 2011 ~ Sept 2015)
MRTuner, a system that automatically tunes dozens of Hadoop parameters for optimal performance and can achieve more than 10x speedup for Hadoop.
A schema discovery algorithm that automatically extracts data skeletons from schema-less NoSQL data stores.
Analysis of China Mobile's large-scale social network data using open-source NoSQL technology.
Funding Source: IBM Research - China, China Mobile
Industrial Impact: Used in IBM customer pilots with China Merchants Bank and others.
Publications:
Lanjun Wang, Oktie Hassanzadeh, Shuo Zhang, Juwei Shi, Limei Jiao, Jia Zou, Chen Wang. Schema Management for Document Stores. Proceedings of the 41st International Conference on Very Large Data Bases. (VLDB'15) [12 pages]
Juwei Shi, Jia Zou, Jiaheng Lu, Zhao Cao, Shiqiang Li, Chen Wang. MRTuner: A Toolkit to Enable Holistic Optimization for MapReduce Jobs. Proceedings of the 40th International Conference on Very Large Data Bases. (VLDB'14) [12 pages]
Jia Zou, Juwei Shi, Tongping Liu, Zhao Cao, Chen Wang. Foreseer: Workload-aware Data Storage for MapReduce. Proceedings of the 35th International Conference on Distributed Computing Systems. (ICDCS'15) [2 pages]
Granted Patents:
with Juwei Shi, Chen Wang, et al. Method and apparatus for generating schema of non-relational database. US Patent 10002142 B2, 2018
with Li Li, Juwei Shi, et al. Resource management in MapReduce architecture and architectural system. US Patent 9582334 B2, 2017
with Zhao Cao, Juwei Shi, et al. Scheduling and execution of tasks based on resource availability. US Patent 9495206 B2, 2016
with Kun Wang, Tianyi Wang, et al. Data processing method, data query method in a database, and corresponding device. US Patent 9471612 B2, 2016
with GuanCheng Chen, Juwei Shi, et al. Method and apparatus for processing database data in distributed database system. US Patent 9411867 B2, 2016
with Xiaotao Chang, Fei Chen, et al. Method and system for allocating FPGA resources. US Patent 9389915 B2, 2016
with Heng Cao, Juwei Shi, et al. Determining location of a user of a mobile device. US Patent 9374800 B2, 2016
ZSignal (Apr 2009 ~ Sept 2010)
A project to automatically find insights in performance data generated on IBM System Z servers, in order to improve hardware and software synergy.
Funding Source: IBM System Z Research funding
Industrial Impact: Used internally for Z360 processor development
Publication:
Jia Zou, Jing Xiao, Rui Hou, Yanqi Wang. Frequent Instruction Sequential Pattern Mining in Hardware Sample Data. Proceedings of the 10th IEEE International Conference on Data Mining. (ICDM'10) [6 pages]
Granted Patent:
with Stephen Heisig, Yanqi Wang, et al. Computer system performance analysis. US Patent 8639697 B2, 2014
Multi-leg Stock Trading (Jul 2008 ~ Apr 2009)
A project to design and develop support for multi-leg orders, such as "sell 200 shares of IBM stock at 140 AND buy 300 shares of Google stock at 1000", in a distributed stock exchange. A look-ahead algorithm is designed to make a trade-off between fairness, concurrency, and isolation.
Funding Source: IBM System Z Product Funding
Publication:
Jia Zou, Gong Su, Arun Iyengar, Yu Yuan, Yi Ge. Design and Analysis of a Distributed Multi-leg Stock Trading System. Proceedings of the 31st International Conference on Distributed Computing Systems. (ICDCS'11) [12 pages]
Granted Patents:
with Arun Iyengar, Su Gong, et al. Methods and systems for highly available coordinated transaction processing. US Patent 9146944 B2, 2015
with Arun Iyengar, Su Gong, et al. Systems and methods for multi-leg transaction processing. US Patent 8601479 B2, 2013
High Performance SIP Server (Jul 2006 ~ Apr 2008)
Performance modeling and optimization of SIP servers on various parallel architectures.
Funding Source: Tsinghua-IBM Collaboration Project
Industrial Impact: Used internally as a reference for IBM WebSphere software development
Publications:
Jia Zou, Zhiyong Liang, Yiqi Dai. Scalability Evaluation and Optimization of Multi-core SIP Proxy Server. Proceedings of the 37th International Conference on Parallel Processing. (ICPP'08) [8 pages]
Jia Zou, Wei Xue, Zhiyong Liang, Yixin Zhao, Bo Yang, Ling Shao. SIP Parsing Offload: Design and Performance. Proceedings of the 50th IEEE Global Telecommunications Conference. (GLOBECOM'07) [6 pages]
Jia Zou, Yiqi Dai. Motivating and Modeling SIP Offload. Proceedings of the 16th International Conference on Computer Communications and Networks. (ICCCN'07) [6 pages]