Research Projects


VeloxML


NetsDB

An ongoing project to deploy and serve machine learning models from relational database systems.

Funding

Publications

[SoCC 2023] Hong Guan*, Saif Masood*, Mahidhar Dawrampudi*, Venkatesh Gunda*, Hong Min, Lei Yu, Soham Nag*, Jia Zou.  A Comparison of End-to-End Decision Forest Inference Pipelines,  2023 ACM Symposium on Cloud Computing SoCC'23 [16 pages][To Appear]

[IEEE Big Data 2023] Ankita Sharma*,  Xuanmao Li*,  Hong Guan*,  Guoxin Sun*, Liang Zhang, Lanjun Wang, Kesheng Wu, Lei Cao, Erkang Zhu, Alexander Sim, Teresa Wu, Jia Zou. Automatic Data Transformation Using Large Language Model: An Experimental Study on Building Energy Data. 2023 IEEE International Conference on Big Data. (Industrial and Government Track) Pre-Print CoRR abs/2309.01957 (2023) [PDF] [10 pages][To Appear] (The first four authors have equal contributions to the work)

[VLDB 2022] Lixi Zhou, Jiaqing Chen, Amitabh Das, Hong Min, Lei Yu, Ming Zhao, and Jia Zou. "Serving Deep Learning Models with Deduplication from Relational Databases." (To Appear in VLDB 2022). [14 pages][PDF]

Lixi Zhou, Arindam Jain, Zijie Wang, Amitabh Das,  Yingzhen Yang, and Jia Zou, "Benchmark of DNN Model Search at Deployment Time." (To Appear in SSDBM 2022). [12 pages]

Powehi

An ongoing project to provide automatic data and query optimization for distributed machine learning applications.

Funding

ASU start-up funding 

Publications

[SSDBM 2023] Lixi Zhou*, Lei Yu, Jia Zou, Hong Min. Privacy-Preserving Redaction of Diagnosis Data through Source Code Analysis. In Proceedings of the 35th International Conference on Scientific and Statistical Database Management, SSDBM 2023 [4 pages] [PDF]

Jia Zou, Amitabh Das*, Pratik Barhate*, Arun Iyengar, Binhang Yuan, Dimitrije Jankov, and Chris Jermaine. "Lachesis: Automated Generation of Persistent Partitionings for UDF-Centric Analytics."  arXiv:2006.16529 [cs.DB]  (VLDB 2021) [14 pages]

Binhang Yuan, Dimitrije Jankov, Jia Zou, Yuxin Tang, Daniel Bourgeois, and Chris Jermaine. "Tensor Relational Algebra for Machine Learning System Design." arXiv:2009.00524 [cs.DB]  (VLDB 2021) [13 pages]

Jia Zou, "Using Deep Learning Models to Replace Large Materialized Views in Relational Database", CIDR 2021 (Abstract) [1 page]

Zijie Wang*, Lixi Zhou*, Jia Zou. "Integration of Fast-Evolving Data Sources Using A Deep Learning Approach." SFDI 2020, workshop co-located with VLDB 2020 [14 pages]

Jia Zou, Ming Zhao, Juwei Shi and Chen Wang. "WATSON: A Workflow-based Data Storage Optimizer for Analytics." MSST 2020 [14 pages]

Past Research Projects

PlinyCompute (Sep 2015 ~ May 2019)

An open-source, high-performance, high-productivity distributed object-oriented database for large-scale analytics on big data. This is my work at Rice University, supervised by Prof. Chris Jermaine.

Funding Source:  DARPA MUSE program, award No. FA8750-14-2-0270

Publications: 

  1. Jia Zou, Arun Iyengar, Chris Jermaine. Architecture of a distributed storage that combines file system, memory and computation in a single layer. The VLDB Journal (2020). [25 pages]

  2. Jia Zou, Arun Iyengar, Chris Jermaine. Pangea: Monolithic Distributed Storage for Data Analytics.  45th International Conference on Very Large Data Bases. (VLDB'19). [14 pages] 

  3. Jia Zou, R Matthew Barnett, Tania Lorido-Botran, Shangyu Luo, Carlos Monroy, Sourav Sikdar, Kia Teymourian, Binhang Yuan, Chris Jermaine.  PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development. Proceedings of the 2018 International Conference on Management of Data. (SIGMOD'18)  [16 pages]

Media Coverage:


EasyNoSQL (Apr 2011 ~ Sept 2015)

Funding Source: IBM Research - China,  China Mobile

Industrial Impact: Used in IBM customer pilots with China Merchant Bank and others.

Publications: 

Granted Patents:


ZSignal (Apr 2009 ~ Sept 2010)

A project to automatically find insights in performance data generated on IBM System Z servers, with the goal of improving hardware and software synergy.

Funding Source: IBM System Z Research funding

Industrial Impact: Used internally for Z360 processor development 

Publication:

Jia Zou, Jing Xiao, Rui Hou, Yanqi Wang.  Frequent Instruction Sequential Pattern Mining in Hardware Sample Data. Proceedings of the 10th IEEE International Conference on Data Mining. (ICDM'10) [6 pages] 

Granted Patent:

with Stephen Heisig, Yanqi Wang, et al. Computer system performance analysis. US Patent US8639697 B2, 2014


Multi-leg Stock Trading (Jul 2008 ~ Apr 2009)

A project to design and develop support for multi-leg orders, such as "sell 200 shares of IBM stock at 140 AND buy 300 shares of Google stock at 1000", in a distributed stock exchange. A look-ahead algorithm is designed to trade off fairness, concurrency, and isolation.

Funding Source: IBM System Z Product Funding 

Publication:

Jia Zou, Gong Su, Arun Iyengar, Yu Yuan, Yi Ge. Design and Analysis of a Distributed Multi-leg Stock Trading System.  Proceedings of the 31st International Conference on Distributed Computing Systems. (ICDCS'11) [12 pages] 

Granted Patent: 

with Arun Iyengar, Su Gong, et al. Methods and systems for highly available coordinated transaction processing. US Patent 9146944B2, 2015

with Arun Iyengar, Su Gong, et al. Systems and methods for multi-leg transaction processing. US Patent 8601479B2, 2013


High Performance SIP Server (Jul 2006 ~ Apr 2008)

Performance modeling and optimization of a SIP server on various parallel architectures.

Funding Source: Tsinghua-IBM Collaboration Project

Industrial Impact: Used internally as a reference for IBM WebSphere software development

Publications: