
Dr. Zhang Zhenjie

Senior Research Scientist of
Advanced Digital Sciences Center
Adjunct Professor of
Guangdong University of Technology


Office: 1 Fusionopolis Way, #08-10 Connexis North Tower
Tel: +65-6591-9087
Email: zhenjie@adsc.com.sg




I received my B.S. from the Department of Computer Science and Engineering, Fudan University, in 2004. In 2010, I received my Ph.D. in Computer Science from the School of Computing, National University of Singapore. My Ph.D. thesis, advised by Prof. Anthony K. H. Tung, is about clustering and unsupervised learning on uncertain data. During my Ph.D., I also worked on other topics related to data analytics, including multi-criteria selection and high-dimensional indexing techniques. My current research interests focus on distributed databases, parallel streaming analytics, text mining and large-scale machine learning. Before joining ADSC in 2010, I worked as a research associate and research fellow at NUS. I was the recipient of the NUS President's Graduate Fellowship in 2008 and a best paper winner at IC2E 2013.




Projects

  1. Enabling medical research with differential privacy: the project team includes biomedical researchers from the Genome Institute of Singapore and from NUHS/NUS, along with data mining and security experts from ADSC, I2R, and NTU. The overall plan was for the biomedical researchers to identify the types of analyses for which they most wanted differentially private results, along with the accuracy those results needed in order to be useful. The computer scientists would then devise a differentially private version of each identified type of analysis and validate the quality of the results using data provided by the biomedical researchers.
  2. Scalable and Real-Time Analytics for Challenging Data: this project targets distributed stream processing systems, which are relatively easy to deploy, manage, and optimize on cloud platforms. The key driver for the specialization of this generic framework is the need for scalable, flexible, real-time response at reasonable cost. We design new technologies at all levels of the software stack to enable effective on-the-fly analytics on challenging data, including text, audio and video streams.
  3. LEO: Learning-based Efficiency Optimization for centralized air-conditioning system: this is an NRF-funded project that aims to optimize the energy efficiency of HVAC systems through machine learning on streaming sensor data and real-time reconfiguration of the chiller plants. This is a 2-year project starting in July 2016.


Recent and Representative Publications
    For complete publication list and citation statistics, please refer to DBLP and Google Scholar
  1. Hoang Dung Vu, Kok Soon Chai, Bryan Keating, Nurislam Tursynbek, Boyan Xu, Kaige Yang, Xiaoyan Yang, Zhenjie Zhang, "Data Driven Chiller Plant Energy Optimization with Domain Knowledge", to appear in CIKM 2017.
  2. Ruichu Cai, Zhenjie Zhang, Zhifeng Hao, Marianne Winslett, "Sophisticated Merging over Random Partitions: A Scalable and Robust Causal Discovery Approach", to appear in IEEE Transactions on Neural Networks and Learning Systems (TNNLS).
  3. Tom Z. J. Fu, Richard T. B. Ma, Marianne Winslett, Yin Yang, Zhenjie Zhang, "DRS: Dynamic Resource Scheduling for Real-Time Analytics over Fast Streams", to appear in IEEE/ACM Transactions on Networking (ToN).
  4. Ruichu Cai, Zijie Lu, Li Wang, Zhenjie Zhang, Tom Z. J. Fu, Marianne Winslett, "DITIR: Distributed Index for High Throughput Trajectory Insertion and Real-time Temporal Range Query" (demo paper), to appear in PVLDB 2017.
  5. Junhua Fang, Rong Zhang, Tom Z. J. Fu, Zhenjie Zhang, Aoying Zhou, Junhua Zhu, "Parallel Stream Processing Against Workload Skewness and Variance", in HPDC 2017.
  6. Deokwoo Jung, Zhenjie Zhang, Marianne Winslett, "Vibration Analysis for IoT Enabled Predictive Maintenance", in ICDE 2017.
  7. Ning Wang, Xiaokui Xiao, Yin Yang, Zhenjie Zhang, Yu Gu, Ge Yu, "PrivSuper: a Superset-First Approach to Frequent Itemset Mining under Differential Privacy", in ICDE 2017.
  8. Zhida Chen, Gao Cong, Zhenjie Zhang, Tom Fu, Lisi Chen, "Distributed Publish/Subscribe Query Processing on the Spatio-Textual Data Stream", in ICDE 2017.
  9. Wenliang Chen, Zhenjie Zhang, Zhenhua Li, Min Zhang, "Distributed Representation for Building Profiles of Users and Items from Text Reviews", in COLING 2016.
  10. Junhua Fang, Rong Zhang, Xiaotong Wang, Tom Fu, Zhenjie Zhang, Aoying Zhou, "Cost-Effective Stream Join Algorithm on Cloud System", in CIKM 2016.
  11. Parijat Mazumdar, Li Wang, Marianne Winslett, Zhenjie Zhang, Deokwoo Jung, "An Index Scheme for Fast Data Stream to Distributed Append-Only Store", in WebDB 2016.
  12. Ganzhao Yuan, Yin Yang, Zhenjie Zhang, Zhifeng Hao, "Semidefinite Optimization for Linear Aggregate Query Processing under Approximate Differential Privacy", in SIGKDD 2016.
  13. Ruichu Cai, Zhenjie Zhang, Zhifeng Hao, Marianne Winslett, "Understanding Social Causalities Behind Human Action Sequences", to appear in IEEE Transactions on Neural Networks and Learning Systems.
  14. Ruichu Cai, Zhenjie Zhang, Srini Parthasarathy, Anthony K. H. Tung, Zhifeng Hao, Wen Zhang, "Multi-Domain Manifold Learning for Drug-Target Interaction Prediction", in SDM 2016.
  15. Jianbing Ding, Zhenjie Zhang, Richard T. B. Ma, Yin Yang, "Abacus: An Auction-Based Approach to Cloud Service Differentiation", in Computer Networks.
  16. Li Wang, Minqi Zhou, Zhenjie Zhang, Ming-Chien Shan, Yin Yang, Aoying Zhou, "Elastic Pipelining in An In-Memory Database Cluster", in SIGMOD 2016.
  17. Tom Fu, Jianbing Ding, Richard T. B. Ma, Marianne Winslett, Yin Yang, Zhenjie Zhang, Yong Pei, Bingbing Ni, "LiveTraj: Real-Time Trajectory Tracking over Live Video Streams", (demo paper), in ACM Multimedia 2015.
  18. Tom Fu, Jianbing Ding, Richard T. B. Ma, Marianne Winslett, Yin Yang, Zhenjie Zhang, "DRS: Dynamic Resource Scheduling for Real-Time Analytics over Fast Streams", in ICDCS 2015.
  19. Ruichu Cai, Zhifeng Hao, Marianne Winslett, Xiaokui Xiao, Yin Yang, Zhenjie Zhang, Shuigen Zhou, "Deterministic Identification of Specific Individuals from GWAS Results", in Bioinformatics.
  20. Ganzhao Yuan, Zhenjie Zhang, Marianne Winslett, Xiaokui Xiao, Yin Yang, Zhifeng Hao, "Optimizing Batch Linear Queries under Exact and Approximate Differential Privacy", in ACM TODS.
  21. Ling Gu, Minqi Zhou, Zhenjie Zhang, Ming-Chien Shan, Aoying Zhou, Marianne Winslett, "Chronos: An Elastic Parallel Framework for Stream Benchmark Generation and Simulation", in ICDE 2015. (System web page)
  22. Xianke Zhou, Sai Wu, Zhenjie Zhang, Gang Chen, Anthony K. H. Tung, Marianne Winslett, "PABIRS: A Data Access Middleware for Distributed File Systems", in ICDE 2015.
  23. Rong Zhang, Zhenjie Zhang, Xiaofeng He, Aoying Zhou, "Dish Comment Summarization Based on Bilateral Topic Analysis", in ICDE 2015.
  24. Tuan-Anh Nguyen Pham, Xutao Li, Gao Cong, Zhenjie Zhang, "A General Graph-based Model for Recommendation in Event-based Social Networks", in ICDE 2015.
  25. Li Wang, Minqi Zhou, Zhenjie Zhang, Ming-Chien Shan, Aoying Zhou, "NUMA-Aware Scalable and Efficient In-Memory Aggregation on Large Domains", in IEEE Transactions on Knowledge and Data Engineering (TKDE).
  26. Ruichu Cai, Zhenjie Zhang, Anthony K. H. Tung, Chenyun Dai, Zhifeng Hao, "A general framework of hierarchical clustering and its applications", in Information Sciences 272: 29-48 (2014).
  27. Xiaoli Wang, Xiaofeng Ding, Anthony K. H. Tung, Zhenjie Zhang, "Efficient and Effective KNN Sequence Search with Approximate N-Grams", in VLDB 2014.
  28. Jia Xu, Zhenjie Zhang, Xiaokui Xiao, Yin Yang, Ge Yu, Marianne Winslett, "Differentially Private Histogram Publication", in VLDB Journal.
  29. Sai Wu, Sheng Wang, Xiaoli Wang, Zhenjie Zhang, Anthony K. H. Tung, "K-Anonymity for Crowdsourcing", in IEEE Transactions on Knowledge and Data Engineering (TKDE).
  30. Jun Zhang, Xiaokui Xiao, Yin Yang, Zhenjie Zhang, and Marianne Winslett, "PrivGene: Differentially Private Model Fitting Using Genetic Algorithms", in SIGMOD 2013.
  31. Ruichu Cai, Zhenjie Zhang, and Zhifeng Hao, "SADA: A General Framework to Support Robust Causation Discovery", in ICML 2013.
  32. Zhenjie Zhang, Richard T. B. Ma, Jianbing Ding, Yin Yang, "ABACUS: An Auction-Based Approach to Cloud Service Differentiation" [slides], in IC2E 2013. (best paper award winner)
  33. Zhenjie Zhang, Hu Shu, Zhihong Chong, Hua Lu, Yin Yang, "C-Cube: Elastic Continuous Clustering in Clouds", in ICDE 2013.
  34. Jun Zhang, Zhenjie Zhang, Xiaokui Xiao, Yin Yang, and Marianne Winslett, "Functional Mechanism: Regression Analysis under Differential Privacy", in VLDB 2012.
  35. Jia Xu, Zhenjie Zhang, Anthony K. H. Tung, and Ge Yu, "Efficient and Effective Similarity Search on Probabilistic Data based on Earth Mover's Distance", in VLDB Journal, [Codes & Data].
  36. Daniel Yang Li, Zhenjie Zhang, Yin Yang, and Marianne Winslett, "Compressive Mechanism: Utilizing Sparse Representation in Differential Privacy", in WPES 2011.
  37. Zhenjie Zhang, Marios Hadjieleftheriou, Beng Chin Ooi, and Divesh Srivastava, "B^{ed}-Tree: An All-Purpose Tree Index for String Similarity Search on Edit Distance", in SIGMOD 2010.
  38. Zhenjie Zhang, Beng Chin Ooi, Srinivasan Parthasarathy, and Anthony K. H. Tung, "Similarity Search on Bregman Divergence: Towards Non-Metric Indexing", in VLDB 2009.
  39. Zhenjie Zhang, Hua Lu, Beng Chin Ooi, and Anthony K. H. Tung, "Understanding the Meaning of A Shifted Sky: A General Framework on Extending Skyline Query", in International Journal on Very Large Data Bases (VLDBJ).
  40. Zhenjie Zhang, Yin Yang, Ruichu Cai, Dimitris Papadias and Anthony K. H. Tung, "Kernel-Based Skyline Cardinality Estimation", in SIGMOD 2009.
  41. Zhenjie Zhang, Reynold Cheng, Dimitris Papadias and Anthony K. H. Tung, "Minimizing the Communication Cost for Continuous Skyline Maintenance", in SIGMOD 2009.
  42. Zhenjie Zhang, Laks Lakshmanan and Anthony K. H. Tung, "On Domination Game Analysis for Microeconomic Data Mining", in ACM Transactions on Knowledge Discovery from Data (TKDD).
  43. Zhenjie Zhang, Bing Tian Dai and Anthony K. H. Tung, "Estimating Local Optimums in EM Algorithm over Gaussian Mixture Model", in ICML 2008. [Technical Report]
  44. Chee-Yong Chan, H. V. Jagadish, Kian-Lee Tan, Anthony K. H. Tung and Zhenjie Zhang, "Finding k-Dominant Skylines in High Dimensional Spaces", in SIGMOD 2006.
  45. Chee-Yong Chan, H. V. Jagadish, Kian-Lee Tan, Anthony K. H. Tung and Zhenjie Zhang, "On High Dimensional Skyline", in EDBT 2006.



Honours and Awards

  1. NUS President's Graduate Fellowship in 2007
  2. Dean's Award in 2008
  3. Best Paper Award of IEEE International Conference on Cloud Engineering (IC2E 2013)



Professional Activities

  1. Program Co-Chair of APWeb 2015
  2. Program Committee Member of ICDE 2015
  3. Program Committee Member of SIGKDD 2014
  4. Publicity Co-Chair of IC2E 2014
  5. Publication Co-Chair of WAIM 2014
  6. Program Committee Member of PVLDB 2014
  7. Program Committee Member of WPES 2013
  8. Program Committee Member of IEEE Big Data 2013
  9. Program Committee Member of SIGKDD 2013
  10. Publication Chair of MobiDE'13
  11. Program Committee Member of SIGMOD 2013
  12. PC Co-Chair of PrivDB 2013, held in conjunction with ICDE 2013
  13. Program Committee Member of ICDE 2013 (demo track)
  14. Program Committee Member of EDBT 2013
  15. Program Committee Member of VLDB 2012
  16. Program Committee Member of MDM 2012
  17. Program Committee Member of APWeb 2012
  18. Program Committee Member of ICDE 2012
  19. Program Committee Member of APWeb 2011
  20. Program Committee Member of SIGKDD 2010
  21. Program Committee Member of VLDB 2010
  22. Program Committee Member of WWW 2010 (poster track)
  23. External Conference Reviewer for KDD 2006, VLDB 2006, WWW 2007, ICDE 2008, CIKM 2008, ICDE 2009, and others.
  24. Journal Reviewer for VLDB Journal, IEEE TKDE, JCST, and others.