The fourth edition of the Data Systems meet Data Science (DSDS) workshop, organized by the CREATE SustainSys: Sustainable Data Systems for Data Science program, will be held as part of the ICDE 2026 Conference in Montreal, Canada.
The workshop series brings together the research community working at the intersection of data/software systems, software engineering, and Data Science/AI/ML, either by building next-generation Data Science platforms or by using AI/ML techniques to improve systems.
The workshop is co-organized by:
Bettina Kemme, Essam Mansour, Hans-Arno Jacobsen, Natalie Enright Jerger, Oana Balmau, Semih Salihoğlu
Where: Montreal, QC
When: Monday, May 4, 2026
Registration: via the ICDE 2026 registration
Professor, Department of Computer Science
ETH Zurich (ETHZ)
Systems Group (Institute for Computing Platforms)
Distinguished Professor
of Computer Science
Lyon 1 University
France
Professor
of Computer Science
Technische Universität Berlin
Germany
Senior Research Scientist
Intel Labs & MIT
United States of America
Associate Professor
of Computer Science
University of Illinois at Chicago
United States of America
Assistant Professor
of Computer Engineering
Polytechnique Montreal
Canada
Senior Lecturer
of Computer Science
University of Fribourg
Switzerland
08:00 - 09:00 Registration
08:45 - 09:00 Welcome
Session 1
09:00 - 09:45 Gustavo Alonso Sustainable Data Systems
09:45 - 10:00 Boris Glavic Just-in-Time Model Replacement: Transparently Substituting LLMs with Cheaper Alternatives
10:00 - 10:15 Amine Mhedhbi Semantic Query Processing over Relations
10:15 - 10:30 Discussion
_______________________________________________________________________
10:30 - 11:00 Coffee Break
_______________________________________________________________________
Session 2
11:00 - 11:20 Angela Bonifati Property Graph Transformations in Action: From Data Integration to Causal Analysis
11:20 - 11:35 Ziawasch Abedjan Ad-hoc Data Integration in Data Lakes
11:35 - 11:50 Mourad Khayati Towards Multimodal Pipelines for Time Series Cleaning
Session 2 - Student Talks
11:50 - 12:00 Abrar Fuad PostLearn: Towards a Learned Index For PostgreSQL
12:00 - 12:10 Shubham Vashisth Making Learned Indexes LSM-compatible
12:10 - 12:20 Shiquan Zhang REMOP: REmote-Memory-aware OPerator Optimization
12:20 - 12:30 Discussion
_______________________________________________________________________
12:30 - 14:00 Lunch Break
_______________________________________________________________________
Session 3
14:00 - 14:20 Alkis Simitsis Efficient Execution of UDF Queries in Modern Data Engines
14:20 - 14:35 Niv Dayan Sustainable Frequency Estimation
14:35 - 14:50 Nesime Tatbul Towards Enterprise-Grade Text-to-SQL
Session 3 - Student Talks
14:50 - 15:00 Yunhao Mao Epoch-based Optimistic Concurrency Control in Geo-replicated Databases
15:00 - 15:10 Yiran Li Quantum Optimization for Sustainable Data Management
15:10 - 15:20 Waleed Afandi An LLM-Guided Query-Aware Inference System for GNN Models on Large Knowledge Graphs
15:20 - 15:30 Michalis Bachras Beyond Performance: Measuring the Environmental Impact of Analytical Databases
15:30 - 15:40 Mohaddeseh Yaghoubpour Toward Resource-Aware Evaluation of Machine Learning Methods
15:40 - 16:00 Discussion & Closing Remarks
We acknowledge the support of the Natural Sciences and Engineering Research Council of Canada (NSERC).