3rd Workshop on Novel Data Management Ideas on Heterogeneous Hardware Architectures (NoDMC)
co-located with BTW 2025 (Bamberg, Germany, March 3 - 7, 2025)
The objective of this one-day workshop is to explore the challenges and opportunities of data processing on existing and future heterogeneous hardware architectures. Today's processors are limited not only by the density and frequency of transistors, but also by their power and heat budgets. This constraint, often termed the power wall, forces hardware suppliers to design specialized devices optimized for specific computational tasks, resulting in a heterogeneous processor landscape. Consequently, software must explicitly adapt to ensure optimized performance and efficiency across varying hardware configurations.
Additionally, memory and storage have seen unprecedented change as well: novel and already commercially available techniques have blurred the traditional mental picture of a memory/storage hierarchy. Technologies like Non-Volatile RAM (NVRAM) challenge the long-standing memory hierarchy prevalent in system-level applications. Moreover, very large caches, High-Bandwidth Memory (HBM), Non-Uniform Memory Access (NUMA), processing-in-memory (PIM), remote-memory designs, and extremely fast SSDs add to the heterogeneous portfolio of memory/storage techniques. Therefore, to meet the performance requirements of the modern information society, tomorrow's database systems must exploit and embrace this increased heterogeneity of processor and memory technologies.
The third edition of this workshop will serve as a forum for discussing ongoing challenges, recent advancements, and future directions in the field. In contrast to previous editions, we aim to elevate tutorials to a primary role to train and foster a community of researchers and industry practitioners working on data processing challenges on heterogeneous hardware systems. These tutorials are intended to offer practical introductions to key concepts, libraries, or frameworks instrumental in engineering and optimizing data-intensive systems on modern heterogeneous hardware, covering topics such as (but not limited to) benchmarking, profiling, data parallelism, security considerations, or fault tolerance. Tutorials present a unique opportunity to enhance the visibility of valuable tools and methodologies that, while pivotal in published research, may not have been the focal point of the publications themselves. These tools and methodologies, crucial for the research community, can be showcased in depth, providing attendees with practical skills and insights that are directly applicable to their work.
Tutorials may use already published materials; however, presenters are expected to provide extensions or novel perspectives, such as innovative applications or deeper analyses. Each tutorial is expected to span 30 - 60 minutes. To improve the educational impact and engagement of each tutorial, we strongly encourage authors to incorporate interactive elements into their sessions. These can include live demonstrations or hands-on coding exercises, e.g., via Docker containers or platforms like GitHub Codespaces. To apply for a tutorial, authors should submit a description of 4 - 8 pages (LNI format) that includes a projected time frame.
Furthermore, the workshop is designed to offer a conducive environment for networking, encouraging future collaborations among participants. Especially in view of the fading SPP 2037 on Scalable Data Management for Future Hardware and the SPP 2377 on Disruptive Memory Technologies, we want to strengthen collaborations beyond individual SPP projects by connecting them with other researchers. To this end, the workshop also welcomes lightning talks that explore specialized or emerging hardware technologies, such as processing-in-memory devices or racetrack memory, which may still be nascent. These talks may also target applications that leverage cutting-edge hardware innovations. Each lightning talk is expected to span 15 - 20 minutes. Prospective presenters are required to submit a description of their talk, ranging from 6 to 10 pages (LNI format).
In addition to tutorials and lightning talks, the workshop seeks to empower young researchers by providing an opportunity to showcase their ongoing research related to heterogeneous hardware. Therefore, we are introducing a separate poster session designed to spotlight emerging findings and ideas within this dynamic field. To guarantee uniformity in presentation, a poster template will be provided to all participants. This session not only offers young researchers a platform to share their work but also enhances the potential for scholarly exchange and feedback among peers and experienced academics. Participants interested in the poster session are required to submit a brief description of their research, up to 4 pages (LNI format).
Each submission (tutorial, lightning talk, or poster) will be evaluated by at least three members of the program committee and a selection will be made on the basis of these reviews. Topics of the workshop include, but are not limited to:
Applications of modern hardware in data mining, data-intensive machine learning, query processing, sensor or stream processing, or non-traditional applications (e.g., graph processing)
Algorithms and data structures for efficient data processing on and across different (co-)processors or memory technologies
Exploitation of specialized ASICs or specialized memory technologies (e.g., PIM)
Efficient memory management, data placement, and data transfer strategies in heterogeneous systems
Energy efficiency in heterogeneous hardware environments
Programming models and hardware abstraction mechanisms for writing data-intensive algorithms on heterogeneous hardware
Query optimization, cost estimation, and operator placement strategies for heterogeneous hardware
Transaction processing in heterogeneous systems
This workshop is co-organized by the GI-Arbeitskreis Data Management on Modern Hardware.
Welcome
Keynote by Philippe Bonnet on Computational Storage (What is it Good For?)
Short Poster Presentations
Offset-Value Coding using SIMD Intrinsics - Schmeller, Florian (1); Rabl, Tilmann (1); Graefe, Goetz (2) (1: HPI, University of Potsdam, Germany; 2: Google, USA)
Dynamic Write-Mode Fragmentation for Non-Volatile Memory Simulation - Rau, Janina; Biebert, Daniel; Hakert, Christian; Chen, Jian-Jia (TU Dortmund University, Germany)
Embracing NVM: Optimizing Bε-Tree Structures and Data Compression in Storage Engines - Karim, Sajad (1); Wünsche, Fia (1); Broneske, David (2); Kuhn, Michael (1); Saake, Gunter (1) (1: Otto-von-Guericke University Magdeburg, Germany; 2: German Centre for Higher Education Research and Science Studies, Hanover, Germany)
Tutorial: Understanding Application Performance on Modern Hardware: Profiling Foundations and Advanced Techniques - Mühlig, Jan; Kühn, Roland; Teubner, Jens (TU Dortmund University, Germany)
Tutorial: Unleashing the Intel Data Streaming Accelerator - Berthold, André; Schmidt, Lennart; Lehner, Wolfgang; Schirmeier, Horst (TU Dresden, Germany)
Tutorial: Programming Processing-in-Memory for Data Management - Sattler, Kai-Uwe; Jibril, Muhammad Attahir (TU Ilmenau, Germany)
Lightning Talk: Feasibility Analysis of Semi-Permanent Database Offloading to UPMEM Near-Memory Computing Modules - Friesel, Birte Kristina; Lütke Dreimann, Marcel; Spinczyk, Olaf (Universität Osnabrück, Germany)
Lightning Talk: Lazy DBMS Storage Design with Computational Storage - Baumstark, Alexander; Sattler, Kai-Uwe (TU Ilmenau, Germany)
Tutorial: SIMD for Everyone - A tutorial to TSL - Pietrzyk, Johannes; Krause, Alexander; Lehner, Wolfgang (TU Dresden, Germany)
Tutorial: Dreaming of Syscall-less I/O with io_uring - Some Assembly Required, Feaver Dreams and Nightmares included - Pestka, Constantin (1); Paradies, Marcus (2) (1: DLR - German Center for Aerospace, Germany; 2: LMU - Munich)
Closing
Abstract: The potential benefit of near-data processing has been understood for decades. This potential has been explored through multiple research efforts. The advent of computational storage standards was supposed to fulfill this potential and lead to industry adoption. This did not happen. Why? This is the main question I address in this talk. I start by defining computational storage from first principles. The goal is to survey the design space and to position the standards in that space. I then focus on the lessons we learnt from Delilah, a prototype computational storage platform, that implements a variant of the NVMe standard. Finally, I discuss how these lessons apply to current efforts on in- or near-memory processing and co-designed SSDs.
Bio: Philippe Bonnet is professor in the Computer Science department of the University of Copenhagen (DIKU). For more than a decade, Philippe has worked in the area of Data Management with Flash Devices. His group has focused on Open-Channel SSDs, FTL design and computational storage.
Constantin Pestka, Marcus Paradies: Dreaming of Syscall-less I/O with io_uring - Some Assembly Required, Feaver Dreams and Nightmares included
Jan Mühlig, Roland Kühn, Jens Teubner: Understanding Application Performance on Modern Hardware: Profiling Foundations and Advanced Techniques
Kai-Uwe Sattler, Muhammad Attahir Jibril: Programming Processing-in-Memory for Data Management
André Berthold, Lennart Schmidt, Wolfgang Lehner, Horst Schirmeier: Unleashing the Intel Data Streaming Accelerator
Johannes Pietrzyk, Alexander Krause, Wolfgang Lehner: SIMD for Everyone - A tutorial to TSL
Birte Kristina Friesel, Marcel Lütke Dreimann, Olaf Spinczyk: Feasibility Analysis of Semi-Permanent Database Offloading to UPMEM Near-Memory Computing Modules
Alexander Baumstark, Kai-Uwe Sattler: Lazy DBMS Storage Design with Computational Storage
Janina Rau, Daniel Biebert, Christian Hakert, Jian-Jia Chen: Dynamic Write-Mode Fragmentation for Non-Volatile Memory Simulation
Florian Schmeller, Tilmann Rabl, Goetz Graefe: Offset-Value Coding using SIMD Intrinsics
Sajad Karim, Fia Wünsche, David Broneske, Michael Kuhn, Gunter Saake: Embracing NVM: Optimizing Bε-Tree Structures and Data Compression in Storage Engines
Paper Submission: December 15th, 2024 (23:59 CEST), extended from November 29th, 2024
Notification of acceptance: January 10th, 2025, moved from December 20th, 2024
Camera-ready copies: February 14th, 2025 (23:59 CEST), moved from January 15th, 2025
Registration: January 15th, 2025
There will be a "Best-of-BTW" issue of the "Datenbank-Spektrum" in 2025, and the best one or two papers of this workshop will be invited to this special issue.
The contributions must correspond to the layout specifications of the conference proceedings (LNI style, see https://gi.de/service/publikationen/lni). The page limits (not including references) are:
Tutorials: 4 - 8 pages
Lightning Talks: 6 - 10 pages
Posters: 2 - 4 pages
Manuscripts should be submitted electronically as PDF files at https://www.conftool.net/btw2025/index.php.
For the poster presentations, we provide a PPT template. You are encouraged to customize the sections, alter the color scheme, add new sections, or design your poster using a comparable layout.
Philippe Bonnet (University of Copenhagen)
Jan Mühlig (TU Dortmund University, Germany)
Roland Kühn (TU Dortmund University, Germany)
David Broneske (German Centre for Higher Education Research and Science Studies, Germany)
Dirk Habich (Technische Universität Dresden, Germany)
David Broneske (German Centre for Higher Education Research and Science Studies)
Patrick Damme (TU Berlin)
Philipp Götze (SAP SE)
Juliana Hildebrandt (TU Dresden)
Roland Kühn (TU Dortmund University)
Jan Mühlig (TU Dortmund University)
Hannes Rauhe (Potsdam Institute for Climate Impact Research)
Annett Ungethüm (TU Hamburg)
Stefan Wildermann (Friedrich-Alexander Universität Erlangen-Nürnberg)
Steffen Zeuch (TU Berlin)