Program

08:30--08:35 Greetings

Antonio Barbalace, Mark Silberstein and Zsolt Istvan welcome you to the workshop!

Title: Computing Using Time

Abstract: The traditional logic abstractions and digital design patterns we understand so well have co-evolved with the hardware technology that has embodied them and the environments that have hosted them. As we look to the future, there is no reason to think that those same abstractions and patterns best exploit the potential of novel devices or take on new applications. In this talk, we will reimagine the traditional digital/analog boundary and make a case for temporal computing, where time is neither a measure of quality nor a functional property but instead a computational resource. Towards this end, we will first introduce a computational temporal logic that establishes the foundation for this new paradigm. Then, we will demonstrate how this foundation opens up unique ways in which we can work with sensors and design machine learning systems. Finally, we will describe how temporal operators provide answers to several long-standing problems in pulse-based computing -- specifically, superconducting computing.

H. Lefeuvre, V. Bădoiu, A. Jung, S. Teodorescu, S. Rauch, F. Huici, C. Raiciu, P. Olivier

Abstract: At design time, modern operating systems are locked in a specific safety and isolation strategy that mixes one or more hardware/software protection mechanisms (e.g. user/kernel separation); revisiting these choices after deployment requires a major refactoring effort. This rigid approach shows its limits given the wide variety of modern applications’ safety/performance requirements, when new hardware isolation mechanisms are rolled out, or when existing ones break.

In this extended abstract, we present FlexOS, a novel OS allowing users to easily specialize the safety and isolation strategy of an OS at compilation/deployment time instead of design time. This modular LibOS is composed of fine-grained components that can be isolated via a range of hardware protection mechanisms with various data sharing strategies and additional software hardening. The OS ships with an exploration technique helping the user navigate the vast safety/performance design space it unlocks. FlexOS has been previously published at ASPLOS'22.
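To make the idea concrete, here is a toy model of specializing an isolation strategy at build time; the component names, mechanism labels, and the `needs_gate` helper are invented for this sketch and are not FlexOS's actual API:

```python
# Toy model: a configuration assigns each OS component to a compartment
# and picks a hardware mechanism to isolate compartments from each other.
# (Illustrative only; not the FlexOS build system.)
config = {
    "mechanism": "Intel MPK",                 # could be EPT, CHERI, ...
    "compartments": {
        "scheduler": 0, "filesystem": 0,      # trusted core, same domain
        "net_stack": 1,                       # isolated: parses untrusted input
        "app": 2,
    },
}

def needs_gate(caller, callee, cfg):
    """Cross-compartment calls go through a protection-domain switch;
    same-compartment calls remain plain function calls."""
    comp = cfg["compartments"]
    return comp[caller] != comp[callee]

# Same compartment: no domain switch needed.
assert not needs_gate("scheduler", "filesystem", config)
# Crossing into the isolated network stack: gate required.
assert needs_gate("app", "net_stack", config)
```

Changing the safety/performance trade-off then amounts to editing the configuration and rebuilding, rather than refactoring the OS.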

B. Teguia, M. Karaoui, B. Batchakui, A. Tchana

Abstract: GiantVM is the state-of-the-art distributed hypervisor (VEE 2020), based on KVM. It allows booting a guest OS whose CPU and memory are provided by several physical machines. In this paper, we make two contributions. First, we study the origin of GiantVM's performance overhead and propose several optimizations that improve its performance by about 39%. Second, unlike GiantVM, we propose a hypervisor-independent and flexible (easy-to-evolve) distributed shared memory (DSM) design for distributed virtual machines. We show an instantiation of this design for KVM, a popular virtualization system.
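As a rough illustration of what a DSM layer for distributed VMs does, here is a toy page-ownership model in Python; all class and method names are invented for this sketch and do not reflect the paper's actual design or code:

```python
# Toy DSM model (illustrative only): nodes cache pages, fault on pages
# they do not hold, and invalidate remote copies on writes.
class DSMNode:
    def __init__(self, node_id, dsm):
        self.id, self.dsm = node_id, dsm
        self.cache = {}                      # page number -> bytes

    def read(self, page):
        if page not in self.cache:           # "page fault": fetch remotely
            self.cache[page] = self.dsm.fetch(page)
        return self.cache[page]

    def write(self, page, data):
        # Keep caches coherent: drop every other node's copy first.
        self.dsm.invalidate(page, except_node=self.id)
        self.cache[page] = data
        self.dsm.store(page, data)

class DSM:
    def __init__(self, nodes):
        self.memory = {}                     # authoritative page contents
        self.nodes = [DSMNode(i, self) for i in range(nodes)]

    def fetch(self, page): return self.memory.get(page, b"\x00")
    def store(self, page, data): self.memory[page] = data
    def invalidate(self, page, except_node):
        for n in self.nodes:
            if n.id != except_node:
                n.cache.pop(page, None)

dsm = DSM(2)
a, b = dsm.nodes
a.write(0, b"hello")
assert b.read(0) == b"hello"     # b faults and fetches a's write
a.write(0, b"world")             # invalidates b's cached copy
assert b.read(0) == b"world"
```

The point of a hypervisor-independent design is that only the fault and fetch hooks touch the hypervisor; the protocol above would stay the same.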

S. Gupta, A. Bhattacharyya, Y. Oh, A. Bhattacharjee, B. Falsafi, M. Payer

Abstract: Virtual Memory (VM) is a critical programming abstraction, but its implementation overheads are becoming a performance bottleneck as cache hierarchy and memory capacities grow. This paper introduces Midgard as an intermediate address space between the virtual and physical address spaces. The Midgard address space contains a mapping of the deduplicated Virtual Memory Areas (VMAs) across all virtual address spaces, enabling it to be used as a namespace for the cache hierarchy without any synonym/homonym problems. Virtual-to-Midgard address translations can be done efficiently at VMA granularity with little caching support, as there are only ~10 frequently used VMAs. Furthermore, Midgard-to-physical address translations are only required when a cache block is absent from the cache hierarchy and has to be fetched from physical memory, making invocations rare for large-capacity cache hierarchies. Overall, Midgard future-proofs VM implementations by enabling VM overheads to scale with cache hierarchy capacity.
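The two-level translation can be illustrated with a toy model; the data layout and names below are invented for this sketch, not taken from the paper's artifact:

```python
# Toy Midgard-style translation (illustrative only).
# A VMA maps a contiguous virtual range to a contiguous Midgard range,
# so virtual-to-Midgard translation is a range lookup plus an offset.
VMAS = [
    # (virt_base, size, midgard_base)
    (0x0000_1000, 0x4000, 0x10_0000),   # e.g. code
    (0x7fff_0000, 0x2000, 0x20_0000),   # e.g. stack
]

PAGE = 0x1000
# Midgard-to-physical mapping at page granularity; in the model it is
# consulted only when a block misses in the whole cache hierarchy.
MIDGARD_PAGE_TABLE = {0x10_0000 + i * PAGE: 0x5000_0000 + i * PAGE
                      for i in range(4)}

def virt_to_midgard(va):
    for base, size, mbase in VMAS:
        if base <= va < base + size:
            return mbase + (va - base)
    raise KeyError("no VMA covers this address")

def midgard_to_phys(ma):
    page = ma & ~(PAGE - 1)
    return MIDGARD_PAGE_TABLE[page] + (ma & (PAGE - 1))

ma = virt_to_midgard(0x0000_2345)   # cheap: ~10 VMAs to search
pa = midgard_to_phys(ma)            # needed only on a full-hierarchy miss
```

Because the cache hierarchy is indexed by `ma`, the expensive page-granularity step runs only on misses to memory.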


10:10--10:40 BREAK


Talk title: Accelerating Genome Analysis

Abstract:

Genome analysis is the foundation of many scientific and medical discoveries as well as a key pillar of personalized medicine. Any analysis of a genome fundamentally starts with the reconstruction of the genome from its sequenced fragments. This process is called read mapping. One key goal of read mapping is to find the variations and similarities that are present between the sequenced genome and reference genome(s) and to tolerate the errors introduced by the genome sequencing process. Read mapping is currently a major bottleneck in the entire genome analysis pipeline because state-of-the-art genome sequencing technologies are able to sequence a genome much faster than the computational techniques that are employed to reconstruct the genome. New sequencing technologies, like nanopore sequencing, greatly exacerbate this problem while at the same time making genome sequencing much less costly.


This talk describes our ongoing journey in greatly improving the performance of genome read mapping as well as broader genome analysis. We first provide a brief background on read mappers that can comprehensively find genomic variations/similarities and tolerate sequencing errors. Then, we describe both algorithmic and hardware-based acceleration approaches. Algorithmic approaches exploit the structure of the genome, the structure of the problem at hand, as well as the structure of the underlying hardware. Hardware-based acceleration approaches exploit specialized microarchitectures or new execution paradigms like processing in memory. We show that significant improvements are possible with both algorithmic and hardware-based approaches and their combination. We conclude with a foreshadowing of future challenges brought about by very low-cost new sequencing technologies and their potential use cases in public health, science, and medicine.
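As a minimal illustration of the read-mapping problem, here is a toy seed-and-extend mapper in Python (k-mer seeding plus edit-distance verification); it sketches the general technique, not any specific tool from the talk:

```python
# Toy read mapper (illustrative only): index reference k-mers, use seed
# hits to propose candidate positions, verify with edit distance.
def kmer_index(ref, k):
    idx = {}
    for i in range(len(ref) - k + 1):
        idx.setdefault(ref[i:i+k], []).append(i)
    return idx

def edit_distance(a, b):
    # Standard Levenshtein DP, one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[-1] + 1,          # insertion
                           prev[j-1] + (ca != cb)))  # (mis)match
        prev = cur
    return prev[-1]

def map_read(read, ref, idx, k, max_errors):
    hits = set()
    for s in range(0, len(read) - k + 1, k):     # non-overlapping seeds
        for pos in idx.get(read[s:s+k], []):
            hits.add(pos - s)                    # candidate alignment start
    best = None
    for start in hits:
        if start < 0 or start + len(read) > len(ref):
            continue
        d = edit_distance(read, ref[start:start + len(read)])
        if d <= max_errors and (best is None or d < best[1]):
            best = (start, d)
    return best                                  # (position, errors) or None

ref = "ACGTACGTTAGCCGATTACA"
idx = kmer_index(ref, 4)
best = map_read("ACGTTAGC", ref, idx, 4, 1)      # → (4, 0)
```

Both acceleration avenues in the talk target exactly these two phases: filtering candidate locations cheaply, and verifying the survivors with (approximate) string matching.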

D. Mvondo, A. Barbalace, J. Lozi, G. Muller

Abstract: In this position paper we argue that it is time for OS kernel-level schedulers to be user-programmable, at least by a category of users, without any security-related side effects. We introduce our preliminary design, which borrows the microkernel design principle of separating mechanisms from policies and applies it to monolithic OSes. All scheduling-related mechanisms are built into the OS kernel, while scheduling policies are modifiable at runtime by users' applications (with specific privileges).
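A toy model of this mechanism/policy split might look as follows; the `Kernel` class and its methods are hypothetical, chosen only to illustrate swapping a scheduling policy at runtime:

```python
# Toy mechanism/policy split (illustrative only): the "kernel" owns the
# run queue and dispatch (mechanisms); the pick function is the
# user-supplied policy and can be replaced while tasks are queued.
class Kernel:
    def __init__(self, policy):
        self.runqueue = []
        self.policy = policy           # user-programmable part

    def enqueue(self, task):           # mechanism: fixed, in-kernel
        self.runqueue.append(task)

    def set_policy(self, policy):      # policy swapped at runtime
        self.policy = policy

    def schedule(self):                # mechanism invokes the policy
        if not self.runqueue:
            return None
        task = self.policy(self.runqueue)
        self.runqueue.remove(task)
        return task

fifo = lambda rq: rq[0]
highest_prio = lambda rq: max(rq, key=lambda t: t["prio"])

k = Kernel(fifo)
k.enqueue({"name": "a", "prio": 1})
k.enqueue({"name": "b", "prio": 9})
assert k.schedule()["name"] == "a"     # FIFO picks arrival order
k.set_policy(highest_prio)             # change policy, no "reboot"
k.enqueue({"name": "a", "prio": 1})
assert k.schedule()["name"] == "b"     # new policy takes effect at once
```

The safety argument in the paper rests on the same boundary: only the pick function is user-controlled, so a buggy policy can pick badly but cannot corrupt the queue or the dispatch mechanism.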

M. Brunella, M. Bonola, A. Trivedi

Abstract: Since the inception of computing, we have been reliant on CPU-powered architectures. However, today this reliance is challenged by manufacturing limitations (CMOS scaling), performance expectations (stalled clocks, Turing tax), and security concerns (microarchitectural attacks). To re-imagine our computing architecture, in this work we take a more radical but pragmatic approach and propose to integrate three primary pillars of computing, i.e., networking, storage, and computing, into a single CPU-free Data Processing Unit (DPU) called Hyperion. We present our vision to make the Hyperion DPU self-sufficient and self-hosting, hence not needing to attach it to any host server, thus making it a genuinely CPU-free DPU. We present our initial work-in-progress details and seek feedback from the SPMA community.

T. Miemietz, M. Planeta, V. Reusch, J. Bierbaum, M. Roitzsch, H. Härtig

Abstract: We are approaching a world where the CPU merely orchestrates a plethora of specialized devices such as accelerators, RDMA NICs, or non-volatile memory (NVM). Such devices operate by mapping their internal memory directly into an application’s address space for fast, low-latency access. With the latency of modern I/O devices now low enough to make traditional system calls a performance bottleneck, kernel interaction has no place on the data path of microsecond-scale systems. However, kernel bypass prevents the OS from controlling and supervising access to the hardware.

This paper takes a step back by bringing the OS onto the critical path again, but with a reduced performance penalty. We pick up on previous ideas for reducing the cost of kernel interaction and propose the fastcall space, a new layer in the traditional OS architecture that hosts specialized, quickly accessible OS functions called fastcalls. Fastcalls can stay on the critical path of a microsecond-scale application because the transition to fastcall space is up to 15 times faster than to kernel space. We present and evaluate a prototype implementation of the fastcall framework, showing how much the overhead of calling into privileged mode can be reduced using standard CPU features.
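As a conceptual illustration only, fastcall dispatch can be thought of as a table of pre-registered, specialized handlers reached without generic syscall demultiplexing; the names below are invented, and the real mechanism of course lives in privileged CPU state, not Python:

```python
# Conceptual sketch (not the fastcall framework's API): applications
# pre-register the few specialized operations they need; invocation is
# a direct table lookup instead of a full, generic kernel entry.
FASTCALL_TABLE = {}

def register_fastcall(name, handler):
    # In the real system, registration installs the handler in fastcall
    # space ahead of time, outside the critical path.
    FASTCALL_TABLE[name] = handler

def fastcall(name, *args):
    # Cheap, direct dispatch on the critical path: no argument
    # demultiplexing through a generic syscall entry point.
    return FASTCALL_TABLE[name](*args)

register_fastcall("nic_send", lambda buf: f"sent {len(buf)} bytes")
result = fastcall("nic_send", b"hello")   # "sent 5 bytes"
```

The performance claim in the paper comes from the cheaper privilege transition itself, which this sketch cannot show; it only illustrates the specialization and pre-registration structure.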

C. Fonyuy-Asheri, D. Mvondo, F. Onanina, A. Tchana

Abstract: Virtualization, the backbone of cloud computing, offers several advantages such as the capability to live-migrate virtual machines (VMs). VM live migration is a crucial operation for tasks such as server maintenance and consolidation.

To date, migration is only possible between homogeneous servers (in terms of CPU micro-architecture). This limitation notably lengthens maintenance windows and reduces consolidation gains. In this paper, we introduce TGVM, a project that aims at tackling VM live migration between heterogeneous CPUs with the same ISA.
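One general ingredient of migrating across heterogeneous CPUs of the same ISA, presented here as an assumption rather than TGVM's actual approach, is checking that the target CPU exposes every feature the guest has already been allowed to use:

```python
# Toy compatibility check (illustrative only; feature names are just
# examples): a VM can safely resume on a target only if the target CPU
# supports every feature already exposed to the guest.
target_features = {"sse4.2", "avx"}          # what the target CPU offers
guest_used = {"sse4.2", "avx"}               # what the guest may have used

def can_migrate(used, target):
    # Subset test: every guest-visible feature must exist on the target.
    return used <= target

assert can_migrate(guest_used, target_features)
# A guest that was exposed to AVX2 cannot land on this target.
assert not can_migrate(guest_used | {"avx2"}, target_features)
```

In practice this is why hypervisors often pin guests to a conservative baseline CPU model; a heterogeneous-migration system has to manage such feature exposure dynamically instead.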