System
High-Performance Computing (HPC)
Security
Cloud Computing
Artificial Intelligence
Network
Web Privacy
07:00pm Welcome reception
07:30am-08:30am Breakfast
Session 1 - Network
Session chair : Romain Rouvoy
08:45am-10:00am Keynote: Daniel Hagimont
Distributed TCP sessions
Abstract: Resource management and saving is an important goal for datacenter operators and has been the subject of many research works. Most of these works focused on CPU and memory management (e.g. server consolidation in virtualized datacenters), but very few addressed the issue of IO management and saving. Our objective is to improve resource management from the point of view of IOs. We consider online services hosted in datacenters. For modularity and scalability reasons, most online services deployed in datacenters rely on a multi-tier architecture where several software components are deployed on different machines. Interactions between tiers follow the classical client-server model. Examples are clustered Web sites (HTTP) or mail hosting platforms (IMAP). In such applications, a client (external to the datacenter) can connect with TCP to a frontal tier (the entry point) and submit a request, triggering a distributed execution which flows between tiers. We observe that in such services, a significant amount of the data returned to the client is emitted by tiers in this architecture, without any handling by the other tiers that lie between the emitting tier and the frontal tier (which hosts the connection with the client). The tiers which precede the emitting tier only forward such data up to the frontal tier, which returns the data to the client. Therefore, the network communications used to transfer such data through the multi-tier architecture to the frontal tier are needless and a significant source of waste in the datacenter (in terms of CPU, memory and IOs). We propose the introduction of a shortcut mechanism, which is equivalent to making the TCP connection with the client remotely accessible from any tier in the architecture. Two problems make it hard to implement such shortcuts. First, emitting tiers have to be coordinated in order to enforce TCP's internal consistency (sequence numbers). Second, shortcuts compromise the internal consistency of applications (as they no longer receive data as expected), which have to be modified accordingly. We show that modifications to applications can be minimal and that it is even possible to implement shortcuts without any modification to applications. Finally, we conducted a performance evaluation that shows that shortcuts allow significant resource savings in the datacenter, without degrading communication performance in terms of latency and throughput.
Daniel Hagimont is a Professor at Polytechnic National Institute of Toulouse, France and a member of the IRIT laboratory, where he leads a group working on operating systems, distributed systems and middleware. He received a PhD from Polytechnic National Institute of Grenoble, France in 1993. After a postdoc at the University of British Columbia, Vancouver, Canada in 1994, he joined INRIA Grenoble in 1995. He took his position of Professor in Toulouse in 2005.
10:00am - 10:30am: break
10:30am - 12:30pm: student presentations
BOUTALBI Samia: Mobile Edge Slice Broker: Mobile Edge Slices Deployment in Multi-Cloud Environments
KP Govind: Towards Energy-Efficient Stream Processing
GUGLIELMINO Mathieu: Interactive Design of Multilayer Network Topologies
12:30pm - 04:30pm: Lunch & free time
04:30pm - 05:00pm: Coffee break
Session 2 - High-Performance Computing
Session chair : Sophie Cerf
05:00pm - 06:15pm Keynote: Olivier Beaumont
Memory saving techniques when training Deep Neural Networks
Abstract: Training DNNs has become a very important source of HPC resource usage. In this talk, I will mainly focus on issues related to optimizing memory consumption during training (not inference). We will first consider the case of a single GPU using techniques such as re-materialization or offloading (of weights and activations). We will then consider the parallel context and study how (data, model, tensor) parallelism(s) and memory usage interact with each other.
Olivier Beaumont holds a Senior Scientist position at Inria in the Topal team (Inria Bordeaux). He defended his PhD thesis in Rennes in 1999 and his Habilitation à diriger des recherches in Bordeaux in 2004. He is the author of 25 journal papers and 79 papers in International conferences. Currently, he acts as Associate Editor in Chief of JPDC and he acted as track chair of the main HPC conferences (SC, IPDPS, ICPP, HIPC). His main research interests are combinatorial optimization problems arising in HPC (load balancing, scheduling, data distribution), with applications in Linear Algebra and Training of Deep Neural Networks.
06:15pm - 06:30pm: break
06:30pm - 07:30pm: student presentations
NGUYEN Hoang Minh: High-accuracy computation of rolling friction contact problems
ARSALANE Khaled: Stateful and Distributed Data Stream Processing
GNIBGA Wedan Emmanuel: Renewable Energy in Data Centers: the Dilemma of Electrical Grid Dependency and Autonomy Costs
08:00pm Dinner
07:30am-08:30am Breakfast
Session 3 - Cloud Computing
Session chair : Stéphane Delbruel
08:45am-10:00am Keynote: Adrien Lebre
Utility Computing, 60 years already and still challenges to solve
Abstract: “If computers of the kind I have advocated become the computers of the future, then computing may someday be organized as a public utility just as the telephone system is a public utility... The computer utility could become the basis of a new and important industry”, John McCarthy, speaking at the MIT centennial in 1961. Because it enables Utility Computing providers to go beyond the capacity of a single machine, distributed computing became the prevalent infrastructure for delivering IT as a service in the middle of the 1990s. However, due to the complexity of operating and using distributed computing platforms, it has taken almost three decades of research activities and technology transfers to popularize them in all domains of science and industry. In this lecture, I will first give an overview of the different generations of the Utility Computing paradigm, starting from the network of workstations model to the cloud computing one. For each type of architecture, I will illustrate what the main challenges have been and how the distributed computing community has addressed them. In the second part, I will focus on the intrinsic limits of cloud computing solutions as well as on the requirements of the new usages, which lead us to move from large data centers to many more, much smaller computing units deployed at the edge of the network. While this new model, called Edge Computing, offers many opportunities, it also poses challenges for our community, where latency, throughput, and network partitions become key criteria.
10:00am - 10:30am: break
10:30am - 12:30pm: student presentations
HUANG Chih-kai: Acala: Aggregate Monitoring for Geo-Distributed Cluster Federations
CHEN Jiavi: Distributed consensus performance evaluation on mobile nodes
JACQUET Pierre: Improve cloud resource utilization with dynamic oversubscription
12:30pm - 04:30pm: Lunch & free time
04:30pm - 05:00pm: Coffee break
Session 4 - AI
Session chair: Mohamed Maouche
05:00pm - 06:15pm Keynote: Prasenjit Mitra
NLP & Privacy
Abstract: In this talk, I will introduce the state-of-the-art in natural language processing, especially deep learning models and how to preserve privacy of our text in various domains and applications especially in this era of GDPR and other regulatory requirements. Thereafter I will discuss open problems in the field ranging from theoretical issues to practical technological factors that need addressing. Using a few case studies and applications, including some of my own research, we will discuss issues related to data traceability, computational overhead, issues related to scalability and large datasets, and human biases in embeddings along with the privacy-utility tradeoff.
Specifically, I will discuss data safeguarding methods, issues related to trust, verification methods, privacy threats and metrics for privacy evaluation in applications using NLP technologies. We will also look at privacy policy documents and the human-computer interaction issues involved in making them more accessible and readable to end-users. This will generally be a survey-style talk that aims to motivate some of the most important problems in the field and to provide an overview of what has been done and what can be done, with an eye towards spurring collaboration among the workshop attendees on what (at least I think) are the most important and impactful open problems in the field of privacy and natural language processing.
Prasenjit Mitra is a Professor at The Pennsylvania State University and a visiting Professor at the L3S Center at the Leibniz University at Hannover, Germany. He obtained his Ph.D. from Stanford University in 2003 in Electrical Engineering and has been at Penn State since. His research interests are in artificial intelligence, applied machine learning, natural language processing, etc. His research has been supported by the NSF CAREER award, the DoE, DoD, Microsoft Research, Raytheon, Lockheed Martin, Dow Chemicals, McDonnell Foundation, etc. He has published over 200 peer-reviewed papers at top conferences and journals and supervised or co-supervised 15-20 Ph.D. dissertations; his work has been widely cited (h-index 60, over 12,500 citations). Along with his co-authors, he has won the test of time award at the IEEE VIS and a best paper award at ISCRAM, etc.
06:15pm - 07:15pm: student presentations
NGO Duc Thinh: Graph Neural Networks for Digital Twins
BONNEAU Antoine: Ultra Low Power Ambient Artificial Intelligence
NAAMA Saloua: Ironing the graphs: toward geometric analysis of large scale graphs
07:30pm Dinner
07:30am-08:30am Breakfast
Session 5 - Security
Session chair: Julien Sopena
08:45am-10:00am Keynote: Maria Méndez Real
Side-Channel Attacks in IoT devices
Abstract: This presentation will provide an introduction to the domain of security relating to side-channel attacks in embedded systems in the context of the Internet of Things. With the increasing number of heterogeneous devices communicating and processing more and more information through the network, the security of IoT devices is a key challenge. Devices are exposed to attacks performed from software (i.e., malware) able to steal secret information and compromise the IoT ecosystem. Attack vectors include the use of shared resources such as processor memory, power management, and (embedded) sensor information.
Maria Méndez Real is an Associate Professor at Polytech Nantes Université within the IETR lab. She received her PhD from Université de Bretagne-Sud within Lab-STICC. Her work addresses security at the hardware level and at the interface between hardware and software. Her research focuses on practical security aspects of embedded systems, and on covert and physical side-channel attacks.
10:00am - 10:30am: break
10:30am - 12:30pm: student presentations
KHELILI Adrian: FiLiP: A File Lifecycle-based Profiler for hierarchical storage
WANG Yifan: Autonomous Database Management System
SECK Abdou: Transfer of large volumes of data between data centres
12:30pm - 04:30pm: Lunch & free time
04:30pm - 05:00pm: Coffee break
Session 6 - System
Session chair : Gil Utard
05:00pm - 06:15pm Keynote: Francesco Bronzino
Operationalizing Machine Learning Models in Real World Networks
Abstract: Applications of machine learning to networking, from performance diagnosis to security, have conventionally relied on models that are trained on offline packet traces, without regard to the limitations of existing measurement systems nor the cost of gathering, computing, and storing the corresponding input features. As a result, there remains a significant gap between the development of statistical models for network operations and their application and systemization in practice. In this talk, we explore the challenges of operationalizing machine learning models in real world networks. First, we develop new models to infer quality metrics (i.e., startup delay and resolution) for encrypted streaming video services and demonstrate the models are practical through a 16-month deployment in 66 homes. Building on the lessons learned, we design and develop Traffic Refinery, a new framework and system that enables a joint evaluation of both the conventional notions of machine learning performance (e.g., model accuracy) and the systems-level costs of different representations of network traffic. Traffic Refinery makes it possible to explore different representations for learning, balancing systems costs related to feature extraction and model training against model accuracy.
Francesco Bronzino is an Associate Professor (Maître de Conférences) at École normale supérieure de Lyon. His research interests broadly focus on the Internet infrastructure and the services that populate it, particularly studying how to leverage emergent technologies to engineer software systems designed to measure and improve network service performance. His work has been published in top-tier conferences in this area such as ACM Sigmetrics, IEEE/ACM SEC, and PAM. Francesco received his Ph.D. in Electrical and Computer Engineering from WINLAB at Rutgers University, working on the design of name based services for future Internet and mobile network architectures.
06:15pm - 07:00pm: student presentations
PANDYA Himadri: Dispel: avoiding phantom cores in VM's task scheduler
DAN FREEMAN Mahoro: Digital twin of complex systems
THENOT Damien: Using SPDK with the Xen hypervisor
07:30pm Bus leaves for the social dinner
08:00pm Social dinner
07:30am-08:30am Breakfast and Checkout
Session 7 - Privacy
Session chair : Antoine Boutet
08:45am-10:00am Keynote: Oana Goga
How can we safeguard the micro-targeting of political ads? An algorithmic and regulatory perspective
Abstract: Online political advertising has grown significantly over the last few years. To monitor online sponsored political discourse, companies such as Facebook, Google, and Twitter have created public AdLibraries collecting the political ads that run on their platforms. Currently, both policy makers and platforms are debating further restrictions on political advertising to deter misuse. This presentation investigates whether we can reliably distinguish political ads from non-political ads. We take an empirical approach to analyze what kind of ads are deemed political by ordinary people and what kind of ads lead to disagreement. Our results show significant disagreement between what ad platforms, ordinary people, and advertisers consider political and suggest that this disagreement mainly comes from diverging opinions on which ads address social issues. Overall, our results imply that it is important to consider social issue ads as political, but they also complicate political advertising regulations.
Oana Goga has been a Chargée de Recherches (equivalent to a tenured faculty position) at CNRS in the SLIDE team at Laboratoire d’Informatique de Grenoble since October 2017. Prior to this, she was a postdoc at the Max Planck Institute for Software Systems working with Krishna Gummadi. She obtained her PhD in 2014 from the Pierre et Marie Curie University in Paris under the supervision of Renata Teixeira.
10:00am - 11:00am: student presentations
HUYGHE Maxime: Automated software testing to improve the privacy of browsers
BELGAID Mohammed Chakib: Green Coding: how to reduce the energy consumption of software
BECHORFA Mohamed El Amine: Privacy-Preserving Patient Monitoring Application
11:30am - 12:30pm: Lunch
01:00pm: Bus leaves the school towards Grenoble train station