Internship List 2022-2023

Web browsing is one of the main applications of the Internet today, as it gives access to a tremendous catalogue of information and services. Despite huge progress in network performance, end users still face situations where browsing is slow, especially in mobile environments. The reasons are multiple: a slow device, a slow network, a saturated WiFi network, or a bad wireless signal caused either by the distance to the base station (or to the access point) or by a high interference level. Monitoring the network to discover its status and troubleshoot it when web browsing is slow is of great help to end users, operators and service providers. Many tools exist, mostly of the active measurement type, which requires injecting probes and consumes a non-negligible amount of data (a speedtest consumes tens if not hundreds of megabytes). Further, no tool exists today to shed light on the origin of the performance degradation in a general scenario.

We are working at Inria on a project to monitor the network at almost no cost from within the browser. We do so by leveraging the wealth of measurement data available within the browser, such as the CPU consumption of the device, the page load time, the page rendering time, etc. This data does not require any probing of the network; it is freely available in the browser upon each page visit. With the help of controlled experiments and machine learning, we calibrate models that map the available data to the network status and classify it according to the origin of the problem. We have done this work for Chrome browsers on desktop machines. For the troubleshooting part, we did the training and validation on network emulators. The purpose of this project is to give this work a new dimension by extending it to mobile browsers, and to turn it into a plugin that runs on end-user mobile devices in real environments.

We will start by reviewing the work done so far and searching the literature for related work on web programming on mobiles (always within the browser). In particular, we will identify the limitations of the approach, determine whether it brings any new information on web browsing as compared to desktops, and identify the possibilities to run machine learning on mobiles in real time to perform both classification and model training (possibly following the federated learning framework). A first extension of the available plugin to mobile browsers, with application to new measurement data specific to mobile devices, is targeted at this stage.

The work will then continue with real experiments (in the wild or on a real platform such as R2lab) under realistic network conditions, in order to collect data and calibrate efficient models able to (i) inform the user about the status of the underlying network and its performance (e.g. download speed, network delay to the web server, quality of the wireless signal) without the need to inject probe packets into the network, and (ii) in case of slow web browsing, pinpoint the root cause of the problem. We will collect both data on the device at the browser level and network speed and delay information obtained with open speedtest tools, the latter serving as ground truth for the calibration of the machine learning models. The final objective of this research work is to come up with a complete, data-driven, lightweight (i.e. passive, with no probing) methodology able to continuously and efficiently monitor and troubleshoot mobile networks by leveraging the web browsing activity of the end user, and by sharing the knowledge with other users towards more efficient global learning and classification.
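As a purely illustrative sketch of the calibration and classification steps described above, the Python snippet below trains a nearest-centroid classifier that maps browser-level measurements to a root-cause label. The feature names (page load time, CPU usage), the labels and all numeric values are hypothetical, synthetic placeholders and not data or models from the project; a real study would use measurements from controlled experiments and most likely a more capable learning method.

```python
from collections import defaultdict
from math import dist

# Synthetic, hypothetical training samples (not project data):
# (page_load_time_s, cpu_usage_percent) -> root-cause label
TRAINING = [
    ((1.2, 20.0), "ok"),
    ((1.5, 25.0), "ok"),
    ((6.0, 22.0), "slow_network"),
    ((7.5, 30.0), "slow_network"),
    ((5.5, 95.0), "slow_device"),
    ((6.5, 90.0), "slow_device"),
]


def fit_centroids(samples):
    """Average the feature vectors of each label to get one centroid per label."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(TRAINING)
print(classify(centroids, (1.3, 22.0)))  # fast page load, low CPU
print(classify(centroids, (7.0, 25.0)))  # slow page load, low CPU
print(classify(centroids, (6.0, 92.0)))  # slow page load, CPU saturated
```

The features are deliberately left unscaled to keep the sketch short; with real measurements they would need to be normalized so that one metric does not dominate the distance.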

The approach developed in this project has several interesting applications that can be the subject of a PhD later for excellent and motivated students.

References:

[1] U. Goel, M. P. Wittie, K. C. Claffy and A. Le, "Survey of End-to-End Mobile Network Measurement Testbeds, Tools, and Services," in IEEE Communications Surveys & Tutorials, vol. 18, no. 1, pp. 105-123, Firstquarter 2016.

[2] Cise Midoglu, Konstantinos Kousias, Özgü Alay, Andra Lutu, Antonios Argyriou, Michael Riegler, Carsten Griwodz, Large scale “speedtest” experimentation in Mobile Broadband Networks, Computer Networks, Volume 184, 2021.

[3] Rui Yang, Ricky K. P. Mok, Shuohan Wu, Xiapu Luo, Hongyu Zou, and Weichao Li. 2022. Design and Implementation of Web-Based Speed Test Analysis Tool Kit. In Passive and Active Measurement: 23rd International Conference, PAM 2022, Virtual Event, March 28–30, 2022, Proceedings. Springer-Verlag, Berlin, Heidelberg, 83–96.

[4] Imane Taibi, Yassine Hadjadj-Aoul, Chadi Barakat. Leveraging Web browsing performance data for network monitoring: a data-driven approach. GLOBECOM 2022 - IEEE Global Communications Conference, Dec 2022, Rio de Janeiro / Hybrid, Brazil. pp.1-6.

[5] Othmane Belmoukadam, Thierry Spetebroot, Chadi Barakat. ACQUA: A user friendly platform for lightweight network monitoring and QoE forecasting. QoE-Management 2019 - 3rd International Workshop on Quality of Experience Management, Feb 2019, Paris, France.

2 - Deanonymizing blockchain users.

Who?

Name: Arnaud Legout

Mail: arnaud.legout@inria.fr

Telephone: +33 4 92 38 78 15

Web page: http://www-sop.inria.fr/members/Arnaud.Legout/

Where?

Place of the project: Inria Sophia Antipolis

Address: 2004 route des Lucioles

Team: DIANA

Web page: https://team.inria.fr/diana/

Pre-requisites if any: Python, blockchain and privacy course, highly motivated

What?

Most public blockchains such as Bitcoin or Ethereum are pseudonymous, which means that transactions are associated with addresses, but the owners of these addresses are unknown. The goal of this internship is to find and exploit side-channel attacks leveraging NFTs to deanonymize blockchain users, that is, to retrieve the real social identity associated with a blockchain address. You will first have to explore the blockchains that are most likely to be vulnerable to side-channel attacks, e.g., Ethereum, Tezos, Solana, etc., and select one blockchain to focus on. Then you will have to explore the social networks on which NFTs are publicized by their owners. Finally, you will have to build a proof of concept of the attack by retrieving the blockchain address of a target. This internship could continue with a Ph.D. for excellent students. For any additional information on the subject (and possible Ph.D. continuation), it is best to contact me directly.

Useful Information/Bibliography:

This internship can be followed by a Ph.D. thesis for excellent students.

3 - Monitoring mobile edge networks

Who?

Name: Chadi Barakat, Thierry Turletti

Mail: Chadi.Barakat@inria.fr, Thierry.Turletti@inria.fr

Web page: http://team.inria.fr/diana/chadi, http://team.inria.fr/diana/thierry-turletti/

Where?

Place of the project: Diana project-team, Inria centre at Université Côte d'Azur

Address: 2004, route des lucioles, 06902 Sophia Antipolis, France

Team: Diana

Web page: http://team.inria.fr/diana/

What?

Pre-requisites if any: Good knowledge in computer and mobile networks, network performance monitoring, and programming languages (C/C++, Python, shell)

Detailed description:

Networks are undergoing a revolution with the advent of virtualization and softwarization, which allow network functions and services to be deployed in data centres placed at the edge of the network [1,2]. Mobile operators are part of this revolution: they open their networks to content providers and offer them programming possibilities through the software-defined networking paradigm. All this is expected to lead in the near future to programmable mobile edge networks, where users and content providers can deploy services at the edge of the network to process data close to where they are generated and consumed. This will considerably increase the bandwidth and decrease the network delay, enabling a plethora of new services and applications such as IoT, industrial control, gaming, virtual reality, autonomous cars, etc. In some scenarios, it is even envisaged to follow a horizontal cross-operator approach for the collaboration between edge devices and edge clouds, leading to what is called the fog computing paradigm [3]. In all these scenarios, the network edge will face the challenge of deploying functions and services efficiently and of orchestrating communications so as to get the best from the underlying physical infrastructure [4,8]. This cannot be done without a monitoring plane that discovers and profiles the available computing resources at the edge in real time and, in parallel, provides stakeholders (operators, providers and end users) with sufficient information on the capacity and connectivity of these resources to optimize network management and the Quality of Experience (QoE) of end users [3,5,6]. Furthermore, this monitoring plane is essential to detect anomalies and troubleshoot the network in case of service degradation.

The challenges towards these objectives are multiple. First, the QoE of end users is known not to depend only on simple network metrics such as the delay or physical proximity, but rather on a complex set of metrics such as the bitrate in both directions, the jitter, the packet loss rate, the context of mobility, the device properties, etc. [7]. Collecting all these metrics in an accurate and timely way is a real challenge [3,5,6]. Further, given the large number of devices foreseen at the edge, their mobility and their time dynamics, the measurement plane has to be low-cost, able to scale with the number of users, devices and services, and able to track the whole system efficiently. This is another challenge facing the development of a monitoring plane for future edge networks.

The purpose of this internship is to benchmark the available monitoring solutions for mobile edge networks, and to evaluate their capacity to accommodate the requirements of different applications during the service orchestration phase. First, we will go over the literature to identify and classify the main monitoring solutions in this area based on the service they offer and the scenario they target. We will focus in particular on how information is collected, what information is collected, and to what extent these solutions can improve service optimization at the edge by taking into account the requirements of applications and the amount of available resources, both in the network and in the computing infrastructure. After this first study, we will move to a comparative study of the different monitoring approaches over a set of benchmark scenarios, based on criteria such as overhead, coverage and granularity.

Second, we will follow an experimental approach to benchmark the main monitoring solutions of mobile edge networks in realistic scenarios. Two wireless technologies could be considered at the edge: WiFi and 5G. In the first case, the study could be done on a network emulator such as Mininet-WiFi or with experiments on the R2lab wireless platform [9]; in the second case, the student could have the opportunity to run real experiments on our 5G SophiaNode platform [9]. We will evaluate the performance of these solutions (according to the above criteria), and how well the orchestration of resources at the edge can benefit from a larger spectrum of measurement data towards a better management of edge network resources and a better Quality of Experience / Quality of Service for end users. Throughout this experimental phase, we will make sure to introduce appropriate measurements for missing information about the network and the end systems, and to quantify the performance gain these additions bring.

The approach developed in this project has several interesting extensions that can be the subject of a PhD or an R&D engineering position later for excellent and motivated students, according to their profile.

The final objective of this project is to efficiently monitor the edge of future networks for a finer discovery and management of available resources and better quality of service/experience to end users.

Note: This work is proposed in the context of the Slices-RI European project [10]. SLICES is a flexible platform designed to support large-scale experimental research focused on networking protocols, radio technologies, services, data collection, parallel and distributed computing and, in particular, cloud and edge-based computing architectures and services.

References:

[1] Cheol-Ho Hong and Blesson Varghese. 2019. Resource Management in Fog/Edge Computing: A Survey on Architectures, Infrastructure, and Algorithms. ACM Comput. Surv. 52, 5, Article 97 (September 2020), 37 pages.

[2] Meenakshi Syamkumar, Paul Barford, and Ramakrishnan Durairajan. 2018. Deployment Characteristics of "The Edge" in Mobile Edge Computing. In Proceedings of the 2018 Workshop on Mobile Edge Communications (MECOMM'18).

[3] Breno Costa, João Bachiega, Leonardo Rebouças Carvalho, Michel Rosa, Aleteia Araujo, Monitoring fog computing: A review, taxonomy and open challenges, Computer Networks, Volume 215, 2022.

[4] Breno Costa, Joao Bachiega, Leonardo Rebouças de Carvalho, and Aleteia P. F. Araujo. 2022. Orchestration in Fog Computing: A Comprehensive Survey. ACM Comput. Surv. 55, 2, Article 29 (March 2023), 34 pages.

[5] Salman Taherizadeh, Andrew C. Jones, Ian Taylor, Zhiming Zhao, Vlado Stankovski, Monitoring self-adaptive applications within edge computing frameworks: A state-of-the-art review, Journal of Systems and Software, Volume 136, 2018.

[6] Alejandro Cartas, Martin Kocour, Aravindh Raman, Ilias Leontiadis, Jordi Luque, Nishanth Sastry, Jose Nuñez-Martinez, Diego Perino, and Carlos Segura. 2019. A Reality Check on Inference at Mobile Networks Edge. In Proceedings of the 2nd International Workshop on Edge Systems, Analytics and Networking (EdgeSys '19).

[7] Muhammad Jawad Khokhar, Thierry Spetebroot, Chadi Barakat, “A Methodology for Performance Benchmarking of Mobile Networks for Internet Video Streaming“, in proceedings of the 21st ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWIM), Montreal, Canada, October 2018.

[8] Ilir Murturi and Schahram Dustdar. 2022. DECENT: A Decentralized Configurator for Controlling Elasticity in Dynamic Edge Networks. ACM Trans. Internet Technol. 22, 3, Article 78 (August 2022), 21 pages.

[9] R2lab Wireless platform: https://r2lab.inria.fr

[10] Slices-RI European project: https://slices-ri.eu/

4 - Deployment of AETHER in the R2LAB experimental platform

Who?

• Name: Walid Dabbous, Thierry Parmentelat, Damien Saucez, Thierry Turletti

• Mail: Walid.Dabbous@inria.fr, Damien.Saucez@inria.fr, Thierry.Parmentelat@inria.fr, Thierry.Turletti@inria.fr

• Web pages:

• https://team.inria.fr/diana/walid-dabbous/, https://team.inria.fr/diana/damien-saucez/, https://parmentelat.github.io/, http://team.inria.fr/diana/thierry-turletti/

Where?

• Place of the project: Diana project-team, Inria centre at Université Côte d'Azur

• Address: 2004, route des lucioles, 06902 Sophia Antipolis, France

• Team: Diana

• Web page: http://team.inria.fr/diana/

Pre-requisites if any: DevOps experience is a plus.

Description:

With the densification of cells, Cloud RAN and Multi-Access Edge Computing (MEC) solutions are expected to be deployed to support very low latency and high reliability for new applications in 5G and beyond. Aether is an open source platform proposed by the Open Networking Foundation (ONF) to deploy private 5G-connected edges [aether]. This geographically decentralized platform includes a hybrid cloud instantiation of a mobile core network. It supports both enterprise-scale cellular communications and the deployment of containerized, low-latency edge computing applications.

In this internship, the student will deploy Aether on a real wireless platform composed of a Kubernetes cluster with real 5G radio devices for different MEC scenarios. The student will develop a complete stack to automatically deploy, configure, and monitor the infrastructure.

This project has several interesting extensions that can be the subject of a PhD or an R&D engineering position later for excellent and motivated students, according to their profile.

This work is proposed in the context of the Slices-RI European project [slices-ri]. SLICES is a flexible platform designed to support large-scale, experimental research focused on networking protocols, radio technologies, services, data collection, parallel and distributed computing and in particular cloud and edge-based computing architectures and services.

Links/Bibliography:

[aether] https://opennetworking.org/aether/

[aiab] https://docs.aetherproject.org/master/developer/aiab.html

[ksniff] https://github.com/eldadru/ksniff

[slices-ri] Slices-RI European project: https://slices-ri.eu/

[5G-vn] Bonati, Leonardo, et al. "Open, programmable, and virtualized 5G networks: State-of-the-art and the road ahead." Computer Networks 182 (2020): 107516.

[aether-testbed] Brassil, Jack. "Investigating Integrated Access and Backhaul on the Aether 5G Testbed." 2021 IEEE 4th 5G World Forum (5GWF). IEEE, 2021.

[slices] Fdida, Serge, et al. "SLICES, a scientific instrument for the networking community." Computer Communications 193 (2022): 189-203.

[cloud-ran-mec] Das, Sandip, Frank Slyne, and Marco Ruffini. "Optimal Slicing of Virtualized Passive Optical Networks to Support Dense Deployment of Cloud-RAN and Multi-Access Edge Computing." IEEE Network 36.2 (2022): 131-138.

5 - Performance analysis of AETHER in 5G cellular networks

Who?

• Name: Walid Dabbous, Thierry Parmentelat, Damien Saucez, Thierry Turletti

• Mail: Walid.Dabbous@inria.fr, Damien.Saucez@inria.fr, Thierry.Parmentelat@inria.fr, Thierry.Turletti@inria.fr

• Web pages:

• https://team.inria.fr/diana/walid-dabbous/, https://team.inria.fr/diana/damien-saucez/, https://parmentelat.github.io/, http://team.inria.fr/diana/thierry-turletti/

Where?

• Place of the project: Diana project-team, Inria centre at Université Côte d'Azur

• Address: 2004, route des lucioles, 06902 Sophia Antipolis, France

• Team: Diana

• Web page: http://team.inria.fr/diana/

Pre-requisites if any: good knowledge of 5G protocols is a plus

Description:

In this internship, the student will analyse the performance of Aether on a real wireless platform composed of a Kubernetes cluster with real 5G radio devices for different MEC scenarios, and optimise the infrastructure and software in order to reach multi-gigabit, sub-millisecond communications in the infrastructure.

This project has several interesting extensions that can be the subject of a PhD or an R&D engineering position later for excellent and motivated students, according to their profile.

Links/Bibliography:

[aether] https://opennetworking.org/aether/

[aiab] https://docs.aetherproject.org/master/developer/aiab.html

[ksniff] https://github.com/eldadru/ksniff

[slices-ri] Slices-RI European project: https://slices-ri.eu/

[5G-vn] Bonati, Leonardo, et al. "Open, programmable, and virtualized 5G networks: State-of-the-art and the road ahead." Computer Networks 182 (2020): 107516.

[aether-testbed] Brassil, Jack. "Investigating Integrated Access and Backhaul on the Aether 5G Testbed." 2021 IEEE 4th 5G World Forum (5GWF). IEEE, 2021.

[slices] Fdida, Serge, et al. "SLICES, a scientific instrument for the networking community." Computer Communications 193 (2022): 189-203.

6 - Evolution over time of the structure of social graphs: Evaluation of lockdown strategies.

Advisor: Frédéric Giroire and Nicolas Nisse

Emails: frederic.giroire@inria.fr, Nicolas.nisse@inria.fr

Laboratory: COATI project - INRIA (2004, route des Lucioles – Sophia Antipolis)

Web Site:

http://www-sop.inria.fr/members/Frederic.Giroire/

Pre-requisites if any:

Basics in probability and graph theory. Programming in python, C/C++ or java.

Description:

The goal of the project is to develop methods to analyse the evolution over time of a social network.

In the paper [1], the authors propose a model of random networks to study the impact of several types of lockdown policies during the COVID pandemic. The experiments built for France show that completely closing medium- and long-distance travel to slow down the spread of a random walk is more efficient than local restrictions. The goal of the final-year project (PFE) will be to extend this work to larger countries, possibly to the world. More precisely, the first step will be to understand the model presented in [1] and the state of the art (for example [2]). The second step will be the implementation of a graph generator following the model. We will study the properties of the generated graphs, e.g. degree distribution, clustering, distances and diameter. The last step will be to analyse the different lockdown policies at a scale larger than France.
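To make the graph properties listed above concrete, here is a minimal Python sketch that computes the degree distribution, the average clustering coefficient and the diameter of a graph. The generator used here is a plain Erdős–Rényi G(n, p) graph chosen only as a stand-in; it is not the model of [1], and the function names and parameter values are assumptions made for illustration.

```python
import random
from collections import Counter, deque
from itertools import combinations


def random_graph(n, p, seed=0):
    """Erdos-Renyi G(n, p) graph as an adjacency dict (stand-in generator)."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def degree_distribution(adj):
    """Map each degree to the number of vertices having that degree."""
    return Counter(len(neighbours) for neighbours in adj.values())


def average_clustering(adj):
    """Mean local clustering coefficient (0 for vertices of degree < 2)."""
    total = 0.0
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def bfs_distances(adj, source):
    """Shortest-path distances (in hops) from source to every reachable vertex."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in distances:
                distances[w] = distances[u] + 1
                queue.append(w)
    return distances


def diameter(adj):
    """Largest shortest-path distance over all mutually reachable pairs."""
    return max(max(bfs_distances(adj, v).values()) for v in adj)


graph = random_graph(n=200, p=0.05)
print(degree_distribution(graph).most_common(3))
print(average_clustering(graph))
print(diameter(graph))
```

Running one breadth-first search per vertex is fine for small graphs; at the scale of a country or the world, sampling sources or using a dedicated graph library would be needed.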

This work is part of a larger project studying social networks and their evolutions, see for example [3,4,5].

The internship may be followed by a PhD for interested students.

References.

[1] Chatterji, I., & Lawson, A. (2021). Horospherical random graphs. arXiv preprint arXiv:2112.03535. https://arxiv.org/pdf/2112.03535.pdf

[2] Mauras, S., Cohen-Addad, V., Duboc, G., Dupré la Tour, M., Frasca, P., Mathieu, C., ... & Viennot, L. (2021). Mitigating COVID-19 outbreaks in workplaces and schools by hybrid telecommuting. PLoS computational biology, 17(8), e1009264.

[3] Thibaud Trolliet. Study of the properties and modeling of complex social graphs. Social and Information Networks [cs.SI]. Université Côte d'Azur, 2021. English. https://tel.archives-ouvertes.fr/tel-03468769/document

[4] Frédéric Giroire, Nicolas Nisse, Thibaud Trolliet, Malgorzata Sulkowska. Preferential attachment hypergraph with high modularity. [Research Report] Université Cote d'Azur. 2021. https://hal.inria.fr/hal-03154836

[5] Frédéric Giroire, Nicolas Nisse, Kostiantyn Ohulchanskyi, Malgorzata Sulkowska, Thibaud Trolliet. Preferential attachment hypergraph with vertex deactivation. [Research Report] Inria - Sophia antipolis; UCA, I3S. 2022. https://hal.inria.fr/hal-03655631

7 - Provisioning Network Services for 6G networks using Machine Learning

Advisor: Frédéric Giroire and Stéphane Pérennes

Emails: frederic.giroire@inria.fr,

Laboratory: COATI project - INRIA (2004, route des Lucioles – Sophia Antipolis)

Web Site:

http://www-sop.inria.fr/members/Frederic.Giroire/

Pre-requisites if any:

Knowledge and/or taste in networking and/or machine learning.

Description:

With the advent of next-generation networks implementing Software Defined Networking and Network Function Virtualization, network services can be set up dynamically at the right time and at the right place. As networks are now everywhere with the Internet of Things, connected cars and cities, the number of network services to be deployed increases drastically. There is thus a need for new methods to provision them on the fly, which should be scalable and fast. During the project, we will explore how to use new machine learning methods. More specifically, we will study how to apply the methods proposed in [1] for solving algorithmic problems on billion-sized graphs to the provisioning of network services. Indeed, placing services can be seen as a covering problem [2].
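To illustrate why service placement can be read as a covering problem, the Python sketch below picks nodes hosting a service until every demand is served, using the classical greedy set-cover heuristic. This is a textbook baseline and not the learning-based method of [1] nor the algorithms of [2]; the node names, demand names and coverage sets are hypothetical.

```python
def greedy_cover(demands, coverage):
    """Pick nodes greedily until every demand is covered.

    demands:  set of demand identifiers that must be served.
    coverage: dict mapping a candidate node to the set of demands it can serve.
    Returns the list of chosen nodes, in the order they were selected.
    """
    uncovered = set(demands)
    chosen = []
    while uncovered:
        # Node serving the most still-uncovered demands (ties: smallest name).
        best = max(sorted(coverage), key=lambda node: len(coverage[node] & uncovered))
        gained = coverage[best] & uncovered
        if not gained:
            raise ValueError(f"demands cannot be covered: {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical instance: four candidate nodes, five flows to serve.
coverage = {
    "n1": {"f1", "f2", "f3"},
    "n2": {"f3", "f4"},
    "n3": {"f4", "f5"},
    "n4": {"f1", "f5"},
}
demands = {"f1", "f2", "f3", "f4", "f5"}
print(greedy_cover(demands, coverage))  # ['n1', 'n3']
```

The greedy rule is fast but only approximate, which is one reason why learned heuristics that scale to very large graphs are worth studying.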

The internship may be followed by a PhD for interested students.

References.

[1] Manchanda, S., Mittal, A., Dhawan, A., Medya, S., Ranu, S., & Singh, A. (2020). Gcomb: Learning budget-constrained combinatorial algorithms over billion-sized graphs. Advances in Neural Information Processing Systems NIPS, 33, 20000-20011.

[2] Tomassilli, A., Giroire, F., Huin, N., & Pérennes, S. (2018, April). Provably efficient algorithms for placement of service function chains with ordering constraints. In IEEE INFOCOM 2018-IEEE Conference on Computer Communications (pp. 774-782). IEEE.

8 - Simulation framework for oneM2M IoT standard

Who?

Name: Peraldi Marie-agnes / Liquori Luigi

Mail: map@unice.fr, Luigi.Liquori@inria.fr

Web page: https://www.i3s.unice.fr/~map/ https://luigiliquori.wixsite.com/atinria

Where?

Place of the project: INRIA – Kairos team

Address: INRIA – Sophia Méditerranée

Team: Kairos

Web page: https://team.inria.fr/kairos/

What? Simulation framework for oneM2M IoT standard

Pre-requisites if any:

Interest in

- Modelling distributed systems

- Concepts of discrete time vs simulation time

- Java/eclipse knowledge is welcome

Keywords: IoT deployment, specification language, temporal behavior, scenario, oneM2M, OMNeT++ simulation

Detailed description:

Context

The Kairos team is highly involved in ETSI standardization bodies around the topic of IoT systems modeling and simulation. oneM2M, the global standard initiative for M2M communications and the IoT, is now mature, and multiple deployments exist all over the world at both experimental and operational levels. We focus on the specific concerns of performance evaluation of the deployments related to the standard and their implementation:

- What is the best topology to deploy an IoT infrastructure/application in terms of gateways and servers?

- How many devices can be connected to a given IoT gateway with a certain network technology?

- Will a given IoT platform fulfill the constraints of a targeted application?

- Is it possible to have an approximation of the response time of an IoT application when using a given oneM2M IoT platform?

- What will be the behavior of a target system if the number of devices is doubled or if the data requests of the applications increase by x% next year?

- Etc.

These are classical questions that a telecom operator has to answer about its network at the design and planning phase. Answering them requires models and tools to simulate or emulate a oneM2M platform within a targeted ecosystem, a characterization of the IoT application, a test/simulation environment, and the extraction of adequate quality-of-service metrics.

Objective of the internship

The internship is defined in the context of an ETSI Testing Task Force (TTF) on Performance Evaluation and Analysis of oneM2M Planning and Deployment. This task force started in November for a duration of 2 years. Kairos participates in this task force.

The objective is to define a data model and associated behavioral model to characterize an IoT distributed application, the targeted platform, and the deployment scenarios of the application on the platform.

The Kairos team is largely involved in temporal models for highly constrained cyber-physical systems such as in avionics or in the automotive domain. The DSL (Domain Specific Language) approach (Gemoc) using formal methods is one of the approaches developed in the team to specify and verify the behavior of these critical systems.

We want to experiment with this DSL approach to characterize IoT applications, platforms and their deployment.

To do so, the student will draw inspiration from existing work in the field and propose, at least for distributed IoT applications, a temporal DSL from which an executable model can later be produced. The execution target is the OMNeT++ discrete-event simulator.

Thus, the student will:

- Review the literature on temporal DSLs and their use in application deployment [1] [2], etc.

- Identify the artifacts useful to this DSL for the IoT domain [3], and develop these artifacts in a oneM2M DSL with GEMOC [5].

- Identify the KPIs (Key Performance Indicators) that can be deduced from an analysis of this deployment [4], etc.

- Produce an OMNeT++ simulator able to simulate the infrastructure and applications developed with the DSL.
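To give an idea of what a discrete-event simulation of such a deployment computes, the following Python sketch simulates devices sending periodic requests to a single gateway and reports the mean response time. It is only an illustration of the principle: the actual target of the internship is the OMNeT++ simulator driven by the DSL, and the single-server FIFO gateway model, the function name and all parameters here are hypothetical simplifications.

```python
import heapq


def simulate_gateway(n_devices, period, service_time, horizon):
    """Discrete-event simulation of periodic requests to one FIFO gateway.

    n_devices:    number of devices, each sending one request every `period`.
    service_time: time the gateway needs to process one request.
    horizon:      requests arriving at or after this time are not simulated.
    Returns the mean response time (waiting + service) of the served requests.
    """
    # Future event list ordered by time: (arrival_time, device_id).
    # First arrivals are spread evenly over one period.
    events = [(dev * period / n_devices, dev) for dev in range(n_devices)]
    heapq.heapify(events)
    gateway_free_at = 0.0
    response_times = []
    while events:
        arrival, dev = heapq.heappop(events)
        if arrival >= horizon:
            continue  # past the horizon: drop and do not reschedule
        start = max(arrival, gateway_free_at)  # wait if the gateway is busy
        gateway_free_at = start + service_time
        response_times.append(gateway_free_at - arrival)
        # Schedule the next request of this device.
        heapq.heappush(events, (arrival + period, dev))
    return sum(response_times) / len(response_times)


# Lightly loaded gateway versus overloaded gateway (hypothetical values).
print(simulate_gateway(n_devices=10, period=1.0, service_time=0.05, horizon=60.0))
print(simulate_gateway(n_devices=30, period=1.0, service_time=0.05, horizon=60.0))
```

Sweeping `n_devices` in such a model is one simple way to approach questions like how many devices a given gateway can sustain before the response time degrades.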

Preliminary work was already conducted by a student last year, and a preliminary POC (proof of concept) has been developed.

The idea is to go deeper in this work and extend it to the oneM2M artifacts not yet studied.

References: set of bibliographical references (article, books, white papers, etc) to be read by the student before starting to work on this subject

[1] Jörg Holtmann, Julien Deantoni, Markus Fockel. Early timing analysis based on scenario requirements and platform models. Software and Systems Modeling, Springer Verlag, In press. hal-033750 https://hal.inria.fr/hal-03375049/file/Early_Timing_Analysis_based_on_Scenario_Requirements_and_Platform_Models.pdf

[2] A. Goknil, M-A Peraldi-Frati. A DSL for Specifying Timing Requirements. In Model-Driven Requirement Workshop of the RE 2012 conference, Chicago, USA, Sep 2012, MoDRE Proceedings. https://orbilu.uni.lu/bitstream/10993/12561/1/Modre2012_V11.pdf

[3] Luigi Liquori, Marie-Agnès Peraldi-Frati, Andrea Cimmino, Seung Myeong Jeong, Joachim Koss et al. SmartM2M; Study for oneM2M Discovery and Query use cases and requirements. ETSI SmartM2M Working Group, 2019. https://hal.inria.fr/hal-03115482

[4] TS 103716: ETSI SmartM2M; oneM2M Discovery and Query solution(s) simulation and performance evaluation, https://hal.inria.fr/hal-03261059

[5] GEMOC Studio website: https://projects.eclipse.org/projects/modeling.gemoc

9 - Modeling the Carbon Footprint of Media Streaming

Who?

Name: Joanna Moulierac

Mail: joanna.moulierac@univ-cotedazur.fr

Web page:

http://www-sop.inria.fr/members/Joanna.Moulierac/

Name: Guillaume Urvoy-Keller

Mail: guillaume.urvoy-keller@univ-cotedazur.fr

Web page: https://www.i3s.unice.fr/~urvoy/

Where?

Place of the project: INRIA

Address: 2004, route des Lucioles

B.P. 93 - F-06902 Sophia Antipolis Cedex

Team: Coati (I3S/INRIA) and SigNet (I3S)

Web page: https://team.inria.fr/coati/ and https://signet.i3s.univ-cotedazur.fr/

Pre-requisites if any:

Description:

Video, especially streaming, represents the majority of the Internet traffic and drives a significant share of the investment in terms of data centers and Internet links. Some researchers have proposed methods to estimate the carbon footprint of a typical streaming service [1,5,7,9]. Reference [1] is a good entry point. They rely on a classical three-tier approach where one associates a cost to the end user equipment, the network (access and core) and the data center. They not only consider the usage cost, i.e. the electricity cost (later translated into eCO2 as a function of the energy mix of the considered country), but also the production cost of the equipment. The latter can be seen as a debt that is divided by the number of years during which the device (router, server, smartphone) is used. The production cost can represent the majority of the footprint if the lifetime of the equipment is short, as is typically the case for a smartphone.
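The three-tier accounting described above can be sketched in a few lines of Python: for each tier, the usage cost is the electricity consumed times the carbon intensity of the energy mix, and the production cost is the yearly share of the production "debt" prorated to the duration of the session. This is only a simplified reading of the approach, not the exact model of [1]; the class, the function names and every numeric value below are hypothetical placeholders, and the sharing of network and data center equipment between users is ignored.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    power_w: float             # average power drawn during streaming (W)
    embodied_kgco2e: float     # production footprint of the equipment (kg CO2e)
    lifetime_years: float      # years over which production is amortised
    use_hours_per_year: float  # hours per year the equipment is in use


def usage_footprint(tier, hours, grid_kgco2e_per_kwh):
    """Footprint of the electricity consumed during the session (kg CO2e)."""
    energy_kwh = tier.power_w / 1000.0 * hours
    return energy_kwh * grid_kgco2e_per_kwh


def production_footprint(tier, hours):
    """Share of the production footprint attributed to the session (kg CO2e)."""
    yearly_debt = tier.embodied_kgco2e / tier.lifetime_years
    return yearly_debt * hours / tier.use_hours_per_year


def session_footprint(tiers, hours, grid_kgco2e_per_kwh):
    """Total footprint of a streaming session over all tiers (kg CO2e)."""
    return sum(
        usage_footprint(t, hours, grid_kgco2e_per_kwh) + production_footprint(t, hours)
        for t in tiers
    )


# Placeholder values for illustration only, not measured data.
tiers = [
    Tier("end-user device", power_w=5.0, embodied_kgco2e=60.0,
         lifetime_years=3.0, use_hours_per_year=1000.0),
    Tier("network", power_w=10.0, embodied_kgco2e=100.0,
         lifetime_years=5.0, use_hours_per_year=8760.0),
    Tier("data center", power_w=20.0, embodied_kgco2e=500.0,
         lifetime_years=5.0, use_hours_per_year=8760.0),
]
print(session_footprint(tiers, hours=1.0, grid_kgco2e_per_kwh=0.06))
```

Halving `lifetime_years` for a tier doubles its production share, which is the effect mentioned above for short-lived equipment such as smartphones.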

The objective of this internship is manifold:

Dissect the methodology used in [1] and the other papers in this domain to precisely understand the underlying hypotheses and also the sources of data used for the computation. For instance, the energy efficiency of each new generation of equipment increases, which reduces the usage cost, albeit at the expense of the production cost.

Understand the quite wide variations among the existing results. This might be related to the year in which the model is proposed since the energy efficiency increases steadily in ICT. This might also be related to the hypotheses made on the datacenter usage (dedicated, shared with other services) or if the production costs are taken into account or not.

Extend the previous study by taking into account for a streaming session, if possible:

Which Streaming Platform is used (dedicated, shared),

Bandwidth used, i.e. quality levels,

Wired or 4G 5G Wifi connection,

Perceived user quality,

Time of viewing during the day,

Distance to the data center,

Number of people watching (popular video),

Equipment renewal rate,

Equipment on which we watch (computer, telephone, tablet?) including its OS

For this part, the student can consider different tools to estimate the energy efficiency of a streaming session, such as wattmeters, CPU consumption, the Carbonalyser plugin, etc.

Adapt the methodology to the case of a country like France for which, recently, a number of studies have provided up-to-date information for typical and recent networks [2] and data centers (both the servers [3] and the storage [4]) on which we could rely to exemplify the case of streaming.

Deploy a simple testbed consisting of a streaming server (e.g. ffmpeg), a client and an interconnection network, on which we could make direct power consumption measurements using wattmeters and test different load factors (number of clients in parallel, video resolutions, etc.).
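For the testbed measurements, wattmeter readings have to be turned into an energy figure per streaming session. The sketch below assumes the wattmeter exports time-stamped power samples as `(seconds, watts)` pairs. The function names and the idle-baseline subtraction are illustrative assumptions, not a prescribed protocol.

```python
def energy_wh(samples):
    """Integrate (timestamp_s, power_w) samples with the trapezoidal rule.

    `samples` must be sorted by timestamp; the result is in watt-hours.
    """
    joules = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        joules += 0.5 * (p0 + p1) * (t1 - t0)
    return joules / 3600.0


def marginal_energy_per_client_wh(loaded, idle, n_clients):
    """Energy attributable to each client: loaded run minus an idle run
    of the same duration, divided by the number of parallel clients."""
    return (energy_wh(loaded) - energy_wh(idle)) / n_clients


# Hypothetical readings: 100 s idle at 30 W, then 100 s serving 4 clients at 36 W.
idle_run = [(0, 30.0), (50, 30.0), (100, 30.0)]
loaded_run = [(0, 36.0), (50, 36.0), (100, 36.0)]
print(marginal_energy_per_client_wh(loaded_run, idle_run, n_clients=4))
```

Subtracting an idle run is one simple way to separate the load-dependent part of the consumption from the fixed part; other attribution rules are possible and lead to different per-client figures.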

Useful Information/Bibliography:

[1] Stephen Makonin, Laura U. Marks, Radek Przedpelski, Alejandro Rodriguez-Silva, Ramy ElMallah. Calculating the Carbon Footprint of Streaming Media: Beyond the Myth of Efficiency. LIMITS: Eighth Workshop on Computing within Limits 2022.

[2] Marion Ficher, Francoise Berthoud, Anne-Laure Ligozat, Patrick Sigonneau, Maxime Wissle, Badis Tebbani: Assessing the carbon footprint of the data transmission on a backbone network. ICIN 2021: 105-109

[3] Francoise Berthoud, Bruno Bzeznik, Nicolas Gibelin, Myriam Laurens, Cyrille Bonamy et al. Estimation de l'empreinte carbone d'une heure.coeur de calcul [Rapport de recherche] UGA - Université Grenoble Alpes; CNRS; INP Grenoble; INRIA. 2020

[4] Guillaume Charret, Alexis Arnaud, Francoise Berthoud, Bruno Bzeznik, Anthony Defize et al. Estimation de l'empreinte carbone du stockage de donnees [Rapport de recherche] CNRS - GRICAD. 2020

[5] https://www.iea.org/commentaries/the-carbon-footprint-of-streaming-video-fact-checking-the-headlines

[6] Oche Ejembi and Saleem N Bhatti. 2015. Client-side Energy Costs of Video Streaming. In 2015 IEEE International Conference on Data Science and Data Intensive Systems. IEEE, Sydney, Australia, 252–259

[7] Laura U. Marks, et al. 2021. Tackling the Carbon Footprint of Streaming Media: Updated October 2021. Technical Report. Simon Fraser University.

[8] Arman Shehabi, Ben Walker, and Eric Masanet. 2014. The energy and greenhouse-gas implications of internet video streaming in the United States. Environmental Research Letters 9, 5 (2014), 054007

[9] Stephens, A., Tremlett-Williams, C., Fitzpatrick, L., Acerini, L., Anderson, M., & Crabbendam, N. (2021). Carbon impact of video streaming. https://policycommons.net/artifacts/2387662/carbon-impact-of-video-streaming/3408674/

10- Cooperative Machine Learning Inference

Who?

Name: Giovanni Neglia, Alain Jean-Marie

Mail: giovanni.neglia@inria.fr, alain.jean-marie@inria.fr

Web page: http://www-sop.inria.fr/members/Giovanni.Neglia/

Where?

Place of the project: Inria

Address: 2004 route des Lucioles, 06902 Sophia Antipolis

Team: NEO team

Web page: https://team.inria.fr/neo/

Pre-requisites if any: The ideal candidate should like math and analytical reasoning and have strong programming skills. A background in machine learning would be a plus.

Description:

An increasing number of applications rely on complex inference tasks that are based on machine learning (ML). Currently, there are two options to run such tasks: either they are served directly by the end device (e.g., smartphones, IoT equipment, smart vehicles), or they are offloaded to a remote cloud. Both options may be unsatisfactory for many applications: local models may have inadequate accuracy, while the cloud may fail to meet delay constraints. In [1], we presented the novel idea of inference delivery networks (IDNs), networks of computing nodes that coordinate to satisfy ML inference requests while achieving the best trade-off between latency and accuracy. IDNs bridge the dichotomy between device and cloud execution by integrating inference delivery at the various tiers of the infrastructure continuum (access, edge, regional data center, cloud).

Nodes with heterogeneous capabilities can store a set of monolithic machine learning models with different computational/memory requirements and different accuracy, and inference requests can be forwarded to other nodes if the local answer is not considered accurate enough.
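The forwarding behaviour described above can be pictured with the toy cascade below. It is only a sketch under simplifying assumptions, not the allocation policy of [1]: each node exposes a hypothetical `model(request)` callable returning a label and a confidence score, the nodes are visited in a fixed order from device to cloud, and a fixed confidence threshold decides whether to forward.

```python
def serve(request, nodes, threshold=0.8):
    """Answer `request` at the first node whose confidence reaches `threshold`.

    `nodes` is a list of (name, model, latency_ms) tuples ordered from the
    end device towards the cloud. Returns (label, confidence, node, latency).
    If no node is confident enough, the answer of the last node is returned.
    """
    total_latency = 0.0
    answer = None
    for name, model, latency_ms in nodes:
        total_latency += latency_ms          # cost of reaching and running this node
        label, confidence = model(request)
        answer = (label, confidence, name, total_latency)
        if confidence >= threshold:          # local answer judged accurate enough
            break
    return answer


# Hypothetical models: a small on-device model and a larger cloud model.
def small_model(request):
    return ("cat", 0.60)


def large_model(request):
    return ("dog", 0.95)


nodes = [("device", small_model, 5.0), ("cloud", large_model, 80.0)]
print(serve("image-001", nodes, threshold=0.8))
```

Raising the threshold trades latency for accuracy: more requests travel up the hierarchy and accumulate the latency of each hop.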

In this project, we want to explore the possibility of enlarging the set of actions available to nodes in an inference delivery network beyond simple inference forwarding, by allowing models to be split across multiple nodes [3,4] and/or inferences from different nodes to be suitably combined to improve their quality. In particular, we aim to compare specific model splitting techniques, with or without the insertion of bottlenecks [2], in terms of performance metrics such as inference delay and network load. We will evaluate different methodologies to estimate online the quality of an inference [5], and propose distributed bagging algorithms to combine inferences from different models [6-9].
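One simple way to combine inferences from several models, in the spirit of the ensemble methods cited above, is to average their predicted class probabilities (soft voting). The sketch below is only a baseline for illustration: the function name `combine` and the optional per-model weights are assumptions, and the distributed algorithms targeted by the project may differ substantially.

```python
def combine(prob_vectors, weights=None):
    """Average class-probability vectors coming from different models.

    `prob_vectors` is a list of equal-length lists, one per model.
    `weights` optionally gives more importance to some models
    (e.g. better calibrated ones). Returns (averaged vector, predicted class).
    """
    if weights is None:
        weights = [1.0] * len(prob_vectors)
    total = sum(weights)
    n_classes = len(prob_vectors[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, prob_vectors)) / total
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=averaged.__getitem__)
    return averaged, predicted


# Three hypothetical nodes answering the same binary request.
averaged, predicted = combine([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]])
print(averaged, predicted)
```

Averaging is only meaningful if the probabilities reported by the different models are comparable, which is one reason why calibration [5] matters in this setting.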

This research topic can lead to a PhD position. We are therefore looking for students with a strong motivation to pursue a research career.

Useful Information/Bibliography:

[1] T. Si Salem, G. Castellano, G. Neglia, F. Pianese and A. Araldo,

Towards Inference Delivery Networks: Distributing Machine Learning with

Optimality Guarantees, 19th Mediterranean Communication and Computer

Networking Conference (MedComNet), 2021, an extended version is under

submission to IEEE/ACM Trans. on Networking.

[2] G. Castellano, F. Pianese, D. Carra, T. Zhang, G. Neglia,

Regularized Bottleneck with Early Labeling, Proceedings of IEEE ITC,

Shenzhen, China, Sept. 2022

[3] Surat Teerapittayanon, Bradley McDanel, and Hsiang-Tsung Kung.

Branchynet: Fast inference via early exiting from deep neural networks.

In 2016 23rd International Conference on Pattern Recognition (ICPR),

pages 2464–2469. IEEE, 2016

[4] Yoshitomo Matsubara, Marco Levorato, and Francesco Restuccia. 2022.

Split Computing and Early Exiting for Deep Learning Applications: Survey

and Research Challenges. ACM Comput. Surv. Just Accepted (March 2022)

[5] Bella, Antonio, et al. "Calibration of machine learning models."

Handbook of Research on Machine Learning Applications and Trends:

Algorithms, Methods, and Techniques. IGI Global, 2010. 128-146.

[6] Leo Breiman. Bagging predictors. Machine Learning, 24(2):123–140,

Aug 1996. ISSN 1573-0565. doi: 10.1023/A:1018054314350. URL

https://doi.org/10.1023/A:1018054314350.

[7] David H. Wolpert. Original contribution: Stacked generalization.

Neural Netw., 5(2):241–259, February 1992

[8] Robert E. Schapire. A brief introduction to boosting. In Proceedings

of the 16th International Joint Conference on Artificial Intelligence -

Volume 2, IJCAI’99, pages 1401–1406, San Francisco, CA, USA, 1999.

Morgan Kaufmann Publishers Inc.

[9] Singh, S. P. and Jaggi, M. Model fusion via optimal transport.

NeurIPS - Advances in Neural Information Processing Systems, 33, 2020.

11- Incentives to federated learning

Who?

Name: Giovanni Neglia, Alain Jean-Marie

Mail: giovanni.neglia@inria.fr, alain.jean-marie@inria.fr

Web page: http://www-sop.inria.fr/members/Giovanni.Neglia/

Where?

Place of the project: Inria

Address: 2004 route des Lucioles, 06902 Sophia Antipolis

Team: NEO team

Web page: https://team.inria.fr/neo/

Pre-requisites if any: The ideal candidate should like math and analytical reasoning and have strong programming skills. A background in optimization or machine learning would be a plus.

Description:

The increasing size of data generated by smartphones and IoT devices has motivated the development of Federated Learning (FL) [1,2], a framework for on-device collaborative training of machine learning models. FL algorithms like FedAvg [3] allow clients to train a common global model without sharing their personal data; FL reduces data collection costs and protects clients' data privacy.
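To make the idea concrete, here is a deliberately tiny sketch of a FedAvg-style training loop on a one-parameter linear model y = w·x. It follows the general principle of [3] (local gradient steps on each client, then a server-side average weighted by local dataset size), but the model, learning rate, number of rounds and data are hypothetical simplifications; in particular, all clients participate in every round.

```python
def local_update(w, data, lr=0.01, epochs=5):
    """Run a few gradient-descent steps on one client's data (squared loss)."""
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def fedavg(clients, rounds=50, w0=0.0):
    """Train a shared parameter w without moving raw data off the clients.

    `clients` is a list of local datasets, each a list of (x, y) pairs.
    Only model parameters travel between clients and server.
    """
    w = w0
    n_total = sum(len(data) for data in clients)
    for _ in range(rounds):
        # Each client starts from the current global model and trains locally.
        local_models = [local_update(w, data) for data in clients]
        # The server averages local models, weighted by local dataset size.
        w = sum(len(data) * wk for data, wk in zip(clients, local_models)) / n_total
    return w


# Two hypothetical clients whose data follow y = 2x.
clients = [[(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)], [(1.0, 2.0), (2.0, 4.0)]]
print(fedavg(clients))
```

When the clients' data come from different distributions, the averaged model can be a poor fit for some of them, which is the situation the rest of this description addresses.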

At the same time, clients' local datasets may be drawn from different distributions and the global model may be unsatisfactory for a given client, who may then prefer to train a local model autonomously. This issue is mitigated by new FL algorithms that enable model personalization at the client level [4,5]. In order to prevent clients' defection, it is also possible to incentivize clients' participation.

The goal of this research project is to survey the different approaches to promoting clients' participation in FL training, ranging from game-theoretic studies [6-8] and incentives for clients to contribute data and computation resources [9-10] to personalization approaches [4,5,12,13] and new approaches that explicitly maximize the fraction of clients incentivized to use the global model [14].

This research topic can lead to a PhD position. We are therefore looking for students with a strong motivation to pursue a research career.

Useful Information/Bibliography:

[1] Tian Li, Anit Kumar Sahu, Ameet Talwalkar, and Virginia Smith.

Federated learning: Challenges, methods, and future directions. IEEE

Signal Processing Magazine, 37(3):50–60, 2020.

[2] Peter Kairouz, H Brendan McMahan, Brendan Avent, Aurelien Bellet,

Mehdi Bennis, Arjun Nitin Bhagoji, Keith Bonawitz, Zachary Charles,

Graham Cormode, Rachel Cummings, et al. Advances and open problems in

federated learning. arXiv preprint arXiv:1912.04977, 2019.

[3] Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and

Blaise Aguera y Arcas. Communication-efficient learning of deep networks

from decentralized data. In Artificial Intelligence and Statistics,

pages 1273–1282. PMLR, 2017.

[4] Othmane Marfoq, Giovanni Neglia, Aurelien Bellet, Laetitia Kameni,

and Richard Vidal. Federated multi-task learning under a mixture of

distributions, NeurIPS 2021

[5] Othmane Marfoq, Giovanni Neglia, Laetitia Kameni, Richard Vidal,

Personalized Federated Learning through Local Memorization, ICML 2022.

[6] Xuezhen Tu, Kun Zhu, Nguyen Cong Luong, Dusit Niyato, Yang Zhang,

and Juan Li. Incentive mechanisms for federated learning: From economic

and game theoretic perspective. arXiv preprint arXiv:2111.11850, 2021.

[7] Kate Donahue and Jon Kleinberg. Model-sharing games: Analyzing

federated learning under voluntary participation. In The Thirty-Fifth

AAAI Conference on Artificial Intelligence (AAAI-21), 2021.

[8] Kate Donahue and Jon Kleinberg. Optimality and stability in

federated learning: A game-theoretic approach. In Advances in Neural

Information Processing Systems, 2021.

[9] Avrim Blum, Nika Haghtalab, Richard Lanas Phillips, and Han Shao.

One for one, or all for all: Equilibria and optimality of collaboration

in federated learning. In International Conference on Machine Learning,

2021.

[10] Jingoo Han, Ahmad Faraz Khan, Syed Zawad, Ali Anwar, Nathalie

Baracaldo Angel, Yi Zhou, Feng Yan, and Ali R. Butt. Tokenized incentive

for federated learning. In Proceedings of the Federated Learning

Workshop at the Association for the Advancement of Artificial

Intelligence (AAAI) Conference, 2022.

[11] Jiawen Kang, Zehui Xiong, Dusit Niyato, Han Yu, Ying-Chang Liang,

and Dong In Kim. Incentive design for efficient federated learning in

mobile networks: A contract theory approach. In 2019 IEEE

VTS Asia Pacific Wireless Communications Symposium (APWCS), 2019.

[12] Valentina Zantedeschi, Aurelien Bellet, and Marc Tommasi. Fully

decentralized joint learning of personalized models and collaboration

graphs. volume 108 of Proceedings of Machine Learning Research, pages

864-874, 2020. PMLR.

[13] Tian Li, Shengyuan Hu, Ahmad Beirami, and Virginia Smith. Ditto:

Fair and robust federated learning through personalization. In

International Conference on Machine Learning, pages 6357-6368. PMLR, 2021.

[14] Yae Jee Cho, Divyansh Jhunjhunwala, Tian Li, Virginia Smith, Gauri

Joshi, To Federate or Not To Federate: Incentivizing Client

Participation in Federated Learning, arXiv:2205.14840

12- Cooperative Machine Learning Inference

Who?

Name: Giovanni Neglia, Tahar Nabil, Richard Niamke (EDF Research Engineers)

Mail: giovanni.neglia@inria.fr, tahar.nabil@edf.fr, richard.niamke@edf.fr

Web page: http://www-sop.inria.fr/members/Giovanni.Neglia/

Where?

Host unit: SOAD Group (Statistics and Decision Support Tools), SEQUOIA

department of EDF Lab

Paris-Saclay, 7 boulevard Gaspard Monge, 91120 Palaiseau. Desired start:

as soon as possible in 2023.

Pre-requisites if any:

- Desire to continue the work as part of a CIFRE EDF thesis.

- Areas of expertise: optimization, machine learning, statistics.

- Good knowledge of neural networks and programming in Python.

- Good knowledge of federated learning and its applications would be a plus.

- Team spirit, writing skills, dynamism and motivation.

Context

EDF R&D (1,800 researchers) has as its main missions to contribute to improving the performance of the EDF Group's operational units and to identify and prepare medium- and long-term growth drivers. In this context, the multidisciplinary department SEQUOIA (Services, Economy, Human Questions, Innovative tools and AI) combines data science, AI, human and social sciences, and economics to support the development and delivery of offers, services and tools to the Group's operational departments. Within this department, the intern will be attached to the SOAD group (Statistics and Decision Support Tools), which has around twenty research engineers specialized in data science, data engineering, decision-making computing and text mining, and whose mission is to build and implement methods for the analysis, search and enrichment of large volumes of data of multiple, structured or complex origins. The intern will be required to interact and work in a collaborative framework with other researchers working on issues common to the EDF Group.

Goals

Federated learning (FL) is a decentralized paradigm for training a statistical model through the collaboration of different client nodes under the orchestration of a central server. Because the data are not centralized, this framework structurally offers better data protection and governance [1]. However, to date, FL algorithms have rarely been deployed in the energy sector, and their advantages and possible limits for an energy company such as EDF remain to be better understood. This is the subject of this internship. The goals will be:

1. Initially, to carry out a literature review covering both (i) federated learning, by becoming familiar with the classic approaches and the main challenges [2-3], and (ii) existing applications to issues in the energy sector (in particular on the following subjects: Smart Home, Smart Grid, demand or production forecasting; this list is not exhaustive).

2. Then, to apply state-of-the-art federated learning algorithms to EDF data (time series, sensor data, etc.). The trainee will focus in particular on characterizing the use case(s) from an FL point of view and estimating the performance gap compared to the centralized case.

3. Finally, to identify the main scientific obstacles posed by the application of FL to the data of an energy company such as EDF (heterogeneity of data, vertical learning, guarantees of privacy, etc.), so as to conclude, where appropriate, by formulating a research problem that could lead to the launch of a doctoral thesis.

Useful Information/Bibliography:

[1] CNIL, "Chacun chez soi et les données seront bien gardées", 2022.

[2] McMahan B., Moore E., Ramage D., Hampson S. and Arcas B. A., "Communication-efficient learning of deep networks from decentralized data", Proc. of the 20th Int. Conf. on Artificial Intelligence and Statistics, PMLR 54:1273-1282, 2017.

[3] Kairouz P., McMahan B., Avent B., Bellet A., Bennis M., et al.,

"Advances and Open Problems in Federated Learning",

Foundations and Trends in Machine Learning, Now Publishers, 4 (1-2),

pp.1-210, 2021.

- Federated Multi-Task Learning under a Mixture of Distributions, Marfoq

O., Neglia G., Bellet A.,

Kameni L. and Vidal R., in Proc. of the Thirty-fifth Conference on

Neural Information Processing

Systems (NeurIPS 2021), 2021. https://arxiv.org/abs/2108.10252

- Efficient Passive Membership Inference Attack in Federated Learning,

Zari, O., Xu C. and Neglia G.,

in NeurIPS workshop on Privacy in Machine Learning (PriML), 2021.

https://arxiv.org/abs/2111.00430

- What else is leaked when eavesdropping Federated Learning? Xu C. and Neglia G., in ACM CCS workshop on Privacy Preserving Machine Learning (PPML), 2021. http://www-sop.inria.fr/members/Giovanni.Neglia/publications/xu21ppml

- Clustered Sampling: Low–Variance and Improved Representativity for

Clients Selection in Federated

Learning, Fraboni Y., Vidal R., Kameni L. and Lorenzi M., in

International Conference on Machine

Learning (ICML 2021), 2021.

https://proceedings.mlr.press/v139/fraboni21a.html

- Free–rider Attacks on Model Aggregation in Federated Learning, Fraboni

Y., Vidal R. and Lorenzi M.,

in Proc of the 24th International Conference on Artificial Intelligence

and Statistics (AISTATS 2021),

2021. http://proceedings.mlr.press/v130/fraboni21a.html

- A Probabilistic Framework for Modeling the Variability Across Federated Datasets of Heterogeneous Multi–View Observations, Balelli I., Silva S. and Lorenzi M., in International Conference on Information Processing in Medical Imaging, 2021. https://hal.archives-ouvertes.fr/hal-03152886/file/Balelli_IPMI2021_CameraReady.pdf

- Throughput-Optimal Topology Design for Cross-Silo Federated Learning,

Marfoq O., Neglia G and

Vidal R., in Proc. of the Thirty-fourth Conference on Neural Information

Processing Systems (NeurIPS

2020), 2020. https://arxiv.org/abs/2010.12229

- Decentralized gradient methods: does topology matter?, Neglia G., Xu

C., Towsley D. and Calbi G.,

in Proc. of the 23rd International Conference on Artificial Intelligence

and Statistics (AISTATS), 2020.

https://arxiv.org/pdf/2002.12688.pdf

- The Role of Network Topology for Distributed Machine Learning, Neglia

G., Calbi T., Towsley D. and

Vardoyan G., in Proc. of the IEEE International Conference on Computer

Communications (INFOCOM

2019), 2019.

http://www-sop.inria.fr/members/Giovanni.Neglia/publications/neglia19infocom.pdf

- On The Impact of Client Sampling on Federated Learning Convergence,

Fraboni Y., Vidal R., Kameni L.

and Lorenzi M., in arXiv preprint arXiv:2107.12211, 2021.

https://arxiv.org/pdf/2107.12211.pdf

13- A digital twin for intelligent surfaces aided cellular network infrastructures

Who?

Name: Walid Dabbous, Damien Saucez, Chadi Barakat

Mail: Walid.Dabbous@inria.fr, Damien.Saucez@inria.fr, Chadi.Barakat@inria.fr

Web page: https://team.inria.fr/diana/

Where?

Place of the project: Inria Sophia Antipolis – Méditerranée

Address: 2004 Route des Lucioles – BP-93, 06902 Sophia Antipolis CEDEX, FRANCE

Team: DIANA

Web page: https://team.inria.fr/diana/

Pre-requisites if any: Good knowledge of network and 3GPP protocols is required. Good understanding of classical electromagnetism is required. The work may require the student to travel in Europe.

Description:

The DIANA team studies the deployment of experimental platforms that leverage so-called intelligent surfaces to optimise cellular networks. A general issue in wireless testing infrastructures is that physical-layer (PHY) components cannot be used simultaneously by two different experiments, as this would cause interference that would invalidate the results. For this reason, we are building a virtual digital twin of the infrastructure, combining cloud resources and simulated networks. The particularity of this digital twin is that it must be capable of running, without changes, experiments that are meant to be run on the real infrastructure. It therefore relies mostly on emulators and wrappers to hide the fact that the experiment is actually performed on the digital twin.

During the internship, the student will first define and design a canonical intelligent surface model. The student will then integrate this model into the digital twin of the experimental testbed, which is composed of cellular equipment and intelligent surfaces.
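As a starting point for thinking about such a model, the sketch below implements a very common simplification from the reconfigurable intelligent surface literature: a narrowband link in which each of the N surface elements applies a pure phase shift to the signal it reflects. It is an assumption made for illustration, not the canonical model the internship will define: the function names, the absence of a direct path and the random channel coefficients are all hypothetical.

```python
import cmath
import random


def ris_gain(h, g, phases):
    """Magnitude of the cascaded channel sum_n h_n * exp(j*theta_n) * g_n.

    h[n]: complex coefficient from the transmitter to element n.
    g[n]: complex coefficient from element n to the receiver.
    phases[n]: phase shift (radians) applied by element n.
    """
    return abs(sum(hn * cmath.exp(1j * theta) * gn
                   for hn, gn, theta in zip(h, g, phases)))


def aligned_phases(h, g):
    """Phase shifts that make all reflected paths add up coherently."""
    return [-cmath.phase(hn * gn) for hn, gn in zip(h, g)]


# Hypothetical 16-element surface with random complex channel coefficients.
rng = random.Random(0)
n_elements = 16
h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_elements)]
g = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_elements)]

unconfigured = ris_gain(h, g, [0.0] * n_elements)
configured = ris_gain(h, g, aligned_phases(h, g))
print(unconfigured, configured)
```

A digital twin would need a richer model (geometry, quantized phases, coupling between elements), but even this simple form exposes the interface an emulator has to provide: a set of per-element phases in, an effective channel out.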

Useful Information/Bibliography:

[Bas2019] E. Basar, M. Di Renzo, J. De Rosny, M. Debbah, M. -S. Alouini and R. Zhang, "Wireless Communications Through Reconfigurable Intelligent Surfaces," in IEEE Access, vol. 7, pp. 116753-116773, 2019, doi: 10.1109/ACCESS.2019.2935192.

[Nis2021] Nishio, T., Koda, Y., Park, J., Bennis, M., & Doppler, K. (2021). When wireless communications meet computer vision in beyond 5G. IEEE Communications Standards Magazine, 5(2), 76-83.

[Muk2021] Mukhtar, H., & Erol-Kantarci, M. (2021, September). Machine learning-enabled localization in 5G using LiDAR and RSS data. In 2021 IEEE Symposium on Computers and Communications (ISCC) (pp. 1-6). IEEE.

[Cha2021] Charan, G., Alrabeiah, M., & Alkhateeb, A. (2021). Vision-aided 6G wireless communications: Blockage prediction and proactive handoff. IEEE Transactions on Vehicular Technology, 70(10), 10193-10208.

[Vac2021] Vaca-Rubio, C. J., Ramirez-Espinosa, P., Kansanen, K., Tan, Z. H., De Carvalho, E., & Popovski, P. (2021). Assessing wireless sensing potential with large intelligent surfaces. IEEE Open Journal of the Communications Society, 2, 934-947.

[Wil2021] Wild, T., Braun, V., & Viswanathan, H. (2021). Joint design of communication and sensing for beyond 5G and 6G systems. IEEE Access, 9, 30845-30857.

[Shar2022] Sharma, S., Urumkar, S., Fontanesi, G., Ramamurthy, B., & Nag, A. (2022). Future Wireless Networking Experiments Escaping Simulations. Future Internet, 14(4), 120.

[Tay2020] M. M. Taygur and T. F. Eibert, "A Ray-Tracing Algorithm Based on the Computation of (Exact) Ray Paths With Bidirectional Ray-Tracing," in IEEE Transactions on Antennas and Propagation, vol. 68, no. 8, pp. 6277-6286, Aug. 2020, doi: 10.1109/TAP.2020.2983775.

[Liu2021] Y. Liu, S. Crisp, and D. Blough, "Performance Study of Statistical and Deterministic Channel Models for mmWave Wi-Fi Networks in ns-3," Proc. of the Workshop on ns-3, pp. 33-40, 2021.

[Yur2016a] O. Yurduseven, V. R. Gowda, J. N. Gollub and D. R. Smith, "Printed Aperiodic Cavity for Computational and Microwave Imaging," IEEE Microwave and Wireless Components Letters, vol. 26, no. 5, pp. 367-369, 2016.

[Ima2020] M. F. Imani, J. N. Gollub, O. Yurduseven et al., "Review of Metasurface Antennas for Computational Microwave Imaging," IEEE Transactions on Antennas and Propagation, vol. 68, no. 3, pp. 1860-1875, 2020.

[Ren2020] Di Renzo, M., Zappone, A., Debbah, M., Alouini, M. S., Yuen, C., De Rosny, J., & Tretyakov, S. (2020). Smart radio environments empowered by reconfigurable intelligent surfaces: How it works, state of research, and the road ahead. IEEE Journal on Selected Areas in Communications, 38(11), 2450-2525.

[Yur2016b] O. Yurduseven, J. N. Gollub, A. Rose, D. L. Marks and D. R. Smith, "Design and Simulation of a Frequency-Diverse Aperture for Imaging of Human-Scale Targets," IEEE Access, vol. 4, pp. 5436-5451, 2016.

[Gol2017] J. N. Gollub, O. Yurduseven, et al. "Large Metasurface Aperture for Millimeter wave Computational Imaging at the Human-Scale," Scientific Reports, vol. 7, no. 1, pp. 1-9, 2017.

[Los2021] Loscri, V., Vegni, A. M., Innocenti, E., Giuliano, R., & Mazzenga, F. (2021, June). A joint Computer Vision and Reconfigurable Intelligent Meta-surface Approach for Interference Reduction in Beyond 5G Networks. In 2021 IEEE 22nd International Conference on High Performance Switching and Routing (HPSR) (pp. 1-6). IEEE.

[Des2020] Deshpande, A. M., Telikicherla, A. K., Jakkali, V., Wickelhaus, D. A., Kumar, M., & Anand, S. (2020). Computer vision toolkit for non-invasive monitoring of factory floor artifacts. Procedia Manufacturing, 48, 1020-1028.

[Ren2015] Ren, S., He, K., Girshick, R.B., & Sun, J. (2015). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1137-1149.

[Red2016] Redmon, J., Divvala, S.K., Girshick, R.B., & Farhadi, A. (2016). You Only Look Once: Unified, Real-Time Object Detection. 2016 IEEE Conference on Computer Vision and Patt

14- Clustering on geometric graphs

Who?

Name: Vinay Kumar B. R. and Konstantin Avrachenkov

Mail: vinay-kumar.bindiganavile-ramadas@inria.fr, k.avrachenkov@sophia.inria.fr

Web page: https://sites.google.com/view/vinaykumarbr/home, https://www-sop.inria.fr/members/Konstantin.Avratchenkov/me.html

Where?

Place of the project: INRIA Sophia Antipolis

Address: 2004 route des Lucioles, 06902 Sophia Antipolis

Team: NEO

Web page: https://team.inria.fr/neo/

Pre-requisites if any: Python programming; basic probability theory and linear algebra desirable

Description: Many real-world networks exhibit geometric properties. For example, in social networks, the property that "friends of friends are friends", which arises from the triangle inequality, is predominant. Similarly, in co-authorship networks, it is observed that two researchers who are geographically close (same institute or country, say) tend to collaborate more. These can be modelled as random graphs with an underlying geometric structure. The first part of this project consists of detecting geometry in practical networks from the Stanford Large Network Dataset Collection (see [1]). Since geometric networks are known to have more triangles, fast triangle-count based algorithms, as in [2], could be used for this purpose (see also [3] and [4]). The network datasets shortlisted in this way will be used in the subsequent part of the work.
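A first, naive version of the triangle-based test can be sketched as follows. It counts triangles exactly from an edge list and reports the global clustering coefficient (three times the number of triangles divided by the number of connected triples). This is only a baseline written for illustration: the function name is hypothetical, and the approximate counting methods of [2] would be needed for large datasets.

```python
def triangle_stats(edges):
    """Return (number of triangles, global clustering coefficient).

    `edges` is an iterable of (u, v) pairs describing an undirected graph;
    self-loops and duplicate edges are ignored.
    """
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # For each edge, common neighbours of its endpoints close a triangle.
    unique_edges = {(min(a, b), max(a, b)) for a in adjacency for b in adjacency[a]}
    triangles = sum(len(adjacency[u] & adjacency[v]) for u, v in unique_edges)
    triangles //= 3  # each triangle was counted once per edge

    # Connected triples (paths of length two) centred at each node.
    triples = sum(len(nb) * (len(nb) - 1) // 2 for nb in adjacency.values())
    coefficient = 3.0 * triangles / triples if triples else 0.0
    return triangles, coefficient


# A complete graph on 4 nodes versus a simple path on 3 nodes.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
path = [(0, 1), (1, 2)]
print(triangle_stats(k4), triangle_stats(path))
```

Comparing this coefficient with the value expected for a non-geometric random graph of the same density is one simple way to flag candidate geometric datasets.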

Classification or community detection on such geometric graphs is an important problem (see [5-7]). As examples, one might be interested in classifying researchers by their areas based on observing the co-authorship network, or in the case of social networks, one might want to segregate the underlying population into groups of close friends. The second part of the project involves proposing and evaluating algorithms for community detection in the semi-supervised learning framework. This means that the community allocations of a fraction of the vertices are known beforehand, and these must be used to recover the communities of the remaining nodes. Algorithms based on spectral clustering (see [8,9]) and graph neural networks (GNN) (see [10-12]) are to be investigated in this direction.
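The semi-supervised setting can be illustrated with a minimal two-community spectral sketch. It computes an approximation of the Fiedler vector (the eigenvector of the graph Laplacian associated with its second-smallest eigenvalue, see [8]) by power iteration, splits the nodes by sign, and uses the known labels only to decide which side is which. This is an assumption-laden toy: it handles two communities on a connected graph, the function names are hypothetical, and in practice a linear algebra library's eigensolver would replace the hand-written iteration.

```python
def fiedler_vector(adj, iters=500):
    """Approximate the Fiedler vector of a connected graph by power iteration.

    `adj` is a symmetric 0/1 adjacency matrix given as a list of lists.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    shift = 2.0 * max(deg)  # upper bound on the Laplacian eigenvalues
    # Deterministic start vector, orthogonal to the all-ones vector.
    v = [i - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iters):
        # w = (shift * I - L) v with L = D - A
        w = [(shift - deg[i]) * v[i] + sum(adj[i][j] * v[j] for j in range(n))
             for i in range(n)]
        mean = sum(w) / n
        w = [x - mean for x in w]            # remove the constant eigenvector
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def spectral_bisection(adj, known):
    """Split nodes into communities 0/1; `known` maps some nodes to labels."""
    labels = [1 if x > 0 else 0 for x in fiedler_vector(adj)]
    # The sign of an eigenvector is arbitrary: use revealed labels to fix it.
    agreement = sum(1 for node, label in known.items() if labels[node] == label)
    if agreement < len(known) / 2:
        labels = [1 - label for label in labels]
    return labels


# Two triangles {0,1,2} and {3,4,5} joined by the edge 2-3; node 0 is labelled.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1
print(spectral_bisection(adj, known={0: 1}))
```

Here the labelled vertices only resolve the sign ambiguity; using them inside the clustering step itself is the kind of question the project is meant to explore.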

The final stage of the project brings together the above elements by implementing the clustering algorithms on the shortlisted datasets. Comparisons with existing GNN-based methods and on related network models are to be carried out.

In conclusion, the goal of the project is to build a framework for community recovery on geometric networks. This could take the form of a software package that can be imported to detect geometry in network datasets and perform clustering on them. Additionally, the algorithms that are proposed and analyzed could go on to be published in reputable conferences and journals.

Useful Information/Bibliography:

[1] https://snap.stanford.edu/data/

[2] Tsourakakis, C.E., 2011. Counting triangles in real-world networks using projections. Knowledge and Information Systems, 26(3), pp.501-520.

[3] Ugander, J., Karrer, B., Backstrom, L. and Marlow, C., 2011. The anatomy of the facebook social graph. arXiv preprint arXiv:1111.4503.

[4] Bubeck, S., Ding, J., Eldan, R. and Rácz, M.Z., 2016. Testing for high‐dimensional geometry in random graphs. Random Structures & Algorithms, 49(3), pp.503-532.

[5] Galhotra, S., Mazumdar, A., Pal, S. and Saha, B., 2018, April. The geometric block model. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 32, No. 1).

[6] Abbe, E., Baccelli, F. and Sankararaman, A., 2021. Community detection on Euclidean random graphs. Information and Inference: A Journal of the IMA, 10(1), pp.109-160.

[7] Chien, E., Tulino, A. and Llorca, J., 2020, April. Active learning in the geometric block model. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 34, No. 04, pp. 3641-3648).

[8] Von Luxburg, U., 2007. A tutorial on spectral clustering. Statistics and computing, 17(4), pp.395-416.

[9] Avrachenkov, K., Bobu, A. and Dreveton, M., 2021. Higher-order spectral clustering for geometric graphs. Journal of Fourier Analysis and Applications, 27(2), pp.1-29.

[10] M. Kamalov and K. Avrachenkov. GenPR: generative PageRank framework for semi-supervised learning on citation graphs. In Conference on Artificial Intelligence and Natural Language, pages 158–165. Springer, 2020.

[11] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, and P. S. Yu. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, 32(1):4–24, 2020.

[12] Grover, A. and Leskovec, J., 2016, August. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 855-864).

15- Evaluating the performance of Data Streaming Systems on Kubernetes

Who?

Name: Fabrice Huet & Dino Lopez Pacheco

Mail: fabrice.huet@univ-cotedazur.fr, dino.lopez@univ-cotedazur.fr

Web page: https://scale.i3s.unice.fr/ & https://signet.i3s.unice.fr/

Where?

Place of the project: I3S Laboratory

Address: 2000 route des lucioles, 06902 Sophia Antipolis

Team: Scale and Signet

Web page: https://scale.i3s.unice.fr/ & https://signet.i3s.unice.fr/

What?

Modern applications are no longer monolithic but are built using a modular design, which makes it possible (i) to decouple the different fragments corresponding to the various functions of the application and (ii) to scale each of these fragments independently. These fragments can be hosted in VMs or inside containers [4], with the added support of orchestration systems such as Kubernetes [5].

Due to the large amount of fast data produced today, Data Streaming Systems such as Storm, Heron or Spark Streaming [1,2,3] are widely deployed. However, the underlying architecture of such systems may not be suited to today's microservice architectures. The subject of this internship is to evaluate the possibility of running a data streaming framework such as Apache Storm on Kubernetes, and to benchmark the influence of resource placement within a cluster as well as the influence of the underlying Container Network Interface and of the various network protocols used to interconnect the containers.

Pre-requisites if any:

- Basic knowledge of Java and C

- Proficiency with command line tools and Linux

References (to be read by the student before starting to work on this subject):

[1] Ankit Toshniwal, Siddarth Taneja, Amit Shukla, Karthik Ramasamy, Jignesh M. Patel, Sanjeev Kulkarni, Jason Jackson, Krishna Gade, Maosong Fu, Jake Donham, Nikunj Bhagat, Sailesh Mittal, and Dmitriy Ryaboy. Storm@twitter. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD ’14, pages 147–156, New York, NY, USA, 2014. ACM.

[2] Sanjeev Kulkarni, Nikunj Bhagat, Maosong Fu, Vikas Kedigehalli, Christopher Kellogg, Sailesh Mittal, Jignesh M. Patel, Karthik Ramasamy, and Siddarth Taneja. 2015. Twitter Heron: Stream Processing at Scale. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD '15). ACM, New York, NY, USA, 239-250. DOI=http://dx.doi.org/10.1145/2723372.2742788

[3] Zaharia, Matei, Xin, Reynold S., Wendell, Patrick, Das, Tathagata, Armbrust, Michael, Dave, Ankur, Meng, Xiangrui, Rosen, Josh, Venkataraman, Shivaram, Franklin, Michael J., Ghodsi, Ali, Gonzalez, Joseph, Shenker, Scott and Stoica, Ion. "Apache Spark: A Unified Engine for Big Data Processing." Communications of the ACM 59, no. 11 (2016): 56–65.

[4] Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2.

[5] https://kubernetes.io/

16- Data acquisition and collection in harsh environment

Who?

Name: Christelle Caillouet and David Coudert

Mail: christelle.caillouet@inria.fr, david.coudert@inria.fr

Web page: http://www-sop.inria.fr/members/Christelle.Molle-Caillouet, http://www-sop.inria.fr/members/David.Coudert

Where?

Place of the project: Inria

Address: 2004 route des Lucioles, 06903 Sophia Antipolis

Team: Coati

Web page: https://team.inria.fr/coati/

Pre-requisites if any: Combinatorial optimization, Algorithmics, Programming, Wireless networks

Description:

In recent years, the deterioration of our pavements and bridges has accelerated due to the ageing of structures, climate change and the increase in heavy-vehicle traffic. In order to measure this degradation and anticipate the maintenance of road infrastructures, several initiatives are being implemented. Through their joint project ROAD-AI, Inria and Cerema are studying digital tools for modeling these phenomena through structural instrumentation.

The subject of the internship combines the two following problems.

1. Data acquisition: Given application needs and technical constraints, the goal is to determine the most suitable location for each data acquisition point. This study will take into account the business needs but also the physical deployment constraints (accessibility, radio environment, possibility of power supply or ambient energy recovery) and the communication protocols used for the data collection. This will have an impact on the choice of the most suitable means to collect the data according to the sensor locations, the necessary frequency and the collection costs: radio collection, multi-hop, potential intervention of robots/drones, etc.

2. Data collection: In this part we will seek to design self-deployment techniques that allow each entity of a fleet of autonomous robots (either ground robots or drones) to know where to move and how to move, while keeping connectivity with the other robots and carrying out the requested task. These local and adaptive algorithms will take as input the application constraints of the place in which the data must be collected (via pipes, at height under bridges, at different points of a dam, etc.).

The goal of the internship is to design wireless communication protocols that will allow efficient data collection with minimal energy consumption. Data collection is carried out in very restrictive environments (e.g., reduced accessibility, absence of electrical and communication networks, etc.), or even harsh environments (e.g., weather conditions, luminosity, humidity, etc.). The sensors are of different natures, with potentially heterogeneous size, sensitivity and communication means, and they provide data that is heterogeneous in size, type and acquisition frequency. The energy consumption related to the transmission of data is impacted by the choice of the radio technology used, the amount of data transmitted, and the transmission power (and therefore the distances over which the data is transmitted).
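As a rough illustration of these dependencies, the sketch below estimates the radio energy of a single transmission as electrical power times airtime. The function name, supply voltage, current draws and bit rates are hypothetical placeholders chosen for the example; they are not measurements of any sensor used in ROAD-AI.

```python
def tx_energy_joules(payload_bytes, bitrate_bps, supply_volts, tx_current_amps):
    """Energy spent by the radio to send one payload: E = V * I * t."""
    # Time on air, ignoring preambles, headers and retransmissions.
    airtime_s = 8 * payload_bytes / bitrate_bps
    return supply_volts * tx_current_amps * airtime_s


# Two hypothetical radio profiles (illustrative numbers only):
# a slow long-range link with a higher transmit current,
# and a fast short-range link with a lower transmit current.
long_range = tx_energy_joules(payload_bytes=50, bitrate_bps=1_000,
                              supply_volts=3.3, tx_current_amps=0.040)
short_range = tx_energy_joules(payload_bytes=50, bitrate_bps=250_000,
                               supply_volts=3.3, tx_current_amps=0.020)

print(f"long range:  {long_range * 1e3:.4f} mJ")
print(f"short range: {short_range * 1e3:.4f} mJ")
```

Under these assumed numbers the airtime dominates the result, which is one reason why the choice of radio technology and the amount of data sent have to be considered together.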

The internship is part of the Défi Road-AI between Inria and Cerema (https://www.inria.fr/fr/road-ai). It could be followed by a PhD on the topic with an associated grant. This internship will take place in the Inria COATI research group located in Sophia Antipolis. The recruited person will work closely with Dr. Christelle Caillouet and Dr. David Coudert and will collaborate with Dr. Nathalie Mitton from Inria Lille. The obtained results could be published as a research paper in an international conference or journal.

17- Optimization of downlink communications in a LoRaWAN network

Who?

Name: CAILLOUET Christelle

Mail: christelle.caillouet@univ-cotedazur.fr

Telephone: 04 92 38 79 29

Web page: http://www-sop.inria.fr/members/Christelle.Molle-Caillouet/

Where?

Place of the project: Inria

Address: 2004 Route des Lucioles, Sophia Antipolis

Team: COATI

Web page: https://team.inria.fr/coati/

What?

This internship will take place in the Inria COATI research group located in Sophia Antipolis. The recruited person will work closely with Dr. Christelle Caillouet and will collaborate with Dr. Oana Iova, Prof. Alexandre Guitton and Prof. Fabrice Valois.

Keywords: LoRa, LoRaWAN, Internet of Things (IoT), Low-Power Wide Area Networks (LPWAN), downlink traffic, collisions, optimization, linear programming

Context: Recent years have witnessed the emergence of several new technologies that enable long-range communication up to tens of kilometers with extremely low power consumption (18mA at 7dBm). These networks play a major role in the Internet of Things (IoT), where they enable architectural alternatives with degrees of scale and flexibility hitherto impossible. One of the most representative technologies for long-range networks is LoRa [1], combined with the LoRaWAN [2] protocol, a breakthrough technology for smart object data collection that has gained global momentum.

Many research works have investigated the capacity and the performance of LoRaWAN [3,4], but they have always assumed that the uplink is fully independent of the downlink. This assumption relies on the fact that the LoRa modulation scheme of the uplink uses mainly upchirps, while the downlink uses mainly downchirps. However, we have recently shown in an initial study that uplink and downlink communications are not orthogonal [5].

Objective: The goal of this internship is to study the impact of the downlink traffic on the capacity of the network. The intern will use a realistic model of uplink and downlink communications in LoRaWAN that considers: collisions of uplink frames sent simultaneously by nodes using the same parameters, the half-duplex property of the gateway, and duty cycle constraints [7]. The intern will have to propose solutions that maximize the number of downlink frames sent by the gateway without negatively impacting the uplink communication. Given the extensive and innovative work on this topic, the obtained results could be published as a research paper in an international conference or journal.
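To give a feel for the duty cycle constraint mentioned above, the following sketch computes the time on air of a LoRa frame with the formula published in the Semtech SX127x datasheets, then derives how many such frames fit in one hour under a given duty cycle. The function names, the default parameters and the 1% duty cycle used in the example are assumptions made for illustration; they are not the model of [7].

```python
import math


def lora_time_on_air(payload_bytes, sf, bw_hz=125_000, coding_rate=1,
                     preamble_symbols=8, explicit_header=True, crc=True,
                     low_data_rate_opt=False):
    """Airtime of one LoRa frame in seconds.

    coding_rate is 1..4 for the rates 4/5..4/8.
    """
    t_sym = (2 ** sf) / bw_hz                 # duration of one symbol
    ih = 0 if explicit_header else 1          # implicit-header flag
    de = 1 if low_data_rate_opt else 0        # low data rate optimisation flag
    numerator = 8 * payload_bytes - 4 * sf + 28 + 16 * int(crc) - 20 * ih
    payload_symbols = 8 + max(
        math.ceil(numerator / (4 * (sf - 2 * de))) * (coding_rate + 4), 0)
    return (preamble_symbols + 4.25 + payload_symbols) * t_sym


def max_frames_per_hour(airtime_s, duty_cycle):
    """Upper bound on frames per hour on one sub-band under a duty-cycle limit."""
    return math.floor(3600 * duty_cycle / airtime_s)


# Example: a 10-byte frame at SF7 and at SF12, with an assumed 1% duty cycle.
for sf in (7, 12):
    toa = lora_time_on_air(10, sf, low_data_rate_opt=(sf >= 11))
    print(f"SF{sf}: {toa * 1e3:.3f} ms on air, "
          f"at most {max_frames_per_hour(toa, 0.01)} frames/hour")
```

Because the gateway is half-duplex, every second it spends transmitting a downlink frame is also a second during which it cannot receive uplink frames, so the airtime above enters the optimization problem twice: once through the duty cycle budget and once through lost reception time.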

Methodology: The intern will review existing work on the effect of downlink traffic in LoRaWAN, and more precisely the systematic literature review of [6]. The goal is to develop an optimization model using linear programming and operations research tools. The model should be solved to optimality, and the resulting solutions carefully analyzed. A first model considering one gateway can then be extended to a more general multi-gateway framework.

Bibliography:

[1] Semtech, LoRa - Long Range Technology: https://www.semtech.com/lora.

[2] LoRa Alliance, LoRaWAN 1.1 Specification, 2017.

[3] F. H. Khan and M. Portmann. Experimental evaluation of LoRaWAN in NS-3. In International Telecommunication Networks and Applications Conference (ITNAC), pages 1–8, 2018.

[4] D. Magrin, M. Capuzzo, and A. Zanella. A thorough study of LoRaWAN performance under different parameter settings. IEEE Internet of Things Journal, 7:116–127, 2020.

[5] R. Saroui, A. Guitton, O. Iova and F. Valois. “La rumeur disait faux : ils ne sont pas orthogonaux !”, 7ème Rencontres Francophones sur la Conception de protocoles, l’évaluation de performance et l’expérimentation des réseaux de communication (CoReS), 2022.

[6] A. Jebril, R. Rashid, A systematic literature review on downlink frames in LoRaWAN, Computers and Electrical Engineering 101 (2022).

[7] C. Caillouet, M. Heusse, F. Rousseau, Optimal SF Allocation in LoRaWAN Considering Physical Capture and Imperfect Orthogonality, IEEE Globecom 2019.

Required Skills:

A good background in mathematical programming and algorithmics, as well as practical skills with programming languages (e.g., Java, Python), are welcome. French language is not mandatory but welcome.

18- Integration validation of safety requirements in autonomous vehicles.

Who?

Name: Patricia Guitton (Renault Software Factory) and Frédéric Mallet

Mail: patricia.guitton-ouhamou@renault.com, Frederic.Mallet@univ-cotedazur.fr

Telephone: 04 92 38 79 66

Web page: https://www.i3s.unice.fr/~fmallet/

Where?

Place of the project: Renault Software Factory Sophia-Antipolis

Address: 2600 Rte des Crêtes, 06560 Valbonne

Team: Renault + Inria Kairos

Web page: https://www.renaultgroup.com/en/our-company/locations/software-labs-sophia-antipolis-2/

What?

Detailed description:

In the software development lifecycle, errors and flaws can be introduced in the different phases and lead to failures. Establishing a set of functional requirements helps produce safe software. However, ensuring that the developed software complies with those requirements is a challenging task due to the lack of automatic and formal means to conduct this verification. A first step in this work is carried out by a PhD student who formalizes requirements (defining grammar and syntax) in order to automatically transform them into Finite State Machines (FSMs). These FSMs are then verified with a model checker. This step checks that the software does what it is intended to do.
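The idea of checking an FSM against a requirement can be sketched with a small explicit-state search. This is only a toy illustration: it is neither UPPAAL (which works on timed automata) nor the formalization developed by the PhD student, and the parking-assist states, transitions and property below are invented for the example.

```python
from collections import deque


def find_violation(transitions, initial, is_bad):
    """Breadth-first search over the reachable states of an FSM.

    Returns the shortest path from `initial` to a state where `is_bad`
    holds, or None if no such state is reachable (the property holds).
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if is_bad(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in transitions.get(state, ()):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


# Hypothetical parking-assist FSM; a state is (mode, obstacle_detected).
fsm = {
    ("idle", False): [("maneuvering", False)],
    ("maneuvering", False): [("braking", True), ("parked", False)],
    ("braking", True): [("idle", False)],
}

# Same FSM with a seeded defect: an obstacle can appear without braking.
faulty_fsm = dict(fsm)
faulty_fsm[("maneuvering", False)] = [("maneuvering", True), ("parked", False)]


def moving_with_obstacle(state):
    """Invented requirement: never keep maneuvering once an obstacle is seen."""
    mode, obstacle = state
    return mode == "maneuvering" and obstacle


print(find_violation(fsm, ("idle", False), moving_with_obstacle))
print(find_violation(faulty_fsm, ("idle", False), moving_with_obstacle))
```

The first call finds no reachable bad state, while the second returns a counterexample trace, which is the kind of feedback a verification tool gives back to the engineer.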

A second step is to analyse safety, that is, to ensure that the software does not cause safety issues for the driver or the passengers. This activity involves building safety requirements and identifying the root causes of failures as well as mechanisms to correct those failures.

Goal of the internship:

· Study the feasibility of replacing the UPPAAL model checker with the SCADE verification suite.

· Extend the requirements analysis and the set of properties to be considered, for example to performance and to properties linked to cybersecurity.

· From a natural language description, identify categories of requirements and classify them into correctness, safety or cybersecurity. The example of the auto-parking assist system can be taken to build a demonstrator.

· Automate the processing chain, when possible, from the natural language requirements to the conversion into formal models, the analysis with verification tools and the feedback to the engineer.

This work could lead to a PhD thesis opportunity.

References:

- On a Formal Model of Safe and Scalable Self-driving Cars, https://arxiv.org/pdf/1708.06374.pdf, 2017

- ISO 26262, https://en.wikipedia.org/wiki/ISO_26262

- SCADE 6 : https://ieeexplore.ieee.org/document/8285623

- Scénarios formels basés sur des règles pour la conception de véhicules autonomes sûrs, Joelle Abou Faysal, https://theses.hal.science/tel-03814686, 2022

19 - Study of large distributed storage systems.

Advisors: Frédéric Giroire and Stéphane Pérennes

Emails: frederic.giroire@inria.fr, stephane.perennes@inria.fr

Laboratory: COATI project - INRIA (2004, route des Lucioles – Sophia Antipolis) and the startup Hive (https://www.hivenet.com/)

Web Site:

http://www-sop.inria.fr/members/Frederic.Giroire/

Pre-requisites if any:

Basics in networking, probability and graph theory. Programming in Python, C/C++ or Java.

Description:

The internship will be done in collaboration with the startup Hive (https://www.hivenet.com/) and may be followed by a PhD for interested students.

Large scale peer-to-peer systems are foreseen as a way to provide highly reliable data storage at low cost. To ensure high durability and high resilience over a long period of time the system must add redundancy to the original data. It is well-known that erasure coding is a space efficient solution to obtain a high degree of fault-tolerance by distributing encoded fragments into different peers of the network. Therefore, a repair mechanism needs to cope with the dynamic and unreliable behavior of peers by continuously reconstructing the missing redundancy. Consequently, the system depends on many parameters that need to be well tuned, such as the redundancy factor, the placement policies, and the frequency of data repair. These parameters impact the amount of resources, such as the bandwidth usage and the storage space overhead that are required to achieve a desired level of reliability, i.e., probability of losing data.

In this internship, we will compare different repair policies and erasure codes. Indeed, some erasure codes (maximum distance separable (MDS) codes) such as Reed Solomon [41] have been shown to be optimal in terms of reception efficiency, i.e., the number of chunks required for reconstructing a lost chunk in our context. This means that they have an optimal storage space usage for a given number of tolerated failures in a distributed storage system. However, they are not very efficient in terms of bandwidth usage when a reconstruction has to be done. Indeed, the original data has to be fully reconstructed when a small chunk of data is lost in order to keep the redundancy of the system. This operation happens constantly, as disk failures are frequent in large distributed systems and peers may leave the system. As bandwidth is a crucial resource in distributed systems, alternative repair policies such as lazy reconstruction [2] or new codes such as hybrid codes [3], Hierarchical Codes [4], and regenerating codes [1] have been proposed to decrease the bandwidth used for repair. The latter are near optimal in terms of bandwidth usage. However, this comes at the price of a much higher computational cost [5]. We will thus explore which codes present the best trade-off between storage space, bandwidth usage, computational cost, number of tolerated failures and mean time to failure, data availability, and download speed.
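The storage versus repair-bandwidth trade-off described above can be made concrete with a small calculation. The sketch below treats replication with r copies as an (r, 1) code and compares it with an (n, k) MDS code. The function name, the (12, 8) parameters and the assumption that each fragment is lost independently with probability 0.01 are hypothetical choices for illustration only.

```python
from math import comb


def code_profile(n, k, p_fail):
    """Profile of an (n, k) MDS code; replication with r copies is (r, 1).

    Data is split into k fragments and encoded into n fragments, any k of
    which suffice to rebuild it. Fragment losses are assumed independent.
    """
    # Data is lost when more than n - k fragments are missing.
    loss = sum(comb(n, i) * p_fail ** i * (1 - p_fail) ** (n - i)
               for i in range(n - k + 1, n + 1))
    return {
        "storage_overhead": n / k,
        "tolerated_failures": n - k,
        # Naive repair reads k fragments to rebuild one lost fragment.
        "repair_traffic_per_lost_byte": k,
        "loss_probability": loss,
    }


three_copies = code_profile(3, 1, p_fail=0.01)
mds_12_8 = code_profile(12, 8, p_fail=0.01)

for name, profile in (("3 replicas", three_copies), ("(12, 8) MDS", mds_12_8)):
    print(name, profile)
```

With these assumed parameters the MDS code uses half the storage of triple replication and tolerates more failures, but rebuilding a single lost fragment moves eight times more data, which is the cost that lazy repair and regenerating codes try to reduce.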

References.

[1] Papailiopoulos, D. S., Luo, J., Dimakis, A. G., Huang, C., & Li, J. (2012, March). Simple regenerating codes: Network coding for cloud storage. In 2012 Proceedings IEEE INFOCOM (pp. 2801-2805). IEEE.

[2] Giroire, F., Monteiro, J., & Pérennes, S. (2010, December). Peer-to-peer storage systems: a practical guideline to be lazy. In 2010 IEEE Global Telecommunications Conference GLOBECOM 2010 (pp. 1-6). IEEE.

[3] Rodrigues, R., & Liskov, B. (2005, February). High availability in DHTs: Erasure coding vs. replication. In International Workshop on Peer-to-Peer Systems (pp. 226-239). Springer, Berlin, Heidelberg.

[4] Duminuco, A., & Biersack, E. (2008, September). Hierarchical codes: How to make erasure codes attractive for peer-to-peer storage systems. In 2008 Eighth International Conference on Peer-to-Peer Computing (pp. 89-98). IEEE.

[5] Duminuco, A., & Biersack, E. (2009, June). A practical study of regenerating codes for peer-to-peer backup systems. In 2009 29th IEEE International Conference on Distributed Computing Systems (pp. 376-384). IEEE