Internship List 2021-2022


1- On the (In)completeness of CCSL

Who?

Name: Frédéric MALLET

Mail: Frederic.Mallet@univ-cotedazur.fr

Telephone: 04 92 38 79 66

Web page: http://www.i3s.unice.fr/~fmallet/


Where?

Place of the project: EPC Kairos I3S/Inria

Address: Inria - Lagrange Building

Team: Kairos

Web page: http://team.inria.fr/kairos/


What?

Pre-requisites if any:

Detailed description:


The Clock Constraint Specification Language (CCSL) is a language to describe causal and temporal constraints for safety-critical applications. CCSL defines a set of patterns that are classical in real-time systems. While CCSL has been used in a large variety of domains, it has been argued to be incomplete. Although completeness was not an objective when CCSL was created, there are useful patterns that cannot be expressed with CCSL and that would be needed to define some advanced phenomena (n-m sharing, unbounded stack, ...). The goal of this internship is to propose extensions of CCSL that address some of the identified limitations. As always in this context, the extensions must come with adequate (and preferably efficient) methods and tools to support their analysis through simulation or exhaustive verification.
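For illustration, here is a minimal Python sketch of checking one classical CCSL pattern, "a alternates with b", on a finite trace, under the usual counter-based reading of alternation (strict alternation starting with a). It does not use TimeSquare or the actual CCSL syntax; the trace encoding and clock names are purely illustrative.

# Toy check of a CCSL-like pattern on a finite trace. A trace is encoded as a
# list of steps, each step being the set of clocks that tick at that step.
def alternates(trace, a, b):
    # "a alternates with b": a and b never tick together and, at every step,
    # 0 <= count(a) - count(b) <= 1 (a ticks first).
    ca = cb = 0
    for step in trace:
        if a in step and b in step:
            return False
        ca += a in step
        cb += b in step
        if not 0 <= ca - cb <= 1:
            return False
    return True

print(alternates([{"a"}, {"b"}, {"a"}, {"b"}], "a", "b"))  # True
print(alternates([{"a"}, {"a"}, {"b"}], "a", "b"))         # False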

The intern is expected to provide a description of their work, a contribution to TimeSquare integrating the proposal and demonstrating its efficiency, and possibly a first draft of a research paper to be submitted to a recognized venue.


References: The important references are available at https://timesquare.inria.fr/main-related-publications/ but, at the very least, the following ones give the key concepts needed to understand the challenges of this internship:

Frédéric Mallet, Robert de Simone: Correctness issues on MARTE/CCSL constraints. Sci. Comput. Program. 106: 78-92 (2015). https://hal.inria.fr/hal-01257978

Grygoriy Zholtkevych, Maksym Labzhaniia: Understanding Safety Constraints Coalgebraically. http://ceur-ws.org/Vol-2732/20200029.pdf

TimeSquare: https://timesquare.inria.fr

Light CCSL: https://github.com/frederic-mallet/ccsl-sts


2- Program recognition with neural networks

Who?

Name: Prof. Sid TOUATI

Mail: Sid.Touati@inria.fr

Web page: http://www-sop.inria.fr/members/Sid.Touati/


Where?

Place of the project: I3S and INRIA

Team: COMRED and MDSC


What?

Pre-requisites if any: compilation, theoretical aspects of computer science, graphs, C++


Description: See the PDF at the end of this webpage.



3- How can Wi-Fi SSIDs and Bluetooth device names be used to perform privacy attacks?

Who?

Name: Arnaud Legout

Mail: arnaud.legout@inria.fr

Telephone: +33 4 92 38 78 15

Web page: http://www-sop.inria.fr/members/Arnaud.Legout/


Where?

Place of the project: Inria Sophia Antipolis

Address: 2004 route des Lucioles

Team: DIANA

Web page: https://team.inria.fr/diana/


Pre-requisites if any: Python, machine learning is a plus

Description:

Wi-Fi access points and Bluetooth devices broadcast their IDs tens of meters around them. However, this information is considered public and can be measured by apps or companies in order to build databases. Such IDs have never been studied as a way to express opinions, but recent research suggests that political opinions can be detected from such IDs. This is a huge privacy issue that is overlooked.


The ElectroSmart project has collected hundreds of millions of Wi-Fi and Bluetooth IDs worldwide during the past 5 years. The intern will have to analyze the kind of private information that is disclosed by these IDs and correlate it with trends in social networks and news sources. As a first step, the student will explore the occurrence of terms related to elections and to the Covid-19 pandemic, and will then correlate the temporal appearance of such terms with social networks and news sites to understand how such opinions propagate in Wi-Fi and Bluetooth IDs.
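As a feel for this first step, here is a minimal sketch assuming the IDs are available as (timestamp, SSID) records; the field layout, the records and the keyword list are hypothetical, not the actual ElectroSmart schema.

from collections import Counter
from datetime import datetime

# Hypothetical input: (timestamp, ssid) pairs extracted from the dataset.
records = [
    (datetime(2020, 3, 10), "StopCovid_WiFi"),
    (datetime(2020, 11, 2), "Vote2020"),
    (datetime(2021, 4, 18), "Livebox-1234"),
]

keywords = ["covid", "vaccine", "vote", "election"]  # illustrative term list

# Count, per month, the SSIDs containing at least one keyword of interest.
monthly = Counter()
for ts, ssid in records:
    if any(k in ssid.lower() for k in keywords):
        monthly[ts.strftime("%Y-%m")] += 1

for month, count in sorted(monthly.items()):
    print(month, count)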

In a second step, the student will explore how public information in social networks can be exploited to infer the social identity of a user who sets the SSID to identifying information.

The student will have the possibility to work with real-world, unique data.

This internship can be continued by a Ph.D. thesis for excellent students.


4- Evolution over time of the structure of social graphs: Clustering


Who?

Name: Nicolas Nisse, Malgorzata Sulkowska

Mail: nicolas.nisse@inria.fr, malgorzata.sulkowska@inria.fr

Web page: http://www-sop.inria.fr/members/Nicolas.Nisse/

Co-advisors : Frédéric Giroire (frederic.giroire@inria.fr) and Malgorzata Sulkowska (malgorzata.sulkowska@inria.fr)


Where?

Place of the project: Inria Sophia Antipolis

Address: 2004 route des Lucioles, 06902 Sophia Antipolis

Team: COATI

Web page: https://team.inria.fr/coati/


Pre-requisites if any:

Description: The goal of the project is to develop methods to analyse the evolution over time of a social network. As an example, we will consider the graph of scientific collaborations, since it can be crawled freely.

The project will have two phases:

- Data collection. In the first phase, the student will use the available bibliographic research tools (SCOPUS, Web of Science, Patstat) to create data sets: one corresponding to the current situation and others corresponding to past moments. The data sets will correspond mainly to networks (annotated graphs) of scientific collaborations.

- Data analysis. In the second phase, the student will analyse this data. First, they will focus on simple metrics (number of publications, number of patent applications, ...) and compare their evolution across time. Then, if time permits, they will start studying the evolution of the structure of the network and look at whether its clustering evolves due to the emergence of new collaborations.
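A minimal sketch of this second phase, assuming two collaboration snapshots are available as edge lists (the edges below are invented; networkx is used here only for illustration):

import networkx as nx

# Hypothetical collaboration snapshots: edges are co-authorship links.
edges_2015 = [("A", "B"), ("B", "C")]
edges_2021 = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]

g_old, g_new = nx.Graph(edges_2015), nx.Graph(edges_2021)

# Simple metrics first, then a structural one (average clustering coefficient).
for year, g in (("2015", g_old), ("2021", g_new)):
    print(year, "nodes:", g.number_of_nodes(), "edges:", g.number_of_edges(),
          "avg clustering:", round(nx.average_clustering(g), 3))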

The project will be part of a larger project on the evaluation of the impact of funding on scientific research, involving researchers in economics, sociology, and computer science.


Keywords: graph algorithms, big data, network analysis


Useful Information/Bibliography: The project is the continuation of the internship whose report can be found here: http://www-sop.inria.fr/members/Nicolas.Nisse/ReportsStudents/Shelest21.pdf





5- Using semantic graph clustering to explore study and career paths, with the goal of helping student and professional counselling


Who?

Name: Nicolas Nisse and Thibaud Trolliet (startup MillionRoads)

Mail: nicolas.nisse@inria.fr

Web page: http://www-sop.inria.fr/members/Nicolas.Nisse/

Co-advisors : Frédéric Giroire (frederic.giroire@inria.fr)


Where?

Place of the project: Inria Sophia Antipolis

Address: 2004 route des Lucioles, 06902 Sophia Antipolis

Team: COATI

Web page: https://team.inria.fr/coati/


Description:

The internship will be part of a larger project involving the Inria team COATI and a startup, MillionRoads. The goal of the project is to build digital tools that help students and professionals make the right choices in their study and career orientations. Career guidance is commonly recognized by governments and institutions as a key to good professional and social integration and to the reduction of unemployment and inequality.

In particular, study and career changes are very frequent among students and may be the sign of, or may lead to, important difficulties [Gati et al (2019)]. It is thus crucial to help students either avoid such changes by guiding them to a first adequate career choice, or to support them when choosing their new path [Masdonati (2017)].

We plan to use learning techniques and graph algorithms to detect career breaks, in parallel or in addition to a semantic study. The method is to use a semantic graph to represent the millions of career paths retrieved by the startup, and then to use graph clustering for the detection. Similar paths (for example, Bac S, preparatory classes, engineering diploma, engineering work) will generate many edges between similar nodes of the graph. These nodes, which are thus closely connected to each other, will therefore form clusters. Detecting breaks in a path could therefore be achieved by (i) computing the clusters of the graph and (ii) detecting the paths that change clusters. We will then do a semantic study of the clusters and of the detected paths.
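A toy sketch of steps (i) and (ii), using networkx's greedy modularity communities as a stand-in for a scalable clustering method; the career-path data is invented for illustration:

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy career-path graph: nodes are path steps, edges link consecutive steps.
paths = [
    ["Bac S", "Prepa", "Engineering school", "Engineer"],
    ["Bac S", "Prepa", "Engineering school", "Baker"],   # an atypical last step
]
g = nx.Graph()
for p in paths:
    nx.add_path(g, p)

# (i) cluster the graph, (ii) flag paths whose consecutive steps change cluster.
communities = list(greedy_modularity_communities(g))
cluster_of = {node: i for i, c in enumerate(communities) for node in c}
for p in paths:
    breaks = [(u, v) for u, v in zip(p, p[1:]) if cluster_of[u] != cluster_of[v]]
    print(p, "->", breaks if breaks else "no cluster change")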

Scientific stakes and challenges. The career path graph is a very large graph, with several hundred million vertices (either path-step nodes or semantic nodes). Such a graph is already difficult to even store efficiently in memory. Most clustering algorithms, such as the Louvain method [Louvain method 2008], which is one of the most widely used in practice, can hardly handle more than a few hundred thousand vertices. It will therefore be crucial to find a method adequate for our graph. There are statistical-mechanics methods based on label propagation that are appropriate for some large graphs [Statistical mechanics 2006, Community detection 2010]. Another technique allowing computation on very large graphs is to decompose the graph recursively into sub-graphs, apply clustering methods to these sub-graphs, and then recompose the results [Distributed Louvain 2018].


Keywords: graph algorithms, big data, network analysis, natural language processing, semantics


References: Vincent A. Traag, Ludo Waltman, Nees Jan van Eck: From Louvain to Leiden: guaranteeing well-connected communities. CoRR abs/1810.08473 (2018)



6- Online Algorithms with Predictions


#SUPERVISORS

Name: Giovanni Neglia, Eitan Altman, and Tareq Si Salem

Mail: {firstname.familyname}@inria.fr

Telephone:

Web page: www-sop.inria.fr/members/Giovanni.Neglia/


Where?

Place of the project: Inria

Address: 2004 route des Lucioles

Team: NEO

Web page: https://team.inria.fr/neo/


#PRE-REQUISITES

The student should have good analytical skills, a solid knowledge of algorithms, and basic programming skills in Python, C or Java. A background on optimization is a plus.


#DESCRIPTION

A recent trend in networking is to apply online convex optimization [1] to design online algorithms with regret guarantees against an adversary that may arbitrarily select the input sequence. The regret is defined as the difference between the costs experienced---over the time horizon of interest T---by the online algorithm and by the optimal static solution with hindsight. If the algorithm's regret grows sublinearly with T, then the time-average cost experienced by the online algorithm asymptotically coincides with the cost of the optimal static solution, and the algorithm is said to have no regret. Online no-regret algorithms have been applied with success to caching problems [2,3].
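To make the regret definition concrete, here is a small sketch (pure Python; the request trace and cache size are invented) comparing an online LRU cache with the best static cache chosen in hindsight:

from collections import OrderedDict, Counter

requests = ["a", "b", "a", "c", "a", "b", "d", "a"]   # illustrative trace
k = 2                                                  # cache capacity

# Cost of an online LRU policy: 1 per miss.
cache, lru_cost = OrderedDict(), 0
for r in requests:
    if r in cache:
        cache.move_to_end(r)
    else:
        lru_cost += 1
        cache[r] = True
        if len(cache) > k:
            cache.popitem(last=False)

# Cost of the best static cache with hindsight: keep the k most requested items.
top_k = {item for item, _ in Counter(requests).most_common(k)}
static_cost = sum(1 for r in requests if r not in top_k)

print("regret over T =", len(requests), ":", lru_cost - static_cost)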

Some no-regret algorithms can also exploit available predictions about the future input sequence (e.g., the future content requests in a caching network); such predictions may be provided, for example, by machine learning models trained on historical data. In the best cases, these algorithms enjoy the same regret achievable in the absence of any prediction when predictions are unreliable, and a smaller regret as the quality of the predictions improves.

The goal of this internship is to adapt a specific no-regret online algorithm (Follow-the-Regularized-Leader [4]) to caching problems. The student will need to implement and test the algorithm, but also to work on deriving the regret guarantees of the proposed caching algorithm.

[1] E. Hazan, “Introduction to online convex optimization,” Found. Trends Optim., vol. 2, no. 3–4, p. 157–325, Aug. 2016.

[2] G. S. Paschos, A. Destounis, L. Vigneri, and G. Iosifidis, “Learning to cache with no regrets,” in IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, 2019, pp. 235–243.

[3] T. Si Salem, G.Neglia, and S.Ioannidis,“No-Regret Caching via Online Mirror Descent,” in Proc. of IEEE ICC, 2021

[4] H. B. McMahan, “A survey of algorithms and analysis for adaptive online learning,” J. of Machine Learning Res., vol. 18, pp. 1–50, 2017.


7- Federated Learning for IoT Devices


#SUPERVISORS

Name: Giovanni Neglia, Alain Jean-Marie, Othmane Marfoq

Mail: {firstname.familyname}@inria.fr

Telephone:

Web page: www-sop.inria.fr/members/Giovanni.Neglia/


Where?

Place of the project: Inria

Address: 2004 route des Lucioles

Team: NEO

Web page: https://team.inria.fr/neo/


#PRE-REQUISITES

The student should have good programming skills in Python to be able to run experiments with PyTorch. The student should also have good analytical skills and enjoy mathematical reasoning along the lines of the course "Machine Learning: Theory and Algorithms."


# DESCRIPTION

Federated learning (FL) “involves training statistical models over remote devices or siloed data centers, such as mobile phones or hospitals, while keeping data localized” [1], because of privacy concerns or limited communication resources. FL is at the core of many machine learning models running on smartphones, like the Google keyboard [2]. FL algorithms (e.g., FedAvg [3]) train a common machine learning model through multiple rounds: at each round a central orchestrator sends the current model to (a subset of) the clients, each client updates the model using its own local dataset to compute a stochastic gradient and then sends the updated model to the orchestrator, where all updates are aggregated.

Most FL algorithms assume that the clients' local datasets do not change over time. In the case of IoT devices, new data is continuously generated and a device may be able to store only a part of it. As a result, some devices with high data generation rates and/or low storage capacity may have completely different datasets from one round to the other, while others may have their local datasets (almost) unchanged. Intuitively, when the orchestrator aggregates the updates, it should give higher weight to devices with fresher data than to devices reusing old data.
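As a very rough illustration of this idea, here is a sketch of a FedAvg-style aggregation in which each client's update is weighted by its number of fresh samples; the weighting rule is just one possible heuristic, not an established algorithm, and the numbers are invented:

import numpy as np

# Hypothetical client updates: model parameters (flattened) and fresh-sample counts.
client_models = [np.array([0.1, 0.2]), np.array([0.3, 0.0]), np.array([0.2, 0.4])]
fresh_samples = np.array([100, 10, 50])   # data generated since the previous round

# FedAvg-style weighted aggregation, giving more weight to clients with fresher data.
weights = fresh_samples / fresh_samples.sum()
global_model = sum(w * m for w, m in zip(weights, client_models))
print(global_model)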

The first goal of this internship is to overview related work on FL [4,5,6], on combinations of online and batch learning [7,8,9], and on the general framework of online-to-batch conversion [10, chapter 5]. The student may find useful the surveys on online learning [11,12] and on FL [1,13]. The second goal is to propose and test some heuristics for FL in the scenario of interest, using a PyTorch framework developed in the NEO team. Finally, the third goal is to derive generalization bounds for the proposed algorithms.

# REFERENCES

[1] Tian Li, et al. "Federated learning: Challenges, methods, and future directions." IEEE Signal Processing Magazine 37.3 (2020): 50-60.

[2] Andrew Hard et al. "Federated Learning for Mobile Keyboard Prediction." arXiv preprint arXiv:1811.03604 (2018).

[3] Jakub Konečný, et al. "Federated optimization: Distributed machine learning for on-device intelligence." arXiv preprint arXiv:1610.02527 (2016).

[4] Fernando Casado et al. "Federated and continual learning for classification tasks in a society of devices." arXiv preprint arXiv:2006.07129 (2020).

[5] Anastasiia Usmanova et al. "A distillation-based approach integrating continual learning and federated learning for pervasive services." arXiv preprint arXiv:2109.04197 (2021).

[6] Sannara Ek et al. "Evaluation of Federated Learning Aggregation Algorithms: Application to Human Activity Recognition." UbiComp-ISWC '20.

[7] A. Agarwal et al. "A Reliable Effective Terascale Linear Learning System." J. Mach. Learn. Res. 15.1 (Jan. 2014), pp. 1111–113.

[8] O. Chapelle et al. "Simple and Scalable Response Prediction for Display Advertising." ACM Trans. Intell. Syst. Technol. 5.4 (Dec. 2014), 61:1–61:34.

[9] H. B. McMahan et al. "Ad Click Prediction: A View from the Trenches." In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '13), 2013, pp. 1222–1230.

[10] Shai Shalev-Shwartz. "Online Learning and Online Convex Optimization." Foundations and Trends in Machine Learning, Vol. 4, No. 2 (2011), 107–194.

[11] H. Brendan McMahan. "A Survey of Algorithms and Analysis for Adaptive Online Learning."

[12] Steven C.H. Hoi et al. "Online learning: A comprehensive survey." Neurocomputing, Volume 459, 2021, Pages 249-289.

[13] Peter Kairouz et al. "Advances and Open Problems in Federated Learning." Foundations and Trends® in Machine Learning, Vol. 14, Issue 1–2.


8- Large scale study of the digital footprint of Internet Users and Internet Streaming Applications

Advisors:

Dino Lopez Pacheco <dino.lopez@univ-cotedazur.fr>, http://www.i3s.unice.fr/~lopezpac/

Guillaume Urvoy-Keller <urvoy@univ-cotedazur.fr>, http://www.i3s.unice.fr/~urvoy/

Place of the project: I3S Lab

Address: 2000, route des Lucioles 06903 Biot

Team: SigNet

Web page: http://signet.i3s.unice.fr/


Description :

The digital footprint of the Internet (including data centers and intranets), from the cloud to the end user with all intermediate ISP networks, represents 4% of the world's greenhouse gas (GHG) emissions [Andrae]. Forecasts for the coming years estimate that it could reach 8%, which is equivalent to the total worldwide road traffic.

In France, think-tanks like the Shift Project [Shift1] and official organizations like ARCEP [ARCEP] have tackled this issue, producing reports, e.g. on the crucial impact of video traffic [Shift2].

In the SigNet team, we aim at evaluating the digital footprint of end users. We have devised a tool, following the idea of the Carbonalyser [Carbo] Web plugin (also available as a mobile app). The current version of the tool works on Unix machines. It captures the user's traffic with minimal impact on the user experience and determines the networks (national, European, international) that conveyed this traffic. The number of bytes is then converted, for each network in each country, into a GHG emission as a function of the way electricity is produced in that country. Summary results are then consolidated and regularly sent to a centralized server.
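A minimal sketch of the bytes-to-GHG conversion performed on the server side; the energy-intensity and carbon-intensity figures below are placeholders, not the values actually used by the tool:

# Placeholder factors: network energy intensity (kWh per GB) and per-country
# carbon intensity of electricity (gCO2e per kWh). Real values come from the
# literature, e.g. [Coroama], and from per-country electricity-mix statistics.
ENERGY_KWH_PER_GB = 0.1
CARBON_G_PER_KWH = {"FR": 60, "DE": 350, "US": 400}

def ghg_grams(bytes_sent, country):
    gigabytes = bytes_sent / 1e9
    return gigabytes * ENERGY_KWH_PER_GB * CARBON_G_PER_KWH[country]

traffic = {"FR": 2.5e9, "DE": 0.8e9}   # bytes conveyed through each country's networks
print({c: round(ghg_grams(b, c), 1) for c, b in traffic.items()})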

The objectives of this Master internship are to:

* Get familiar with the state of the art of the domain and the tool already developed at I3S.

* Develop a Windows version of the client;

* Extend the server-side data visualization capabilities of the tool. As we envision recruiting a large community of users, we need an automatic method for user registration and a way for each user to visualize their GHG footprint on the server through a dedicated Web page.

* Refine the energetic models that relate the number of exchanged bytes to the electrical consumption [Coroama];

* Carry out longitudinal studies of specific users and mine the results for specific streaming applications that are heavily used in the current health context. Indeed, each service (Zoom with or without institutional access, BBB, Jitsi, etc.) will have a GHG footprint that differs depending on its network efficiency (throughput of the codec) and the location of its servers.

Pre-requisites: Python and client-server programming; computer networks.


References:

[Andrae] Andrae, Anders SG, and Tomas Edler. "On global electricity usage of communication technology: trends to 2030." /Challenges/ 6.1 (2015): 117-157.

[Shift1] https://theshiftproject.org/article/pour-une-sobriete-numerique-rapport-shift/

[Shift2] https://theshiftproject.org/article/climat-insoutenable-usage-video/

[ARCEP] https://www.arcep.fr/uploads/tx_gspublication/reseaux-du-futur-empreinte-carbone-numerique-juillet2019.pdf

[Carbo] https://theshiftproject.org/carbonalyser-extension-navigateur/

[Coroama] Coroama, Vlad C., and Lorenz M. Hilty. "Assessing Internet energy intensity: A review of methods and results." Environmental impact assessment review 45 (2014): 63-68.



9- Leveraging the wealth of data available in the browser for network monitoring and troubleshooting


Who?

Name: Chadi Barakat (Inria, Diana project-team) and Yassine Hadjadj-Aoul (Inria, Dionysos project-team)

Mail: Chadi.Barakat@inria.fr, yhadjadj@irisa.fr

Telephone: +33 (0) 4 92 38 75 96, +33 (0) 2 99 84 71 35

Web page: https://team.inria.fr/diana/chadi/, http://people.irisa.fr/Yassine.Hadjadj-Aoul/


Where?

Place of the project: Inria Sophia Antipolis

Address: 2004, route des Lucioles, 06902 Sophia Antipolis, France

Team: Diana

Web page: https://team.inria.fr/diana/


What?

Pre-requisites if any: Standard knowledge in web and network programming


Detailed description:


Context - Despite the considerable improvement in Internet access performance and in the quality of the physical and virtualised infrastructures hosting Internet services, we still face situations where the Internet service degrades and the end-user Quality of Experience (QoE) is lower than expected. The reasons are many: the slowness of the user's device, a bad configuration of the WiFi at home, interference caused by neighbouring WiFi networks, saturation of the access link by the many devices and applications running at home, congestion in the ISP network (especially on its peering links), or overload of the content providers' servers following a sudden increase in user activity. There are also situations where the QoE degrades for reasons other than congestion or lack of resources, as when the ISP or the content provider decides to reduce the quality of its service to prioritise some part of the traffic over the rest (a.k.a. network traffic differentiation), or to face scenarios of heavy service usage (e.g., video resolution reduction by major video streaming platforms during the confinement period). These situations, and many others, exist today and will not disappear in the near future despite the considerable advances seen and foreseen at both the network and cloud levels. The problem is not only the frustration they cause for the end user, but also the difficulty, for the end user, to distinguish between them and to take the appropriate actions against their causes whenever possible.

A long list of solutions and tools have been proposed over the years to shed light on some of these problems, e.g., SpeedTest, MobiPerf, RTR-NetTest, ACQUA, the WiFi analyzer of MS Windows, the WiFi scanner of Apple, QoE Doctor. These tools and many others contribute to answering the questions of end users and of network and service providers, but on the one hand they are limited to the specific problem for which they were designed, thus requiring the user to install and master each of them, and on the other hand an important part of them are intrusive, requiring the installation of a set of tools and the injection of traffic into the network, which overloads a network that is already loaded at the moment of the problem.

In this project, we want to explore a new approach consisting in leveraging the wealth of data available in the end user's device as a result of her/his normal activity. These data, collected at almost no cost, are shaped by what is going on inside the network, in the device of the user, and on the side of the service provider. For example, congestion will result in an increase in packet delay and packet loss rate and in a drop in network throughput. Saturation of the device shows up as a high CPU load and/or memory usage, whereas saturation of the server on the other side results in part of the traffic exchanged with this server being considerably delayed while the rest of the traffic behaves normally. All these signatures and others exist together, and the challenge is to identify and isolate them from each other and to build appropriate classifiers using machine learning techniques, in an effort to understand the origins of any problem causing a drop in Internet service performance and end-user Quality of Experience.

Internship objectives - This internship will address this problem by focusing on web browsing as a service and leveraging the wealth of measurements that can be collected from within the browser while browsing the Web. A long list of these measurements is available in our browsers, such as the time to connect to the server, the time to download the DOM (Document Object Model) describing the page, the time to load the page, and so on. Information on the web page itself is also available, such as the number of objects and the size of each object. In addition, other information regarding the machine itself, such as its CPU load, memory usage, and network connection technology, can also be available. We believe all this information, studied together, can help shed light on the performance of the underlying network and point to the origins of any service degradation.
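As a feel for the kind of collection involved, here is a small sketch that uses Selenium to read the browser's Navigation Timing data; the planned web extension would read the same API directly, and the target URL is arbitrary:

from selenium import webdriver

driver = webdriver.Firefox()          # assumes geckodriver is installed
driver.get("https://www.example.org")

# Navigation Timing (Level 1) attributes exposed by the browser.
t = driver.execute_script("return window.performance.timing.toJSON()")
metrics = {
    "connect_ms": t["connectEnd"] - t["connectStart"],
    "dom_ready_ms": t["domContentLoadedEventEnd"] - t["navigationStart"],
    "page_load_ms": t["loadEventEnd"] - t["navigationStart"],
}
print(metrics)
driver.quit()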

The internship will start by reviewing the state of the art in the domain and establishing the list of measurements that can be collected from within the browser, and the information that can be extracted from these measurements. In a first phase, the candidate will set up a web extension to collect these measurements in the wild and to categorize them based on the aspects they cover (device, network, etc.). Based on this web extension, the candidate will run experiments to crawl the web under controlled network conditions where the ground truth about the network performance is known. The next phase of the internship will then consist in analyzing the obtained data towards a solid understanding of the link that exists between network performance and browser-level measurements. With this analysis, we aim at: (i) building estimators of network performance from browser-level measurements, (ii) identifying the most relevant browser-level measurements to consider for the purpose of network performance monitoring, and (iii) in case of network performance problems, building classifiers able to estimate the type of the encountered problem from the observed behavior at the browser level. Our final objective is thus to prove the feasibility of the data-driven network monitoring approach on the web case, and to transform the web extension into a lightweight, browser-level, non-intrusive network monitoring and troubleshooting tool.

References:

1- Imane Taibi, Yassine Hadjadj-Aoul, Chadi Barakat, “Data Driven Network Performance Inference From Within The Browser“, in proceedings of the 12th IEEE Workshop on Performance Evaluation of Communications in Distributed Systems and Web based Service Architectures (PEDISWESA), Rennes, July 2020.

2- Imane Taibi, Yassine Hadjadj-Aoul, Chadi Barakat, “When Deep Learning meets Web Measurements to infer Network Performance“, in proceedings of the IEEE 17th Annual Consumer Communications & Networking Conference (CCNC), Las Vegas, January 2020.

3- Mohan Dhawan, Justin Samuel, Renata Teixeira, Christian Kreibich, Mark Allman, Nicholas Weaver, and Vern Paxson. 2012. Fathom: a browser-based network measurement platform. In Proceedings of the 2012 Internet Measurement Conference (IMC '12). Association for Computing Machinery, New York, NY, USA, 73–86.

4- A. Huet, Z. B. Houidi, B. Mathieu and D. Rossi, "Detecting Degradation of Web Browsing Quality of Experience," 2020 16th International Conference on Network and Service Management (CNSM), 2020.

5- Enrico Bocchi, Luca De Cicco, and Dario Rossi. 2016. Measuring the Quality of Experience of Web users. SIGCOMM Comput. Commun. Rev. 46, 4 (October 2016).

6- Ashkan Nikravesh, Hongyi Yao, Shichang Xu, David Choffnes, and Z. Morley Mao. 2015. Mobilyzer: An Open Platform for Controllable Mobile Network Measurements. In Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys '15).

7- Loïck Bonniot, Christoph Neumann, and François Taïani. 2020. DiagSys: network and third-party web-service monitoring from the browser's perspective (industry track). In Proceedings of the 21st International Middleware Conference Industrial Track (Middleware '20).


10- Could we get rid of 5G with WiFi?


Who?

Name: Arnaud Legout / Damien Saucez

Mail: arnaud.legout@inria.fr / damien.saucez@inria.fr

Telephone: +33 4 92 38 78 15 / +33 4 89 73 24 18

Web page: http://www-sop.inria.fr/members/Arnaud.Legout/

https://team.inria.fr/diana/team-members/damien-saucez/


Where?

Place of the project: Inria Sophia Antipolis

Address: 2004 route des Lucioles

Team: DIANA

Web page: https://team.inria.fr/diana/

Pre-requisites if any: Python; machine learning is a plus; knowledge of layer 1 and layer 2 protocols is a plus.



Description:

More than ever, we need continuous high-bandwidth, low-latency Internet connectivity, even when we are on the move. 5G has been proposed to offer such connectivity. Unfortunately, the adoption of the technology takes time, as it requires operators to deploy new antennas and end users to buy new equipment.

In this internship we will investigate whether, in dense urban areas, current WiFi deployments could be used instead of 5G to allow a faster deployment of high-quality wireless networks. In the general case, WiFi is certainly not the right technology to achieve what cellular networks propose, and more specifically what 5G promises. However, in high-density urban areas it is believed that WiFi access points are everywhere and could perhaps provide great connectivity without requiring new expensive deployments and investments in new end-user equipment.

The work opens many questions, both on the technological side and on the social side.

From a technological standpoint, we have to determine how WiFi and cellular signals compare, since they have been designed for contrasting objectives. If salient features are not directly available in WiFi, we will propose solutions to improve WiFi so that it could be a solid competitor to 5G, at least in dense areas.

From a social point of view, we have to validate our claim that densely populated areas provide WiFi coverage. To answer this question we have full access to the ElectroSmart project dataset. The ElectroSmart project has continuously collected, during the past 5 years, hundreds of millions of Wi-Fi and cellular signals worldwide. Having access to this unique dataset gives the opportunity to understand what kind of WiFi and cellular signals people really encounter throughout their day. Even though the structure of the ElectroSmart dataset is relatively simple, its scale and the potential noise it contains make it particularly challenging to work with, particularly when information must be combined to answer complex questions such as the ones we want to address in this internship.
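As a first feel for this social-side question, a toy sketch (pandas; the scan records and the grid resolution are invented, not the ElectroSmart schema) counting distinct WiFi access points per coarse geographic cell:

import pandas as pd

# Hypothetical scan records: one row per (location, observed access point).
scans = pd.DataFrame({
    "lat":   [43.615, 43.616, 43.700, 43.615],
    "lon":   [7.072, 7.071, 7.268, 7.072],
    "bssid": ["ap-01", "ap-02", "ap-03", "ap-01"],
})

# Coarse grid of roughly 100 m cells (0.001 degree); count distinct APs per cell.
scans["cell"] = list(zip((scans.lat * 1000).round(), (scans.lon * 1000).round()))
density = scans.groupby("cell")["bssid"].nunique().sort_values(ascending=False)
print(density)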

Ultimately, this internship offers the student the opportunity to work on low-level networking aspects (L1 and L2), to crunch a large, unique real-world dataset with complex queries, and also to understand social behaviour, in order to see how current technology can be leveraged to fit the real-life movements of users.


11- Quantum Entanglement Routing

Advisor: APARICIO PARDO Ramon, Frédéric GIROIRE

Mail: raparicio@i3s.unice.fr

Telephone: 04 92 38 77 72

Web page: http://www.i3s.unice.fr/~raparicio/


Place: Centre INRIA d’Université Côte d'Azur

Address: 2004 Route des Lucioles, 06902 Valbonne

Team: COATI/SIGNET

Web page: https://team.inria.fr/coati/

http://signet.i3s.unice.fr


Pre-requisites if any:

Languages:

- Python (required)

- Deep Learning libraries (e.g. TensorFlow, Keras, rllab, OpenAI Gym)

Theory:

- Machine Learning, Data Science, particularly Neural Networks theory

- Classical optimisation theory (Linear Programming, Dual Optimisation, Gradient Optimisation, Combinatorial Optimization)

Technology:

- Computer networking notions

- Quantum information (not mandatory but it helps)


Description:

In the long term, Quantum Communications promise to connect quantum processors placed at remote locations, giving rise to a Quantum Cloud able to perform very complicated computation tasks in much shorter processing times. In the short term, Quantum Communications are applied to tasks such as cryptographic key distribution or clock synchronization [1]. In both cases, the basic "operation" to carry out as a first step is "to quantum entangle" the end nodes of the communication. To do this, we first need to find a sequence of links (a path) connecting the end nodes; second, to entangle the two nodes connected by each link; and finally, to entangle the end-to-end path. Unfortunately, this is a probabilistic process whose result cannot be foreseen. In this project, we aim to study this so-called Quantum Entanglement Routing problem.
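To give a feel for this probabilistic process, here is a small Monte Carlo sketch (pure Python); the per-link entanglement probability p and the swap probability q are illustrative parameters, not measured values:

import random

def end_to_end_success(path_length, p=0.5, q=0.9):
    """One attempt: every link must get entangled (probability p each), then every
    intermediate node must perform a successful entanglement swap (probability q each)."""
    links_ok = all(random.random() < p for _ in range(path_length))
    swaps_ok = all(random.random() < q for _ in range(path_length - 1))
    return links_ok and swaps_ok

trials = 100_000
for length in (1, 2, 3, 4):
    rate = sum(end_to_end_success(length) for _ in range(trials)) / trials
    print(f"path of {length} link(s): empirical success rate = {rate:.3f}")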

In a first phase, we will review the literature [2-5] and identify the most relevant entanglement routing algorithms in order to compare them. In a second phase, we will develop our own proposal to tackle this problem.


Gratification

- 600-euro monthly, 6 months


Outcomes

- A publication plus a repository with the code.


Useful Information/Bibliography:

[1] "Quantum Networks: From a Physics Experiment to a Quantum Network System" with Stephanie Wehner

: https://www.youtube.com/watch?v=yD193ZPjMFE

[2] M. Pant et al., “Routing entanglement in the quantum internet,” npj Quantum Inf, vol. 5, no. 1, pp. 1–9, Mar. 2019, doi: 10.1038/s41534-019-0139-x.

[3] K. Chakraborty, F. Rozpedek, A. Dahlberg, and S. Wehner, “Distributed Routing in a Quantum Internet,” arXiv:1907.11630 [quant-ph], Jul. 2019, Accessed: Sep. 16, 2021. [Online]. Available: http://arxiv.org/abs/1907.11630

[4] S. Shi and C. Qian, “Concurrent Entanglement Routing for Quantum Networks: Model and Designs,” in Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication, New York, NY, USA, Jul. 2020, pp. 62–75. doi: 10.1145/3387514.3405853.

[5] C. Li, T. Li, Y.-X. Liu, and P. Cappellaro, “Effective routing design for remote entanglement generation on quantum networks,” npj Quantum Inf, vol. 7, no. 1, pp. 1–12, Jan. 2021, doi: 10.1038/s41534-020-00344-4.



12- Modelling topological heterogeneous systems.

Who?

Name: Eric Madelaine

Mail: eric.madelaine@inria.fr

Telephone: 0787479980

Web page: http://www-sop.inria.fr/oasis/Eric.Madelaine/


Where?

Place of the project: INRIA

Address: 2004 rte des Lucioles, Sophia-Antipolis

Team: Kairos

Web page: https://team.inria.fr/kairos/


What?

Pre-requisites if any:

Development with the Eclipse framework; taste for formal approaches.


Detailed description:

1) Context


As the impact of heterogeneous computing is rapidly increasing, specialized environments and tools are needed to efficiently evaluate developed applications. In order to analyze and to guarantee the proper functioning of a distributed application in a heterogeneous cloud infrastructure, we propose an environment for early architecture design with mathematical models called pNets.

In a previous version of our software, we developed methods and tools upon pNets, which are finite behavioral models for the description of distributed applications communicating asynchronously. These allow the analysis of complex systems with parametric topology (typically rings, pipelines or process trees), and their implementation both at the level of software architecture description languages and at the level of verification platforms. pNets are a generalization of automata that model open systems. They feature hierarchical composition, allowing for a significant reduction of state spaces.
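For intuition only, here is a toy sketch of the kind of synchronized composition that pNets generalize: two small labelled transition systems composed through an explicit synchronization vector. This is a didactic product construction, not the pNets formalism itself:

from itertools import product

# Two toy labelled transition systems: state -> {action: next_state}.
producer = {"idle": {"put": "idle"}}
consumer = {"wait": {"get": "wait"}}

# Synchronization vector: the global action "transfer" requires "put" and "get" together.
sync = {"transfer": ("put", "get")}

def compose(lts1, lts2, sync):
    """Synchronized product restricted to the actions listed in the sync vectors."""
    transitions = {}
    for (s1, s2), (label, (a1, a2)) in product(product(lts1, lts2), sync.items()):
        if a1 in lts1[s1] and a2 in lts2[s2]:
            transitions[(s1, s2, label)] = (lts1[s1][a1], lts2[s2][a2])
    return transitions

print(compose(producer, consumer, sync))
# {('idle', 'wait', 'transfer'): ('idle', 'wait')}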

More recently we developed a new methodology extending this work:

- by using symbolic models for explicit data manipulation, to overcome the finiteness limitations;

- by allowing the description of open systems, to enable compositional verification methods.

Despite this extension, simply expressing parametrized topologies (vectors, rows, rings, matrices, etc.) that are very useful in real applications is still very difficult. Expressing them requires concretely extending the pNets model itself, by introducing a notion of topological parameters and spatial structures. Doing so requires enriching the automata description formalism, but also the synchronization mechanism.


2) Objectives of the Internship

The goal of this internship is precisely to design and define this revision to deal with parameterized topologies. Parametrized structures would also be hierarchical structures whose synchronization mechanisms carry the necessary information to combine and synchronize different elements of the structure, typically through logical configurations. More specifically, their behavioral semantics would be expressed in terms of open automata.

The first concrete step will be to design a syntax allowing to define the relations (communication, synchronization) between the elements of the system. A second step will be to implement the designed formalism, and to determine the impact of these extensions on the notions of equivalence and model-checking of pNets. Finally, we will evaluate our design on examples and simple use-cases.


References:

[Rouini2010] Amine Rouini, "Parametric Component Topologies: language extension and implementation", Ubinet master internship report, 2010. https://www.dropbox.com/s/womaqldgm9os838/Report-AmineRouini-Ubinet2010.pdf?dl=0

[Forte'16] Ludovic Henrio, Eric Madelaine, Min Zhang, "A Theory for the Composition of Concurrent Processes - Extended version". https://hal.inria.fr/hal-01299562v1

[ScienceChina'16] Eric Madelaine, Min Zhang, "Towards a bisimulation theory for open synchronized networks of automata". https://hal.inria.fr/hal-01417652


13- Network simulator for reinforcement learning


Advisor: APARICIO PARDO Ramon, Hicham Lesfari

Mail: raparicio@i3s.unice.fr

Telephone: 04 92 38 77 72

Web page: http://www.i3s.unice.fr/~raparicio/


Place: Centre INRIA d’Université Côte d'Azur

Address: 2004 Route des Lucioles, 06902 Valbonne

Team: COATI/SIGNET

Web page: https://team.inria.fr/coati/

http://signet.i3s.unice.fr


Pre-requisites if any:

- An object-oriented language, ideally C++, but knowledge of Java can also be sufficient.

- Python language (recommended)

- Notions of networks (and their protocols)

- Notions of computer simulations, in particular discrete event simulations.


Description:

In reinforcement learning [1], the training process is guided by interactions with an environment. The mechanics are simple. During training, the learning algorithm makes a decision which is sent to the environment. The environment, in turn, processes the decision, changing its internal state and providing the algorithm with an assessment of the quality of the decision (a reward). The environment can be a real system (a robotic arm) or a simulation of that system (a simulation of a robotic arm). Within the framework of this internship, the target environment is a network of computers sending packets to each other. Since you are not a telecommunications network operator, you are forced to rely on simulation.

The objective of this internship is to design a packet network simulator that can be used as an environment for reinforcement learning.

The intended platform and library for this task is the NS3 network simulator [2], which is programmed in C++. NS3 has two modules, “ns3-gym” [3-4] and “ns3-ai” [5-6], allowing the integration of the NS3 network simulator with “OpenAI Gym” [7-8], a popular Python reinforcement learning toolkit.
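A minimal sketch of the Gym-style interaction loop that the simulator will have to expose, shown here with a standard Gym toy environment and random actions; in the internship the environment would instead be backed by NS3 through ns3-gym or ns3-ai, and the exact reset/step return values vary slightly across Gym versions:

import gym

# Stand-in environment; the internship would plug in an ns-3 backed environment here.
env = gym.make("CartPole-v1")

obs = env.reset()
total_reward, done = 0.0, False
while not done:
    action = env.action_space.sample()          # decision taken by the learning algorithm
    obs, reward, done, info = env.step(action)  # the environment processes it and returns a reward
    total_reward += reward
print("episode return:", total_reward)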


Gratification

- 600-euro monthly, 6 months

Outcomes

- A publication plus a repository with the code.

Useful Information/Bibliography:


[1] Lil Log, Reinforcement Learning: https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-learning.html

[2] NS-3 Simulator. [Online]. Available in: https://www.nsnam.org/. Accessed on: 26th Nov. 2021.

[3] Piotr Gawłowicz and Anatolij Zubow. 2019. Ns-3 meets OpenAI Gym: The Playground for Machine Learning in Networking Research. In Proceedings of the 22nd International ACM Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWIM '19). Association for Computing Machinery, New York, NY, USA, 113–120

[4] ns3-gym module. [Online]. Available in: https://apps.nsnam.org/app/ns3-gym/. Accessed on: 26th Nov. 2021.

[5] Hao Yin, Pengyu Liu, Keshu Liu, Liu Cao, Lytianyang Zhang, Yayu Gao, and Xiaojun Hei. 2020. Ns3-ai: Fostering Artificial Intelligence Algorithms for Networking Research. In Proceedings of the 2020 Workshop on ns-3 (WNS3 2020). Association for Computing Machinery, New York, NY, USA, 57–64.


[6] ns3-ai module. [Online]. Available in: https://apps.nsnam.org/app/ns3-ai/. Accessed on: 26th Nov. 2021.

[7] Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. 2016. OpenAI Gym. CoRR (2016).


[8] OpenAI Gym. [Online]. Available in: https://gym.openai.com/. Accessed on: 26th Nov. 2021.





14- Deep Reinforcement Learning for Video caching


Advisor: APARICIO PARDO Ramon, Stéphane Pérennes

Mail: raparicio@i3s.unice.fr

Telephone: 04 92 38 77 72


Web page: http://www.i3s.unice.fr/~raparicio/


Place: Centre INRIA d’Université Côte d'Azur

Address: 2004 Route des Lucioles, 06902 Valbonne

Team: COATI/SIGNET

Web page: https://team.inria.fr/coati/

http://signet.i3s.unice.fr


Pre-requisites if any:

Languages:

- Python (required)

- Deep Learning libraries (e.g. TensorFlow, Keras, rllab, OpenAI Gym) appreciated

Theory:

- Machine Learning and Data Science, particularly Neural Networks theory (highly recommended)

- Classical optimisation theory (Linear Programming, Dual Optimisation, Gradient Optimisation, Combinatorial Optimization) appreciated

Technology:

- Computer networking notions are welcome, but they are not necessary.


Description:

The application of novel machine learning techniques, such as deep reinforcement learning [1], has gained the attention of the computer networking community in recent years [2]. One of the problems that has been addressed is the caching of video content at locations close to the users [3]. Caching has a twofold objective: to improve the experience of all users regardless of their location, and to reduce the traffic load on the backbone networks.

Dai et al. [4] and Mittal et al. [5] have shown the interest of deep reinforcement learning for learning heuristic algorithms that solve some classical NP-hard problems on graphs, by combining Reinforcement Learning (RL) with Graph Embedding (GE) [6], a kind of representation learning applied to graphs. GE produces a more compact, lower-dimensional graph representation in which the RL scheme can solve the optimization problem more easily. The work of Mittal et al. [5] is particularly interesting because it addresses the set cover problem with this approach, proposing the GCOMB algorithm. Caching decisions can be reformulated as set cover problems.
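As a point of comparison for GCOMB, the classical greedy baseline is easy to state; here is a toy sketch (pure Python, invented demand data) choosing which videos to cache so as to cover as many users' demands as possible within the cache capacity:

# Toy instance: video -> set of users requesting it; cache capacity k.
demand = {
    "v1": {"u1", "u2", "u3"},
    "v2": {"u3", "u4"},
    "v3": {"u5"},
    "v4": {"u1", "u4", "u5"},
}
k = 2

# Classical greedy (maximum-coverage flavour of set cover): repeatedly cache the
# video that covers the most still-uncovered users.
covered, cache = set(), []
for _ in range(k):
    best = max((v for v in demand if v not in cache),
               key=lambda v: len(demand[v] - covered))
    cache.append(best)
    covered |= demand[best]

print("cached:", cache, "covered users:", sorted(covered))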


Steps:

Phase 1: Get familiar with the bibliography and with the code of the GCOMB algorithm [7].

Phase 2: Prepare the dataset to be employed: Trending YouTube Video Statistics [8].

Phase 3: Apply the GCOMB algorithm to this dataset to solve the caching problem.

Phase 4: Benchmark GCOMB against other classical set cover algorithms on this caching problem.


Goal:

To apply the approach of Mittal et al. [5] to select the “best” videos to cache, using as input the local distribution of the content popularity.

Gratification

- 600-euro monthly, 6 months

Outcomes

- A publication plus a repository with the code.


Useful Information/Bibliography:

[1] Lil Log, Reinforcement Learning: https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-learning.html

[2] N. Vesselinova, R. Steinert, D. F. Perez-Ramirez and M. Boman, "Learning Combinatorial Optimization on Graphs: A Survey With Applications to Networking," in IEEE Access, vol. 8, pp. 120388-120416, 2020, doi: 10.1109/ACCESS.2020.3004964. [Online]: https://arxiv.org/abs/2005.11081

[3] Y. Wang and V. Friderikos, “A Survey of Deep Learning for Data Caching in Edge Network,” Informatics, vol. 7, no. 4, p. 43, Oct. 2020. [Online]: https://www.mdpi.com/2227-9709/7/4/43

[4] H. Dai, E. Khalil, Y. Zhang, B. Dilkina, and L. Song, “Learning combinatorial optimization algorithms over graphs,” in Proc. Advances in Neural Inf. Process. Syst. (NeurIPS), 2017, pp. 6348–6358. [Online]: https://arxiv.org/abs/1704.01665

[5] A. Mittal, A. Dhawan, S. Manchanda, S. Medya, S. Ranu, and A. Singh, “Learning heuristics over large graphs via deep reinforcement learning,” 2019. [Online]. Available: arXiv:1903.03332

[6] W. L. Hamilton, R. Ying and J. Leskovec. Representation Learning on Graphs: Methods and Applications. arXiv:1709.05584, Apr. 2018.

[7] GCOMB code: https://github.com/idea-iitd/GCOMB

[8] Trending YouTube Video Statistics: https://www.kaggle.com/datasnaek/youtube-new


15- Impact of Heavy Memory Overbooking on VMs

Advisors:

Dino Lopez Pacheco <dino.lopez@univ-cotedazur.fr>, http://www.i3s.unice.fr/~lopezpac/

Guillaume Urvoy-Keller <urvoy@univ-cotedazur.fr>, http://www.i3s.unice.fr/~urvoy/

Place of the project: I3S Lab

Address: 2000, route des Lucioles 06903 Biot

Team: SigNet

Web page: http://signet.i3s.unice.fr/

Description

Introduction

Virtualization is the main building block of current data centers. By deploying multiple virtual machines (VMs), virtualization optimizes the utilization of physical resources in servers with large capacities. Sharing some resources, like network interface cards (NICs) or CPUs, between multiple virtual instances does not create major problems: e.g., a single core can easily be shared between several VMs, as servers usually run at between 5 and 10% of CPU utilization, and NICs frequently feature multiple ports with a bandwidth larger than 1 Gbps per port.

However, sharing the memory is a pretty challenging task. Indeed, sharing a single region of memory between several VMs requires eviction-like mechanisms to replace the memory content of one VM with the data generated by another one, and to store the evicted data somewhere else.

Hence, a region of memory is frequently assigned to one single VM only. Since most operating systems need at least 1 GB of memory, and due to the large number of VMs that need to be deployed, memory is a scarce resource in data centers.


Memory reclaim techniques

To optimize memory utilization, VMs can be smartly placed to minimize the portion of unassigned memory in servers; e.g., a server with 4 GB of free memory could never be used if the VMs to be deployed all need at least 8 GB.

To reduce a VM's memory footprint, ballooning and/or swapping can be employed to reclaim VM memory [1]. In ballooning, the guest inflates a “balloon” to lock some memory inside the VM, which is then made available to the host (i.e. the hypervisor). Swapping is another technique to reclaim VM memory, where the hypervisor shrinks the memory allocated to the VM, hence swapping out the memory of the VM beyond a given limit. Recently, VMware also implemented a Transparent Page Sharing solution, where VM memory pages with the same content are shared across multiple VMs (a.k.a. page deduplication) [1].
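For concreteness, a small sketch of driving the balloon from the host with the libvirt Python bindings; the domain name and the target size are examples, and this assumes a running KVM guest with the virtio balloon driver:

import libvirt

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("test-vm")    # example domain name

# Inflate the balloon by lowering the guest's current memory target (values in KiB).
print("before:", dom.info())          # [state, maxMem, memory, nrVirtCpu, cpuTime]
dom.setMemory(512 * 1024)             # ask the balloon driver to reclaim down to 512 MiB
print("after :", dom.info())

conn.close()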

In the SigNet team, we proposed a solution mixing ballooning with a technique to detect the active memory pages of a VM (a.k.a. the working set), and then swapping out the inactive memory regions [3]. In [2], the authors propose to reclaim clean file-backed pages to enable memory overbooking.

Internship objectives

This project aims at understanding the impact of memory reclaim strategies on network applications. More specifically, the student is expected to:

- Build a state of the art on the different memory reclaim techniques proposed in the scientific literature and on what is currently supported by the main hypervisors (KVM, VMware, etc.);

- Test/implement in a KVM hypervisor both memory reclaim solutions and working set estimation heuristics. Our solution that mixes ballooning and working set detection will be provided to the student;

- Study the impact of memory reclaim on network applications (web servers, in-memory databases, data streaming applications) from the networking point of view:

  - impact on the request/reply rates of a given application;

  - impact on the delay and jitter of the traffic.


Pre-requisites: Good background in programming, ideally including system programming, and in networking.

Bibliography

[1] “Understanding Memory Resource Management in VMware® ESX™ Server”. https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/perf-vsphere-memory_management.pdf

[2] Maxime Lorrillere, Julien Sopena, Sébastien Monnet, and Pierre Sens. 2015. Puma: pooling unused memory in virtual machines for I/O intensive applications. In Proceedings of the 8th ACM International Systems and Storage Conference (SYSTOR '15). Association for Computing Machinery, New York, NY, USA, Article 1, 1–11. https://doi.org/10.1145/2757667.2757669

[3] D. Lopez Pacheco, Q. Jacquemart, A. Segalini, M. Rifai, M. Dione, and G. Urvoy-Keller. 2017. SEaMLESS: a SErvice migration cLoud architecture for energy saving and memory releaSing capabilities. In Proceedings of the 2017 Symposium on Cloud Computing (SoCC '17). Association for Computing Machinery, New York, NY, USA, 645. https://doi.org/10.1145/3127479.3128604

[4] Diwaker Gupta, Sangmin Lee, Michael Vrable, Stefan Savage, Alex C. Snoeren, George Varghese, Geoffrey M. Voelker, and Amin Vahdat. 2010. Difference engine: harnessing memory redundancy in virtual machines. Commun. ACM 53, 10 (October 2010), 85–93. https://doi.org/10.1145/1831407.1831429

[5] Chiang, Jui-Hao, Han-Lin Li and Tzi-cker Chiueh. “Working Set-based Physical Memory Ballooning.” ICAC (2013).

[6] Sean Barker, Timothy Wood, Prashant Shenoy, and Ramesh Sitaraman. 2012. An empirical study of memory sharing in virtual machines. In Proceedings of the 2012 USENIX conference on Annual Technical Conference (USENIX ATC '12). USENIX Association, USA, 25.





16- Asynchronous Contact Tracing, Fighting Pandemics with Internet of Things

Who? L. Liquori, Research Director INRIA, EPC Kairos

Supervisor: Luigi Liquori, H.d.R., Ph.D. INRIA Research Director

Co-supervisor: Abdul Qadir Khan (Ph.D candidate, ISEP Paris)

Mail: Luigi.Liquori@inria.fr

Telephone: -

Web page: www-sop.inria.fr/members/Luigi.Liquori

Where? INRIA

Place of the project: INRIA

Team: EPC KAIROS

Web page:

What?

Pre-requisites if any: good knowledge of IoT, mobile and web application, JSON, networking.

Detailed description:

Asynchronous Contact Tracing (ACT) is an ETSI standard [Ref]. This project is in collaboration with the standardization bodies of the European Telecommunications Standards Institute (ETSI). The Kairos INRIA/I3S/CNRS team participates in the ETSI SmartM2M Technical Committee working on a protocol designed to fight pandemics, named “Asynchronous Contact Tracing”.

ACT (a network protocol + an appropriate IoT infrastructure based on SmartM2M/oneM2M + mobile and web applications) is conceived for regular, 'peace time' use, as opposed to synchronous contact tracing methods, which tend to be employed when society is put on an urgent, war footing in reaction to an acute problem. The ACT process is not only applicable to the current pandemic wave but can also be adapted to any other virus in a future pandemic, with the aim of protecting and tracing users. ACT traces the contacts of objects with people and other objects and uses IoT technologies to react when a connected object may 'host' or 'has hosted' the virus and spread it to other people. It is intrinsically asynchronous because it does not require the exchange of any information between people: the virus will be tracked back, or uncovered, by doing (group) testing on objects and not on people. ACT will promote a quicker return to normal life, avoiding full lockdowns and suggesting selective lockdowns. ACT will be beneficial to many social and industrial organizations, such as cities, tourism, education, commerce, and travel.

The main objective of the project is to implement some elements of the protocol with the help of the ETSI-oneM2M-based middleware platform ICON provided by Telecom Italia Mobile (TIM). The student will be introduced to the current state of the ACT implementation [REF] and will continue implementing the ACT infrastructure; in particular, the student will concentrate on implementing the ACT Web Application and the ACT Smart Application. Then, the student will run an ACT scenario on the ICON platform and its external components. A performance evaluation study will conclude the internship. The student should have good programming skills, preferably in Python (for the web part) and Java (for the Android mobile app).

Expected output: publication + software + ETSI patent


References:

ETSI Work Item: https://portal.etsi.org/webapp/workprogram/Report_WorkItem.asp?WKI_ID=62484

Official ETSI standard (TS 103757 v2.1.1), attached: https://www.etsi.org/deliver/etsi_ts/103700_103799/103757/02.01.01_60/ts_103757v020101p.pdf

https://hal.inria.fr/hal-02989793v3

Academic paper: https://hal.inria.fr/hal-02989404v3

Abdul Qadir Khan's internship report: https://hal.archives-ouvertes.fr/I3S/hal-03410027v1

Video, IoT Week @ ETSI, April 2021: https://vimeo.com/536346457



17- Support of Monitoring and Radio Resource Management Services for 5G Wireless Access Networks


Who?

Name: Walid Dabbous & Thierry Turletti & Navid Nikaein

Mail: walid.dabbous@inria.fr & thierry.turletti@inria.fr & navid.nikaein@eurecom.fr

Phone: 0492387718

Web pages: https://team.inria.fr/diana/team-members/walid-dabbous/ & https://team.inria.fr/diana/team-members/thierry-turletti/ & https://www.eurecom.fr/fr/people/nikaein-navid

Where? Place of the project: Inria & Eurecom

Address: 06902 Sophia Antipolis

Teams: INRIA, Diana project-team & EURECOM COMSYS

Web page: https://www.inria.fr/equipes/diana & https://www.eurecom.fr/index.php/fr/directory/service/comsys

What?

Pre-requisites: Proficient programming skills in C/C++/Python and Linux. Experience with cellular networks is a plus.

Description:

Digital infrastructures, such as the future Internet, constitute the cornerstone of the digital transformation of our society. Thanks to such infrastructures, new unforeseen applications will run on mobile devices with stringent latency and throughput requirements. To fulfill such requirements, where large amounts of data are expected, several big cellular operators (e.g., Orange, Vodafone, T-Mobile or Telefonica) have joined their efforts and formed the O-RAN Alliance [1]. The O-RAN Alliance has defined a Near-Real-Time Controller to dynamically control and monitor the cellular network.

This internship focuses on monitoring, control and coordination in the context of 5G RAN Intelligent Controller (RIC).

We will use the SDN-based Near-Real-Time Controller developed at Eurecom, running in the most advanced open-source 5G cellular network platform (i.e., OpenAirInterface [2]), which is capable of connecting and transferring data with 5G mobile devices. The current RIC already includes monitoring for the MAC, RLC, PDCP and RRC layers, while support for SDAP is in progress. The student will develop a library to retrieve the large number of parameters generated by this platform, in order to control and monitor the traffic flows of different applications using real hardware (i.e., antennas and mobile phones).

Work plan:

The student will start with a review of the various 5G parameters to monitor and analyze, following the O-RAN architecture [1].

Then, she/he will develop an application that stores and analyzes the 5G data streams using the controller APIs so as to (a) build the 5G network topology showing the distribution of the base stations and connected terminals, and (b) compute the performance per base station and per terminal in terms of data rate and resource usage.
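A small sketch of step (b), assuming the monitoring library exposes per-report records with cell, UE and byte counters; the field names and values are hypothetical, and pandas is used only for the aggregation:

import pandas as pd

# Hypothetical monitoring reports collected through the controller APIs.
reports = pd.DataFrame({
    "cell_id":  ["gnb1", "gnb1", "gnb2", "gnb1"],
    "ue_id":    ["ue1", "ue2", "ue3", "ue1"],
    "dl_bytes": [1_200_000, 800_000, 500_000, 900_000],
    "prb_used": [40, 25, 15, 30],
    "period_s": [1.0, 1.0, 1.0, 1.0],
})

reports["dl_mbps"] = reports.dl_bytes * 8 / reports.period_s / 1e6

# (b) performance per base station and per terminal: data rate and resource usage.
per_cell = reports.groupby("cell_id").agg(dl_mbps=("dl_mbps", "sum"), prb_used=("prb_used", "sum"))
per_ue = reports.groupby("ue_id").agg(dl_mbps=("dl_mbps", "mean"))
print(per_cell, per_ue, sep="\n\n")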

The student will then explore how to support slicing control and QoS traffic differentiation through the RIC. Several "service models" for radio resource management (RRM) could be added as RIC extensions, such as mobility management and interference management. Finally, if time permits, the student will explore data-driven machine learning and artificial intelligence algorithms that control the cellular network to achieve stringent latencies, with the goal of improving the mobile phone user's experience.

This work is proposed in the context of the Slices-RI European project (https://slices-ri.eu/) including both INRIA and Eurecom partners. SLICES is a flexible platform designed to support large-scale, experimental research focused on networking protocols, radio technologies, services, data collection, parallel and distributed computing and in particular cloud and edge-based computing architectures and services.

This PFE internship may be continued as a PhD for excellent students.

References:

[1] https://www.o-ran.org/membership

[2] https://openairinterface.org/


18- Function as a Service for the Internet of Things: a systematic literature review

Who?

Name: Nicolas Ferry

Mail: nicolas.ferry@univ-cotedazur.fr

Web page: http://www.ferrynico.com

Where?

Place of the project: INRIA

Address: 2004 rte des Lucioles

Team: Kairos

Webpage: https://team.inria.fr/kairos/

What?

Pre-requisites if any: Knowledge about Cloud Computing, IoT, Serverless/FaaS

Detailed description: Function as a Service (FaaS) is a Cloud programming model where developers focus on developing functions and are relieved from building and maintaining the underlying infrastructure (from the hardware to the execution environment). Functions are typically small programs triggered by events, whose type and structure are generally defined by the FaaS provider. Once a function is registered, it is the role of the FaaS provider to deploy and operate it on the proper infrastructure. To optimize resource usage and reduce cost, an entire function can be brought up in response to an event and torn down when no events occur. This event-based programming model is a great fit for IoT event and data processing [2]. However, very few solutions have emerged in recent years to extend this model from the Cloud to the Edge, and even fewer when considering the more constrained devices found in IoT systems.

The objective of the internship will be to conduct a literature review to make sense of the research landscape of FaaS approaches covering the Cloud-Edge-IoT continuum. We will adopt the systematic literature review methodology, i.e., “using a well-defined methodology to identify, analyze and interpret all available evidence related to a specific research question” [1]. In particular, we want to investigate the following aspects: Can functions be deployed on IoT leaf nodes such as Arduinos, Espruinos, etc.? What mechanisms and properties are offered for serverless function isolation, in particular when dealing with IoT devices? How are serverless functions allocated, deployed, and integrated with external services? How are serverless functions orchestrated and/or scheduled, and what guarantees (logical, temporal, etc.) are offered?
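To give a concrete feel for the programming model discussed above, the following Python sketch shows the typical shape of an event-triggered function, using an AWS-Lambda-style handler signature; the IoT temperature event structure is an illustrative assumption, not a format prescribed by any particular provider.

# Minimal FaaS-style handler sketch (AWS-Lambda-like signature).
# The event structure is an illustrative assumption for an IoT temperature reading.
import json

def handler(event, context):
    # The provider invokes this function whenever a matching event occurs,
    # e.g. a message published by an IoT sensor.
    reading = json.loads(event["body"]) if isinstance(event.get("body"), str) else event
    temperature = reading.get("temperature_c")
    if temperature is not None and temperature > 80.0:
        # In a real deployment this would trigger an alert or another function.
        return {"statusCode": 200, "body": json.dumps({"alert": "overheat", "value": temperature})}
    return {"statusCode": 200, "body": json.dumps({"alert": None})}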

References:


1. Kitchenham, Barbara Ann and Charters, Stuart, “Guidelines for performing Systematic Literature Reviews in Software Engineering”, EBSE Technical Report, 2007. https://www.elsevier.com/__data/promis_misc/525444systematicreviewsguide.pdf

2. B. Cheng, G. Solmaz, F. Cirillo, E. Kovacs, K. Terasawa and A. Kitazawa, “FogFlow: Easy Programming of IoT Services Over Cloud and Edges for Smart Cities”, IEEE Internet of Things Journal, vol. 5, no. 2, pp. 696-707, April 2018. doi: 10.1109/JIOT.2017.2747214

3. Bocci, A., Forti, S., Ferrari, G.L. et al., “Secure FaaS orchestration in the fog: how far are we?”, Computing 103, 1025-1056 (2021). https://doi.org/10.1007/s00607-021-00924-y



19- Data acquisition and collection in harsh environment


Who?

Name: Christelle Caillouet and David Coudert

Mail: christelle.caillouet@inria.fr, david.coudert@inria.fr

Web page: http://www-sop.inria.fr/members/Christelle.Molle-Caillouet, http://www-sop.inria.fr/members/David.Coudert


Where?

Place of the project: Inria

Address: 2004 route des Lucioles, 06903 Sophia Antipolis

Team: Coati

Web page: https://team.inria.fr/coati/


Pre-requisites if any: Combinatorial optimization, Algorithmics, Programming, Wireless networks

Description:

Data collection is at the heart of the integrated management of road infrastructures and engineering structures. However, by definition, this data collection is carried out in very restrictive environments (e.g., reduced accessibility), or even hostile environments (e.g., weather conditions, luminosity, humidity, ...). The sensors are of different natures, with heterogeneous sizes, sensitivities and communication paradigms, and they provide data that are heterogeneous in size, type and acquisition frequency. Collecting these data is a real challenge that requires the use of agile and adaptive communication protocols, the deployment of fleets of autonomous robots, and the planning of service vehicle routing.

Given application needs and technical constraints, the goal is to determine the most suitable locations for each data acquisition. This study will take into account the business needs as well as the physical deployment constraints (accessibility, radio environment, availability of power supply or ambient energy harvesting) and the requirements of the data collection itself.

This will have an impact on the choice of the most suitable means to collect the data according to the sensor locations, the required acquisition frequency and the collection costs: radio collection, multi-hop relaying, potential intervention of robots/drones, etc.

A first problem to consider is the deployment of a set of sensors to monitor a tunnel. The sensors have to send and relay data to a central sink. In a harsh environment, sensors are subject to failure and so several sensors may monitor the same area. We want to minimize the number of sensors to deploy in order to ensure a certain level of reliability. We will then investigate the tradeoff between the number of deployed sensors and the frequency at which the failed sensors must be replaced.
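As a purely illustrative starting point, the following Python sketch adapts the classical greedy set-cover heuristic so that every zone of the tunnel is monitored by at least k candidate sensors, using k-coverage as a crude proxy for reliability. The candidate positions and coverage sets are toy data; the actual study is expected to rely on exact formulations (e.g., integer linear programming) and realistic coverage models.

# Toy greedy heuristic: pick candidate sensor positions until every zone
# is monitored by at least k sensors (k-coverage as a crude reliability proxy).
def greedy_k_coverage(candidates, zones, k):
    """candidates: dict position -> set of zones it can monitor
    zones: iterable of zone identifiers
    k: required coverage level per zone"""
    remaining = {z: k for z in zones}          # how many more sensors each zone still needs
    chosen = []
    available = dict(candidates)
    while any(v > 0 for v in remaining.values()) and available:
        # Pick the position that reduces the residual demand the most.
        best = max(available, key=lambda p: sum(1 for z in available[p] if remaining.get(z, 0) > 0))
        gain = sum(1 for z in available[best] if remaining.get(z, 0) > 0)
        if gain == 0:
            break                               # remaining zones cannot be covered further
        for z in available[best]:
            if remaining.get(z, 0) > 0:
                remaining[z] -= 1
        chosen.append(best)
        del available[best]
    return chosen, remaining

# Example: 4 tunnel zones, 5 candidate positions, 2-coverage required.
cands = {"p1": {"z1", "z2"}, "p2": {"z2", "z3"}, "p3": {"z3", "z4"},
         "p4": {"z1", "z4"}, "p5": {"z2", "z4"}}
print(greedy_k_coverage(cands, ["z1", "z2", "z3", "z4"], k=2))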

Other considerations may be taken into account concerning the volume of collected data, communication protocols, routing strategies…

Useful Information/Bibliography:

“A survey of optimization algorithms for wireless sensor network lifetime maximization”, Robert M. Curry, J. Cole Smith, Computers & Industrial Engineering 101 (2016) 145–166.

“Wireless sensor network for monitoring transport tunnels”, P. J. Bennett, Y. Kobayashi, K. Soga, and P. Wright, Proceedings of the Institution of Civil Engineers - Geotechnical Engineering, vol. 163, no. 3, pp. 147–156, Jun. 2010



20- How to enforce Network Slicing with Segment Routing?


Who : Joanna Moulierac <joanna.moulierac@inria.fr>, http://www-sop.inria.fr/members/Joanna.Moulierac/

and Geraldine Texier <geraldine.texier@imt-atlantique.fr>, https://www.imt-atlantique.fr/fr/personne/geraldine-texier

Where? To be discussed. Either at INRIA Sophia Antipolis (team COATI), or at IMT-Atlantique

Pre-requisites if any: Algorithmics, networking, and optimization

Detailed description:

With network slicing, multiple independent virtual networks are embedded on the same physical infrastructure, and Virtual Network Functions (VNFs) are instantiated on specific physical nodes. A network slice is created to support a specific application, service, set of users, or network, and with its myriad use cases, network slicing is one of the most important technologies in 5G.

Segment routing (SR) is a modern incarnation of the source routing paradigm. In addition to defining the destination of a packet, its path can also be specified through a list of segments. Segment routing thus provides great traffic-engineering capabilities: packets can be steered so that they adhere to specific requirements.

A segment can represent a topological instruction (node or link traversal) or any operator-defined instruction (e.g., a virtual function). A Segment Routing Header (SRH) can be used to steer packets through paths with given properties (e.g., bandwidth or latency) and through various network functions (e.g., firewalling). The list of segments present in the SRH thus specifies the network policy that applies to the packet. Each SRH contains at least a list of segments, and the segment to process is indicated by a pointer in the routing header. Segment routing enforces a flow through any topological path and service chain while maintaining per-flow state only at the ingress node of the segment routing domain. It can be applied directly to the MPLS architecture with no change to the forwarding plane, or to the IPv6 architecture with a new type of routing extension header in which a segment is encoded as an IPv6 address.
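As a simplified model of this mechanism, the following Python sketch mimics how an SRv6 segment endpoint consumes the segment list: the Segments Left field acts as the pointer, and each listed waypoint rewrites the packet's destination to the next segment. Real SRv6 processing defines many more behaviours; only the basic case is shown, and the addresses are illustrative.

# Simplified model of SRv6 segment-list processing (basic endpoint behaviour only):
# Segments Left is the pointer into the segment list; each segment endpoint
# decrements it and rewrites the destination address to the next segment.
def srv6_end(packet):
    srh = packet["srh"]
    if srh["segments_left"] == 0:
        return packet                        # last segment reached, deliver normally
    srh["segments_left"] -= 1
    packet["dst"] = srh["segments"][srh["segments_left"]]
    return packet

# Segment lists are encoded in reverse order: the first segment to visit is the last entry.
pkt = {"dst": "2001:db8::r1",
       "srh": {"segments": ["2001:db8::r7", "2001:db8::fw", "2001:db8::r1"], "segments_left": 2}}
pkt = srv6_end(pkt)   # now heading to 2001:db8::fw
pkt = srv6_end(pkt)   # now heading to 2001:db8::r7 (final segment)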

Therefore, segment routing is a great technology to create and dynamically instantiate network slices. The purpose of this internship is to analyse the usability, effectiveness, and drawbacks of using segment routing to implement network slicing.

For example, the header overhead can quickly become a limitation, as it grows with the number of services inside a network slice, and it must be taken into account since it can be substantial. The number of labels (Segment IDs, SIDs) in the segment list should not exceed the maximum segment list depth, which is 5 by default. The maximum transmission unit (MTU) inside the SR domain must also be large enough to encapsulate incoming packets. The performance impact has been found to increase proportionally with the size of the segment list, so there is a trade-off between the quality of the paths and the size of the headers.
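The following back-of-the-envelope Python sketch illustrates this trade-off for SRv6, where the routing header adds a fixed 8-byte part plus one 128-bit address per segment on top of the outer IPv6 header; the MSD value of 5 follows the default mentioned above, and all numbers are for illustration only.

# Back-of-the-envelope SRv6 overhead check: does a candidate segment list fit
# under the maximum segment-list depth, and how much header space does it cost?
SRH_FIXED_BYTES = 8        # fixed part of the Segment Routing Header
BYTES_PER_SID = 16         # one IPv6 address (128 bits) per segment
OUTER_IPV6_BYTES = 40      # outer IPv6 header added when encapsulating

def srh_overhead(segment_list, max_sid_depth=5):
    if len(segment_list) > max_sid_depth:
        raise ValueError(f"{len(segment_list)} SIDs exceed the maximum depth of {max_sid_depth}")
    return OUTER_IPV6_BYTES + SRH_FIXED_BYTES + BYTES_PER_SID * len(segment_list)

def fits_in_mtu(payload_bytes, segment_list, mtu=1500, max_sid_depth=5):
    return payload_bytes + srh_overhead(segment_list, max_sid_depth) <= mtu

# A slice path through two waypoints and a firewall VNF costs 40 + 8 + 3*16 = 96 bytes of overhead.
path = ["2001:db8::r1", "2001:db8::fw", "2001:db8::r7"]
print(srh_overhead(path), fits_in_mtu(1400, path))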

After an extensive analysis of the literature, the student will identify the main difficulties to be overcome when combining these technologies, and then propose efficient solutions and algorithms to address them. The student will then perform simulations to analyse the performance of the proposed solutions.

References:

[1] J. Blaser, "On Service Chaining and Segment Routing", supervised by P. Grosso, co-supervised by M. Kaat, 8 June 2018. https://scripties.uba.uva.nl/document/657875

[2] T. Gérondal and N. Houtain, "Service Function Chaining with Segment Routing", Master's thesis in Computer Science (Networking & Development options), supervised by O. Bonaventure and D. Lebrun. https://dial.uclouvain.be/downloader/downloader.php?pid=thesis%3A4621&datastream=PDF_02

[3] R. Guedrez, "Enabling traffic engineering over segment routing", PhD thesis in Computer Science, supervised by G. Texier and O. Dugeon. https://tel.archives-ouvertes.fr/tel-02301017/

[4] M. Jadin, F. Aubry, P. Schaus and O. Bonaventure, "CG4SR: Near Optimal Traffic Engineering for Segment Routing with Column Generation", IEEE INFOCOM 2019. https://ieeexplore.ieee.org/abstract/document/8737424


21- Automated deployment of open 5G networks


Name: Damien Saucez & Thierry Turletti & Thierry Parmentelat


Mail: damien.saucez@inria.fr & thierry.turletti@inria.fr & thierry.parmentelat@inria.fr

Phone: 0492387718


Web pages: https://team.inria.fr/diana/team-members/damien-saucez/ & https://team.inria.fr/diana/team-members/thierry-turletti/ & https://parmentelat.github.io/



Where? Place of the project: Inria

Address: 06902 Sophia Antipolis

Teams: INRIA, Diana project-team

Web page: https://www.inria.fr/equipes/diana


What?

Pre-requisites:

The student is expected to have a very strong understanding of network stacks (particularly L2 and L3) and cellular technologies.

Good programming skills are required.

The student shall be acquainted with container technologies.

The student shall be acquainted with CI/CD practices and automation technologies.

Having a network or cloud associate certification (e.g., CCNA, SAA-C02, AZ-104) is a plus.


Description:

For the general audience, 5G means fast network access, anywhere, anytime. For us network professionals, it is a bit different. Until now (i.e., up to 4G), all the functions needed to ensure connectivity were glued together in the same device, deployed directly next to the antennas. With 5G the paradigm changes: these functions can be virtualised. The first advantage is breaking vendor lock-in, which is essential to cut costs. A second advantage is being able to deploy these functions in different locations and potentially to run instances from different access nodes on the same compute node.


OpenAirInterface (OAI) offers a full software stack for 5G and was designed with function virtualisation in mind. As long as you have programmable radios, high-speed low-latency links, and solid compute resources, you can deploy your own 5G network [1]. Hand in hand with the OpenAirInterface Software Alliance, we are deploying a unique testing platform for OpenAirInterface. It is composed of several programmable radios deployed in an anechoic chamber called R2Lab [2], high-speed fibre links (6x100 Gbps) between Inria and Eurecom, and compute clusters, all interconnected with programmable switches. In a later stage, this testing platform will be interconnected with international partners.


In this project, you will be integrated into the team that designs and develops automation solutions to automatically provision and configure the entire stack, encompassing the deployment of OAI functions, the services, and the physical network and compute resources. Ultimately, a fully operational 5G network shall be deployed with the click of a button.


Workplan:

Your role will be to:

- define and fine-tune the network configurations of the entire network that correspond to the different service deployments

- define the deployment configurations of the containers in the cluster that correspond to the different service deployments

- provide tests that guarantee that automated deployments won’t alter the rest of the services deployed in the infrastructure

- implement the API allowing end-users to define their deployments

- implement the workflow of operations needed to go from the end-user's definition of needs, to the verification that the requested services can be deployed, to the actual deployment, and eventually to the decommissioning of the deployment, without interruption or alteration of running services (a minimal sketch of such a workflow is given after this list).

- study OpenAirInterface (OAI) and how to deploy 4G and 5G network functions.

- implement the following 5G scenarios:

* a first, single-tenant scenario, where all 5G Radio Access Network (RAN) and Core Network (CN) functions are deployed locally at INRIA

* and a second, multi-tenant scenario, where the CN network functions are deployed at Eurecom while the RAN network functions are deployed at INRIA.

In both cases, the 5G hardware, including 5G User Equipments (UEs) and gNBs (5G base stations), will be deployed within the R2lab anechoic chamber platform at INRIA.
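A minimal Python sketch of the request-to-decommission workflow mentioned in the work plan is given below. The step names and the validate/feasible/deploy/decommission callables are hypothetical placeholders, not an existing API of the platform.

# Hypothetical workflow skeleton: from an end-user deployment request to
# deployment and eventual decommissioning, refusing requests that cannot be served.
from dataclasses import dataclass, field

@dataclass
class DeploymentRequest:
    tenant: str
    functions: list                              # e.g. ["gnb", "amf", "smf", "upf"]
    sites: dict = field(default_factory=dict)    # function -> site, e.g. {"upf": "eurecom"}

def process(request, inventory, validate, feasible, deploy):
    """inventory: current state of the infrastructure.
    validate/feasible/deploy: callables supplied by the platform (placeholders here)."""
    if not validate(request):
        return None, "rejected: malformed request"
    if not feasible(request, inventory):
        return None, "rejected: insufficient resources or conflict with running services"
    handle = deploy(request, inventory)          # must not disturb already-running services
    return handle, "deployed"

def teardown(handle, decommission):
    # Called at the end of the experiment; must also leave other services untouched.
    decommission(handle)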


References:


[1] https://openairinterface.org/

[2] https://r2lab.inria.fr/


Stage-DFG-NN-2021-2022-Touati-Formenti-RD.pdf