What?
Title: Exploring memory performance for large Python computations
Who?
Name: Arnaud Legout
Mail: arnaud.legout@inria.fr
Web page: https://www-sop.inria.fr/members/Arnaud.Legout/
Name: Damien Saucez
Mail: damien.saucez@inria.fr
Web page: https://team.inria.fr/diana/team-members/damien-saucez/
Where?
Place of the project: DIANA team, Inria, Sophia Antipolis
Address: 2004 route des Lucioles
Team: DIANA
Web page: https://team.inria.fr/diana/
Pre-requisites if any: Familiar with Python, Linux, basic system performance knowledge, highly
motivated to work in a research environment and excited to tackle hard problems.
Description:
Doing data science is not only a programming or machine learning issue; for most practical
use cases it is also a systems issue. One such issue is making the best use of the available RAM.
Computations whose memory footprint exceeds the available RAM require the OS to swap
memory pages to disk. Even if your process does not exceed the available RAM, the Linux
memory management may proactively swap memory pages.
We observed that, under certain circumstances, the swap process dramatically
reduces performance, leading to a pathological behavior in which retrieving pages
from the swap becomes much slower than the disk speed.
The goal of this PER is to understand and document the current Linux memory management,
how it interacts with the Python interpreter, and how to reproduce the
circumstances under which we enter a pathological behavior.
Excellent students will also have the opportunity to study the impact of virtualization
when running multiple memory-intensive workloads.
This PER requires a good understanding of the internals of the Linux operating system and
a good knowledge of C and Python. It will be mandatory to look at Linux
and Python source code (written in C) to understand the details and the undocumented
behavior.
Also a part of the PER will be to run experiments to reproduce and understand the conditions
under which we observe a pathological behavior.
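As a flavour of such an experiment, a minimal sketch is given below: it allocates memory in steps and watches the kernel's swap counters. It assumes a Linux machine with the third-party psutil package installed; the chunk size and stopping condition are arbitrary placeholders.
```python
# Minimal sketch (assumes Linux and the psutil package): allocate memory in
# steps and watch the swap activity reported by the kernel.
import time
import psutil

CHUNK_MB = 256          # arbitrary allocation step
blocks = []

for step in range(64):  # arbitrary upper bound on the number of steps
    blocks.append(bytearray(CHUNK_MB * 1024 * 1024))   # zero-filled, so the pages are touched
    swap = psutil.swap_memory()
    vmem = psutil.virtual_memory()
    print(f"step {step:2d}: RAM used {vmem.percent:5.1f}%  "
          f"swap used {swap.percent:5.1f}%  "
          f"bytes swapped in/out {swap.sin}/{swap.sout}")
    if swap.percent > 50:   # stop before the machine becomes unusable
        break
    time.sleep(0.5)
```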
For motivated students, this PER can continue as an internship in which the intern will
tackle the pathological behavior and propose a solution. Students will have the possibility
to continue with a Master's thesis, and excellent students with a Ph.D. thesis.
Useful Information/Bibliography:
What every programmer should know about memory
https://lwn.net/Articles/250967/
Python memory management
https://docs.python.org/3/c-api/memory.html
What?
Title: Impact of Large Language Models on the Cognitive Process
Who?
Name: Arnaud Legout
Mail: arnaud.legout@inria.fr
Web page: https://www-sop.inria.fr/members/Arnaud.Legout/
Name: Christopher Leturc
Mail: leturc@i3s.unice.fr
Web page: https://www.linkedin.com/in/christopher-leturc
Name: Fanny Verkampt
Mail: Fanny.VERKAMPT@univ-cotedazur.fr
Web page: https://www.linkedin.com/in/fanny-verkampt-0918b74b/
Where?
Place of the project: DIANA team, Inria, Sophia Antipolis
Address: 2004 route des Lucioles
Team: DIANA
Web page: https://team.inria.fr/diana/
Pre-requisites if any: Python and web development, basic knowledge of how LLMs work
Description:
Large Language Models (LLMs) have revolutionized artificial intelligence over the past three years.
Humans interact with LLMs mainly in two ways: through conversational agents such as ChatGPT, and
through streaming agents such as GitHub Copilot.
With a conversational agent, users engage in dialogue much like they would with another human being.
In contrast, a streaming agent provides suggestions in real time as one types—an interaction that is
more direct and intrusive.
These two modes of interaction may influence human cognitive processes in different ways. We tend
to treat conversational agents as if they were reputable human interlocutors, whereas streaming
agents intervene more directly in our chain of thought. However, little is known about how these
different types of LLMs affect cognition. In particular, can an LLM influence, or even alter, the
thinking of the human it assists?
The goal of this PER is to review the existing literature on this topic, propose scientific
hypotheses, and begin designing experiments to explore these hypotheses.
For motivated students, this PER can continue as an internship. Excellent students will
have the possibility to continue with a Ph.D. thesis.
Useful Information/Bibliography:
Erik Jones and Jacob Steinhardt. “Capturing failures of large language models via human cognitive biases”. In:
Advances in Neural Information Processing Systems 35 (2022), pp. 11785–11799.
Enkelejda Kasneci et al. “ChatGPT for good? On opportunities and challenges of large language models for
education”. In: Learning and individual differences 103 (2023), p. 102274.
Celeste Kidd and Abeba Birhane. “How AI can distort human beliefs”. In: Science 380.6651 (2023), pp. 1222–1223
Bill Thompson and Thomas L Griffiths. “Human biases limit cumulative innovation”. In: Proceedings of the Royal
Society B 288.1946 (2021), p. 20202752.
Canyu Chen and Kai Shu. “Can LLM-Generated Misinformation Be Detected?” In: arXiv preprint arXiv:2309.13788 (2023).
Using Early Exit Networks and Cascade Models to Reduce Inference Times and the Resource Usage in Edge Computing Scenarios
Name: Frédéric Giroire and Davide Ferré
Mail: frederic.giroire@inria.fr
Web page: https://www-sop.inria.fr/members/Frederic.Giroire/
Place of the project:
Address: Inria, 2004 route de Lucioles, SOPHIA ANTIPOLIS
Team: COATI (common project Inria/I3S)
Web page: https://team.inria.fr/coati/
Pre-requisites:
Knowledge in networking and machine learning.
Python.
Description:
The exponential advances in Machine Learning (ML) are leading to the deployment of Machine Learning models in constrained and embedded devices, to solve complex inference tasks. At the moment, to serve these tasks, there exist two main solutions: run the model on the end device, or send the request to a remote server. However, these solutions may not suit all the possible scenarios in terms of accuracy or inference time, requiring alternative solutions.
Cascade inference is an important technique for performing real-time and accurate inference given limited computing resources such as MEC servers. It combines two or more models to perform inference: typically a highly accurate but expensive model with a less accurate but fast model, and determines whether the expensive model should make a prediction or not based on the confidence score of the fast model. A large body of work has exploited this solution. The first to propose a sequential combination of models were [1], for face detection tasks; then, in the context of deep learning, cascades have been applied to numerous tasks [2,3].
Early Exit Networks take advantage of the fact that not all input samples are equally difficult to process, and thus invest a variable amount of computation based on the difficulty of the input and the prediction confidence of the Deep Neural Network [5]. Specifically, early-exit networks consist of a backbone architecture with additional exit heads (or classifiers) along its depth. At inference time, as a sample propagates through the network, it passes through the backbone and each of the exits, and the first result that satisfies a predetermined criterion (exit policy) is returned as the prediction output, bypassing the rest of the model. The exit policy can also reflect the capabilities and load of the target device, and dynamically adapt the network to meet specific runtime requirements [6].
Our project is to use cascade models and/or early-exit models in the context of Edge Computing to improve the delay and reduce the resource usage of ML inference tasks at the edge. Of crucial importance for cascade or early-exit models is the confidence of the fast model. Indeed, if the prediction of the first model is used but wrong, it may lead to a low accuracy of the cascade, even if the accuracy of the best model is very high. Similarly, if the confidence of the first model is too low, its predictions will never be used, the computation will be higher than using only the second model by itself, and we will in addition consume unnecessary network resources and incur higher delays than necessary. Researchers have proposed methods to calibrate such systems [4]. However, they have not explored the choice of the loss function of such systems in depth.
In this project, we will explore the use of a new loss function for the fast models (or first exits) of cascade networks (or early-exit models). Indeed, such models do not have the same goal as the global system, as they should only act as a first filter.
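To make the confidence-based routing concrete, here is a minimal PyTorch sketch of two-model cascade inference under an assumed confidence threshold; it illustrates the general technique only, not the calibrated systems of [4] nor the algorithms to be studied in this project.
```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def cascade_predict(x, fast_model, accurate_model, threshold=0.8):
    """Two-stage cascade sketch: accept the fast model's prediction when its
    softmax confidence exceeds `threshold`, otherwise fall back to the
    expensive model. The threshold value is an arbitrary placeholder."""
    probs = F.softmax(fast_model(x), dim=-1)
    confidence, prediction = probs.max(dim=-1)
    needs_big_model = confidence < threshold
    if needs_big_model.any():
        big_logits = accurate_model(x[needs_big_model])
        prediction[needs_big_model] = big_logits.argmax(dim=-1)
    # The fraction routed to the big model drives both latency and accuracy.
    return prediction, needs_big_model.float().mean().item()
```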
Useful Information:
The internship can be followed by a PhD for interested students. A PhD grant is already funded on the topic.
Bibliography:
[1] Viola, P., & Jones, M. (2001, December). Rapid object detection using a boosted cascade of simple features. In Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition. CVPR 2001 (Vol. 1, pp. I-I). IEEE.
[2] Wang, X., Kondratyuk, D., Christiansen, E., Kitani, K. M., Alon, Y., & Eban, E. (2020). Wisdom of committees: An overlooked approach to faster and more accurate models. arXiv preprint arXiv:2012.01988.
[3] Wang, X., Luo, Y., Crankshaw, D., Tumanov, A., Yu, F., & Gonzalez, J. E. (2017). Idk cascades: Fast deep learning by learning not to overthink. arXiv preprint arXiv:1706.00885.
[4] Enomoro, S., & Eda, T. (2021, May). Learning to cascade: Confidence calibration for improving the accuracy and computational cost of cascade inference systems. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 8, pp. 7331-7339).
[5] Laskaridis, S., Kouris, A., & Lane, N. D. (2021, June). Adaptive inference through early-exit networks: Design, challenges and directions. In Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning (pp. 1-6).
[6] Laskaridis, S., Venieris, S. I., Almeida, M., Leontiadis, I., & Lane, N. D. (2020, September). SPINN: synergistic progressive inference of neural networks over device and cloud. In Proceedings of the 26th annual international conference on mobile computing and networking (pp. 1-15).
What?
Title: Evolution over time of social networks
Who?
Name: Frédéric Giroire, Sayf Halmi, Nicolas Nisse
Mail: Nicolas.nisse@inria.fr
Web page: https://www-sop.inria.fr/members/Nicolas.Nisse/
Where?
Place of the project: COATI team, Inria, Sophia Antipolis
Address: 2004 route des Lucioles
Team: COATI
Web page: https://team.inria.fr/coati/
Pre-requisites if any: basic knowledge in graphs, algorithms and Python
Description: The goal of the project is to develop methods to analyse the evolution over time of a social network. As an example, we will consider the graph of scientific collaborations, for which we have already collected data from SCOPUS. The project will have two phases:
- Data analysis: In the first phase, the student will use our data set to study different metrics describing the evolution of the productivity of scientists. It will be interesting to compare the obtained results depending on the level of multidisciplinarity of the scientists, their domain of research…
- Then, the student will compare the experimental results with models for generating random graphs, namely models of the Barabasi-Albert kind (preferential attachment models) with different preferential attachment functions (see the sketch below). This relies on our ongoing work, in which we have considered piecewise linear functions and studied the characteristics of the obtained graphs (e.g. degree distribution, evolution of the degree of the nodes…)
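For the second phase, the comparison could start from the standard Barabasi-Albert generator before moving to the team's piecewise-linear attachment functions. A minimal sketch using networkx (assumed to be available) is given below; graph size and parameters are arbitrary.
```python
import networkx as nx

# Classical Barabasi-Albert preferential attachment: each new node attaches
# m edges to existing nodes with probability proportional to their degree.
G = nx.barabasi_albert_graph(n=10_000, m=3, seed=42)

# Empirical degree distribution, to be compared with the collaboration graph.
hist = nx.degree_histogram(G)           # hist[d] = number of nodes of degree d
for d, count in enumerate(hist):
    if count:
        print(d, count / G.number_of_nodes())
```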
Useful Information/Bibliography:
- [GNOST 23] Frédéric Giroire, Nicolas Nisse, Kostiantyn Ohulchanskyi, Malgorzata Sulkowska, Thibaud Trolliet: Preferential Attachment Hypergraph with Vertex Deactivation. MASCOTS 2023: 1-8
- [GNTS22] Frédéric Giroire , Nicolas Nisse , Thibaud Trolliet , Malgorzata Sulkowska : Preferential attachment hypergraph with high modularity. Netw. Sci. 10(4): 400-429 (2022)
- [Trolliet21] Thibaud Trolliet: Study of the properties and modeling of complex social graphs. (Étude des propriétés et modélisation de graphes sociaux complexes). University of Côte d'Azur, Nice, France, 2021
- [BA99] A.L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509–512, 1999.
Name: Frédéric Giroire and Davide Ferré
Mail: frederic.giroire@inria.fr
Web page: https://www-sop.inria.fr/members/Frederic.Giroire/
Place of the project:
Address: Inria, 2004 route de Lucioles, SOPHIA ANTIPOLIS
Team: COATI (common project Inria/I3S)
Web page: https://team.inria.fr/coati/
Pre-requisites: Knowledge in networking and machine learning. Python.
Description:
In recent years, there has been a growing usage of Machine Learning (ML) models in cloud computing, contributing to the adoption of Machine Learning as a Service (MLaaS) [1], studied in several applications such as image recognition and self-driving cars [2]. Cloud and network operators have faced challenges in developing efficient strategies for utilizing computational resources to support machine learning tasks. Among these challenges, scheduling is an important one. A scheduler must determine on which machine each task is executed and its processing order. This becomes especially critical when tasks must adhere to deadline constraints.
In the context of saving computational resources, researchers have investigated neural network compression techniques, including pruning and quantization. However, these approaches typically involve compressing the model during the training phase, necessitating re-training the model after compression. Recent approaches compress neural networks at inference time [3, 28], reducing network size to varying degrees. Greater compression yields lower latency (i.e., the processing time of a task) but at the expense of accuracy.
In [3], we introduced a scheduling system using compressible neural networks for image classification tasks, in which several heterogeneous machines could be used. We developed an approximation algorithm with proven guarantees for maximizing the average accuracy while respecting deadline constraints. In [4], we proposed scheduling algorithms to maximize accuracy while adhering to an energy budget constraint. Indeed, cloud and network operators are compelled to mitigate their cloud carbon footprint, driving researchers and scientists to investigate novel methods for conducting ML inference with greater energy efficiency. The adoption of MLaaS and the expanding size of neural network models have resulted in increased energy consumption, particularly during the inference stage [5]. According to reports from NVIDIA [9], 80-90% of Artificial Intelligence (AI) costs stem from inference. Indeed, processing millions of requests, such as those encountered in social networks, can lead to a significant number of inferences in deep learning models, resulting in elevated energy consumption and a sizable carbon footprint [6].
In this project, we will study how to extend the proposed solutions using ML model compression to an edge computing context [7]. In this context, a model can be executed either in the cloud or closer to the user, on their cell phone or at an antenna. We will propose optimization models and algorithms to find efficient solutions for scheduling ML tasks.
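As a toy illustration of the accuracy-latency tradeoff behind compressible inference tasks (not the algorithms of [3] or [4]), the sketch below picks, for each task, the least-compressed model variant that still meets the task's deadline on a single machine; the profile numbers are hypothetical placeholders.
```python
# Hypothetical (latency_seconds, accuracy) profile of one model at increasing
# compression levels: more compression -> faster but less accurate.
PROFILE = [(1.00, 0.82), (0.60, 0.79), (0.35, 0.74), (0.20, 0.66)]

def pick_compression_level(time_budget):
    """Return the (latency, accuracy) of the most accurate variant that fits
    in `time_budget`, or None if even the fastest one misses the deadline."""
    feasible = [p for p in PROFILE if p[0] <= time_budget]
    return max(feasible, key=lambda p: p[1]) if feasible else None

def schedule(deadlines):
    """Earliest-deadline-first sketch on a single machine, accumulating accuracy."""
    clock, total_accuracy = 0.0, 0.0
    for deadline in sorted(deadlines):
        choice = pick_compression_level(deadline - clock)
        if choice is None:
            continue                      # task dropped: its deadline cannot be met
        latency, accuracy = choice
        clock += latency
        total_accuracy += accuracy
    return total_accuracy

print(schedule([0.5, 1.2, 2.0, 2.1]))
```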
Useful Information:
The TER can be followed by an internship and by a PhD for interested students. A PhD grant is already funded on the topic.
References
[1] Mauro Ribeiro, Katarina Grolinger, and Miriam A. M. Capretz. 2015. MLaaS: Machine learning as a service. In 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA). IEEE, 896–902.
[2] Suman Raj, Harshil Gupta, and Yogesh Simmhan. 2023. Scheduling dnn inferencing on edge and cloud for personalized uav fleets. In IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid).
[3] T. da Silva Barros, F. Giroire, R. Aparicio-Pardo, S. Perennes, and E. Natale, “Scheduling with Fully Compressible Tasks: Application to Deep Learning Inference with Neural Network Compression,” in CCGRID 2024 - 24th IEEE/ACM international Symposium on Cluster, Cloud and Internet Computing, (Philadelphia, United States), IEEE/ACM, May 2024.
https://ieeexplore.ieee.org/document/10701353
[4] T. da Silva Barros, D. Ferre, F. Giroire, R. Aparicio-Pardo, and S. Perennes, “Scheduling Machine Learning Compressible Inference Tasks with Limited Energy Budget,” in ACM Digital Library, vol. 32 of ICPP ’24: Proceedings of the 53rd International Conference on Parallel Processing, (Gotland, Sweden), pp. 961 – 970, ACM, Aug. 2024.
https://hal.science/hal-04676376/document
[5] J. McDonald, B. Li, N. Frey, D. Tiwari, V. Gadepally, and S. Samsi. 2022. Great power, great responsibility: Recommendations for reducing energy for training language models. arXiv preprint arXiv:2205.09646 (2022).
[6] Radosvet Desislavov, Fernando Martínez-Plumed, and José Hernández-Orallo. 2023. Trends in AI inference energy consumption: Beyond the performance-vs- parameter laws of deep learning. Sustainable Computing: Informatics and Systems 38 (2023), 100857.
[7] Varghese, B., Wang, N., Barbhuiya, S., Kilpatrick, P., & Nikolopoulos, D. S. (2016, November). Challenges and opportunities in edge computing. In 2016 IEEE international conference on smart cloud (SmartCloud) (pp. 20-26). IEEE.
[8] F. Giroire, N. Huin, A. Tomassilli, and S. Pérennes. Data center scheduling with network tasks. in IEEE Transactions on Networking (ToN), 2025.
https://ieeexplore.ieee.org/document/11048391
Who?
Name: Fabrice Huet, Dino Lopez Pacheco
Mail: fabrice.huet@univ-cotedazur.fr, dino.lopez@univ-cotedazur.fr
Where?
Place of the project: I3S Lab
Address: 2000 route des lucioles
Team: Scale/SigNet
Web page:
Pre-requisites if any:
• Python
• Networking
• System and Containers
Description:
Information and Communication Technology (ICT) energy consumption represents
between 4% and 9% of worldwide energy consumption, which corresponds to between
1.4% and 4% of Green House Gas (GHG) emissions. Energy and GHG emissions from
ICT are also growing year by year [1]. Hence, decreasing the GHG emission footprint
from ICT is crucial to tackle the current global warming trend.
One of the main factors behind the ever-growing energy demand of ICT is the incredibly
high popularity of generative Artificial Intelligence (AI). Indeed, as the popularity of
generative AI increases, so does the number of models behind it, as well as their
size (measured in number of parameters), which is believed to have a direct impact on
model performance.
The increasing number of larger AI models has driven both the increase in the number of
Data Center (DC) facilities and the expansion of existing ones, leading to huge energy
demands [3].
Some reports exist on the energy consumption of training and inference of very large
language models (such as BLOOM 176B [2]). A PFE and an internship have already explored the power usage of existing models as a function
of the model type and the number of parameters. As a result, we now have a framework to quickly set up experiments and measure power usage directly on the GPU and the PDU (a minimal power-sampling sketch is given after the list below). The goal of this PFE is to advance this work by analyzing power usage as a function of the accuracy and quality of the results. In this project, the student will be
required to:
1. Perform a state-of-the-art analysis of benchmarks for generative AI with a focus on accuracy
2. Expand the existing experimental framework to run existing benchmarks
3. Perform extensive experiments
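For the GPU side of the measurements, power can be sampled with NVIDIA's NVML bindings (the pynvml package, assumed to be installed); the sketch below only illustrates this sampling and leaves out the PDU measurements and the existing framework.
```python
# Sketch: sample instantaneous GPU power draw while a workload runs
# (requires an NVIDIA GPU and the pynvml package).
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)   # first GPU

samples = []
t_end = time.time() + 10                        # sample for 10 seconds
while time.time() < t_end:
    milliwatts = pynvml.nvmlDeviceGetPowerUsage(handle)
    samples.append(milliwatts / 1000.0)
    time.sleep(0.1)

pynvml.nvmlShutdown()
print(f"mean power: {sum(samples) / len(samples):.1f} W over {len(samples)} samples")
```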
This project can be pursued as an internship.
Useful Information/Bibliography:
[1] Erol Gelenbe. “Electricity Consumption by ICT: Facts, trends, and measurements.”
Ubiquity 2023, August, Article 1, 15 pages. https://doi.org/10.1145/3613207
[2] Alexandra Sasha Luccioni, Sylvain Viguier, and Anne-Laure Ligozat. “Estimating the
carbon footprint of BLOOM, a 176B parameter language model.” J. Mach. Learn. Res. 24,
1, Article 253 (January 2023), 15 pages.
[3] The Washington Post, “AI is exhausting the power grid. Tech firms are seeking a miracle
solution”. Last access: sept 2024.
https://www.washingtonpost.com/business/2024/06/21/artificial-intelligence-nuclearfusion-climate/
Who?
Name: Chadi Barakat and Thierry Turletti
Mail: Chadi.Barakat@inria.fr and Thierry.Turletti@inria.fr
Web page: https://team.inria.fr/diana/team-members/chadi/ and https://team.inria.fr/diana/team-members/thierry-turletti/
Where?
Place of the project: Inria centre at Université Côte d'Azur
Address: 2004, route des lucioles, 06902 Sophia Antipolis, France
Team: Diana
Web page: https://team.inria.fr/diana/
Pre-requisites if any:
Solid background in networking and wireless technologies, strong skills in system programming (C/C++, python, shell).
Description:
Cellular 5G networks deploy various mechanisms to combat wireless errors, ranging from Hybrid ARQ retransmissions to correct corrupted frames, to adaptive modulation to improve link quality and reduce the block error rate. These mechanisms are tuned based on real-time reports from UEs (User Equipments) about their network conditions (e.g., channel state information, block error rate, SINR) [1,2,3,4]. Given the diversity of wireless environments experienced by different UEs, and the various forms noise can take (e.g., affecting only parts of the spectrum), it is crucial that UEs are allocated the best quality subcarriers based on their individual conditions to maximize performance, independently of the conditions experienced by others. It is also essential that the allocation of frames to resource blocks is optimized to maximize total throughput across the wireless link, under any distribution of noise across UEs and subcarriers.
The goal of this TER is to understand how error recovery mechanisms are implemented in current open-source platforms such as OAI [5] and srsRAN [6], and to carry out experiments evaluating the effectiveness of these mechanisms in achieving their objectives. The work will involve conducting experiments on our SLICES-RI platform [7] and in our R2Lab anechoic chamber [8]. By introducing artificial wireless noise in different forms and running Internet traffic in both directions with multiple UEs, we will assess how efficiently the deployed mechanisms exploit the wireless link in terms of achieved throughput, and whether there is any bias in resource allocation that may cause fairness issues among UEs. We will also evaluate whether the deployed link monitoring mechanisms are effective in providing both the base station and the UEs with an accurate view of the wireless channel and the condition of the different subcarriers composing it, for example by using the tool described in [9].
This TER can be extended into an internship for motivated students who demonstrate strong skills in conducting the planned experiments and analyzing the results.
Useful Information/Bibliography:
[1] Sesha Sai Rakesh Jonnavithula, Ish Kumar Jain, and Dinesh Bharadia. 2024. MIMO-RIC: RAN Intelligent Controller for MIMO xApps. In Proceedings of the 30th Annual International Conference on Mobile Computing and Networking (ACM MobiCom '24).
[2] Xenofon Foukas, Bozidar Radunovic, Matthew Balkwill, and Zhihua Lai. 2023. Taking 5G RAN Analytics and Control to a New Level. In Proceedings of the 29th Annual International Conference on Mobile Computing and Networking (ACM MobiCom '23).
[3] N. Saha, N. Shahriar, M. Sulaiman, N. Limam, R. Boutaba and A. Saleh, "Monarch: Monitoring Architecture for 5G and Beyond Network Slices," in IEEE Transactions on Network and Service Management, vol. 22, no. 1, pp. 777-790, Feb. 2025.
[4] A. Kak, V. -Q. Pham, H. -T. Thieu and N. Choi, "RANSight: Programmable Telemetry for Next-Generation Open Radio Access Networks," GLOBECOM 2023 - 2023 IEEE Global Communications Conference, Kuala Lumpur, Malaysia, 2023, pp. 5391-5396.
[5] OpenAirInterface, https://openairinterface.eurecom.fr/
[6] srsRAN, Project Open Source RAN, https://www.srslte.com/
[7] SLICES-RI, Scientific Large Scale Infrastructure for Computing/Communication Experimental Studies, https://www.slices-ri.eu/
[8] R2lab anechoic chamber, https://r2lab.inria.fr/
[9] Sesha Sai Rakesh Jonnavithula, Ish Kumar Jain, and Dinesh Bharadia. 2024. BeamArmor5G: Demonstrating MIMO Anti-Jamming and Localization with srsRAN 5G Stack. In Proceedings of the 30th Annual International Conference on Mobile Computing and Networking (ACM MobiCom '24).
Who?
Name: Thierry Turletti and Walid Dabbous and Chadi Barakat
Mail: Thierry.Turletti@inria.fr and Walid.Dabbous@inria.fr and Chadi.Barakat@inria.fr
Web page: https://team.inria.fr/diana/team-members/thierry-turletti/ and https://team.inria.fr/diana/team-members/walid-dabbous/ and https://team.inria.fr/diana/team-members/chadi/
Where?
Place of the project: Inria centre at Université Côte d'Azur
Address: 2004, route des lucioles, 06902 Sophia Antipolis, France
Team: Diana
Web page: https://team.inria.fr/diana/
Pre-requisites if any:
Solid background in networking and wireless technologies, strong skills in system programming (C/C++, python, shell).
Description:
Ultra-Reliable Low-Latency Communications (URLLC) are critical for emerging 5G applications such as industrial automation, autonomous vehicles, and remote medical applications. Achieving millisecond-level end-to-end latency requires a deep understanding of both 5G protocols and real-time software optimization. This TER will focus on state-of-the-art open-source 5G stacks, specifically OAI5G [3] and srsRAN [4], to study, implement, and evaluate latency reduction techniques. The first step is a literature and state-of-the-art review, with particular attention to two recent studies [1,2]. The student will:
• Analyze and reproduce the results reported in these studies.
• Identify performance bottlenecks in both OAI5G and srsRAN stacks.
• Explore how to possibly combine the various optimizations proposed to maximize performance of 5G stacks.
• Conduct experiments on a state-of-the-art 5G testbed, measuring real-time network performance and end-to-end latency improvements.
Expected outcomes include a comprehensive evaluation of latency optimization techniques, a comparative study between OAI5G and srsRAN, and practical guidelines for applying these methods to achieve URLLC-grade performance in open-source 5G networks.
This TER is ideal for students passionate about wireless communications, real-time systems, and experimental 5G research, combining theoretical study with hands-on implementation and performance evaluation. It can be extended into an internship for motivated students who demonstrate strong skills in conducting the planned experiments and analyzing the results. This TER is proposed in the context of the SLICES [5] European project and the national Priority Research Programme and Equipment (PEPR) on 5G, 6G and Networks of the Future [6].
Useful Information/Bibliography:
[1] T. Tsourdinis, N. Makris, T. Korakis and S. Fdida, "Demystifying URLLC in Real-World 5G Networks: An End-to-End Experimental Evaluation," GLOBECOM 2024 - 2024 IEEE Global Communications Conference, Cape Town, South Africa, 2024, pp. 2954-2959, doi: 10.1109/GLOBECOM52923.2024.10901776.
[2] Gong, A., Maghsoudnia, A., Cannatà, R., Vlad, E., Lomba, N. L., Dumitriu, D. M., & Hassanieh, H. (2025, September). Towards URLLC with Open-Source 5G Software. In Proceedings of the 1st Workshop on Open Research Infrastructures and Toolkits for 6G (pp. 7-14). https://doi.org/10.1145/3750718.3750743
[3] OpenAirInterface, https://openairinterface.org/
[4] srsRAN project https://www.srslte.com/
[5] ESFRI SLICES European project, https://www.slices-ri.eu/what-is-esfri/
[6] PEPR on 5G, 6G and Networks of the Future, https://pepr-futurenetworks.fr/en/home/
What?
Title: Optimizing Linux and hardware for Real-Time Network Workloads
The performance of computer workloads results from complex interactions between hardware, software, and operating system behavior. By default, operating systems such as Linux are designed to be general-purpose, balancing flexibility, performance, and energy efficiency. However, this generality often results in suboptimal performance. Significant speedups, particularly on NUMA architectures, can be achieved by carefully tuning kernel and hardware parameters to better match the specific characteristics of the workload. This issue becomes particularly critical when dealing with real-time network traffic, which demands high processing power, low latency, and high bandwidth.
This project will investigate performance optimization for real-time network processing, with a particular focus on workloads such as 5G RAN traffic and network security applications. The work will proceed in four main stages:
1. Benchmark Development – Design and implement a reproducible synthetic benchmark that models real-time network traffic.
2. Bottleneck Analysis – Build and apply performance analysis tools to identify system bottlenecks (e.g., CPU saturation, queueing delays, shared bus I/O constraints).
3. System Tuning – Derive and apply kernel and hardware parameter adjustments to mitigate the identified bottlenecks, and evaluate the resulting performance improvements compared to default configurations.
4. Multi-Workload Scenarios – Extend the study to environments where multiple workloads share resources under Kubernetes, assessing the trade-offs and optimization opportunities in such settings.
The project combines benchmarking, systems performance analysis, and applied optimization, providing practical insights into how Linux systems can be tailored for demanding real-time network workloads. Students will have access to different computer platforms and operating systems. Extension to other types of workload (e.g., memory intensive) will be considered.
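To give a flavour of stages 1-2, a very small timer-wakeup benchmark (in the spirit of cyclictest, written here in Python for brevity) can already expose scheduling jitter before and after tuning; it is only an illustrative sketch, and real-time traffic generators would replace it in the project.
```python
# Sketch: measure timer-wakeup jitter, a rough proxy for scheduling latency.
# Run it before/after tuning (CPU pinning, governor, isolation...) and compare.
import time
import statistics

PERIOD_NS = 1_000_000        # ask to sleep 1 ms each iteration
latencies_us = []

for _ in range(5_000):
    t0 = time.perf_counter_ns()
    time.sleep(PERIOD_NS / 1e9)
    elapsed = time.perf_counter_ns() - t0
    latencies_us.append((elapsed - PERIOD_NS) / 1_000)   # wakeup lateness in microseconds

latencies_us.sort()
print(f"median: {statistics.median(latencies_us):.1f} us, "
      f"p99: {latencies_us[int(0.99 * len(latencies_us))]:.1f} us, "
      f"max: {latencies_us[-1]:.1f} us")
```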
Who?
Name: Walid Dabbous
Mail: walid.dabbous@inria.fr
Web page: https://team.inria.fr/diana/team-members/walid-dabbous/
Name: Damien Saucez
Mail: damien.saucez@inria.fr
Web page: https://team.inria.fr/diana/team-members/damien-saucez/
Where?
Place of the project: DIANA team, Inria, Sophia Antipolis
Address: 2004 route des Lucioles
Team: DIANA
Web page: https://team.inria.fr/diana/
Pre-requisites if any:
Familiar with Linux, basic system performance knowledge, highly
motivated to work in a research environment and excited to tackle hard problems.
Description:
This PER requires a good understanding of the computer and Linux kernel architectures.
This PER will continue for motivated students on an internship.
Useful Information/Bibliography:
The Linux kernel
https://linux-kernel-labs.github.io/refs/heads/master/lectures/intro.html
• Title: Solving Combinatorial Problems using Positional Encoding
• Who ? With Pierre Pereira, 2nd year PhD student at COATI, Inria. Webpage: https://pierrot-lc.dev. Mail: pierre.pereira@inria.fr. Co-supervised by Frédéric Giroire, my PhD supervisor.
• Where? At COATI, Inria Sophia Antipolis.
• Pre-requisites: This project involves a lot of experimentation. The student is required to be proficient in Python and PyTorch.
• Description: Designing heuristics to solve combinatorial problems can be time-consuming and is generally highly specific to the problem at hand. In contrast, neural networks are a promising way of applying the same method to widely different types of problems, simply by adapting the training dataset. The goal of this project is to apply a recent learning approach to a wide variety of combinatorial optimization (CO) problems to validate its general applicability across the domain. This approach has been successfully applied to the TSP, leading to a paper currently under review for ICLR 2026.
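For background, the sketch below shows the classical sinusoidal positional encoding from the Transformer literature in PyTorch; the encoding actually used in the cited work may differ, so treat this only as an illustration of the general notion.
```python
import math
import torch

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Standard Transformer-style positional encoding:
    PE[pos, 2i] = sin(pos / 10000^(2i/d)), PE[pos, 2i+1] = cos(pos / 10000^(2i/d)).
    Assumes d_model is even."""
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                         * (-math.log(10000.0) / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe

# e.g. encode the positions of 50 items with 128-dimensional features
print(sinusoidal_positional_encoding(50, 128).shape)   # torch.Size([50, 128])
```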
• Bibliography: Pierre Pereira, Frederic Giroire, Emanuele Natale. Solving the Traveling Salesman Problem with Positional Encoding. 2025. https://hal.science/hal-05295614
Supervisors: Sara Alouf (Inria NEO team), Kyrylo Tymchenko (Inria NEO team)
Mail: sara.alouf@inria.fr, kyrylo.tymchenko@inria.fr
Place: Inria Sophia Antipolis
Pre-requisites: Knowledge of networking and probability theory; solid programming skills are a plus to continue during the internship
Address: 2004 route des Lucioles, 06902 Sophia Antipolis
The project may be followed by an internship for interested students.
Description:
In recent years, there has been a steady evolution of edge computing, storage, content delivery, and AI services, all moving closer to end-users [1]. This shift is primarily driven by the need to reduce latency, limit bandwidth consumption, and improve the overall throughput and responsiveness of the distributed systems.
In edge environments, caching plays a vital role in reducing latency and avoiding redundant requests to origin servers. However, due to intense performance demands, high request volumes, and inevitable hardware failures, server unavailability events are a common occurrence [2]. For example, traces from Akamai’s CDN indicate an average of 5.6 such events per cluster per day [3]. These incidents are typically short-lived but highly disruptive, triggering spikes in cache miss rates and increasing average latency [3]. The result is a degraded quality of service (QoS) and a diminished user experience.
A further consequence of transient server failures is load imbalance, often due to bucket assignment algorithms that co-locate requests by vendor or content group [4]. When one or more servers go offline, their load is redistributed unevenly, increasing eviction rates and accelerating SSD wear on the remaining nodes. This reduces efficiency and increases maintenance costs.
To mitigate these problems, selective replication—based on content popularity and size—is widely used in industry [5]. Replication reduces cache miss spikes but comes at a high storage cost and can worsen load imbalance, limiting scalability and cost-effectiveness.
Erasure coding is a widely used, space-efficient alternative to replication [6, 7, 8]. Traditionally used in storage systems [9] and in network communication for error correction [10], erasure coding has recently become more viable in caching and content delivery contexts due to advances in computational optimization [3, 11, 12]. Studies show that combining erasure coding with smart fragment placement and load-balancing algorithms can significantly improve fault tolerance, reducing cache miss spikes and I/O imbalance.
Yet, the application of erasure coding in edge environments is still relatively underexplored, leaving many optimization opportunities open. For example, placement algorithms that minimize both read and write imbalance could lower latency while reducing SSD wear. Similarly, adaptive coding schemes that adjust redundancy based on content popularity could balance performance and storage overhead dynamically, leaving more space for useful data.
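To illustrate the basic idea of erasure coding in its simplest single-parity form (far simpler than the codes used in [3] or [6]), the sketch below splits an object into k data fragments plus one XOR parity fragment, so that any single lost fragment can be rebuilt from the others.
```python
def encode(data: bytes, k: int):
    """Split `data` into k equal data fragments plus one XOR parity fragment
    (the simplest erasure code: tolerates the loss of any ONE fragment)."""
    size = -(-len(data) // k)                       # ceiling division
    data = data.ljust(size * k, b"\x00")            # pad so fragments are equal-sized
    fragments = [data[i * size:(i + 1) * size] for i in range(k)]
    parity = bytearray(size)
    for frag in fragments:
        for i, byte in enumerate(frag):
            parity[i] ^= byte
    return fragments + [bytes(parity)]

def recover(pieces, missing_index):
    """Rebuild the piece at `missing_index` (data or parity) by XORing the others."""
    size = len(next(p for p in pieces if p is not None))
    rebuilt = bytearray(size)
    for idx, piece in enumerate(pieces):
        if idx == missing_index:
            continue
        for i, byte in enumerate(piece):
            rebuilt[i] ^= byte
    return bytes(rebuilt)

pieces = encode(b"cached object payload", k=4)      # 4 data fragments + 1 parity
lost = 2
original_fragment = pieces[lost]
pieces[lost] = None                                 # simulate one unavailable server
assert recover(pieces, lost) == original_fragment
```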
The project will begin with a study of the current state of the art in the literature. The next step will be to explore the strategies to improve load balancing, storage overhead and fault tolerance. Finally, the student will develop a framework to test the performance of these strategies under different types of workloads.
References
[1] Weisong Shi, Jie Cao, Quan Zhang, Youhuizi Li, and Lanyu Xu. Edge computing: Vision and challenges. IEEE Internet of Things Journal, 3(5):637–646, 2016.
[2] Erik Nygren, Ramesh K Sitaraman, and Jennifer Sun. The Akamai network: a platform for high-performance internet applications. ACM SIGOPS Operating Systems Review, 44(3):2–19, 2010.
[3] Juncheng Yang, Anirudh Sabnis, Daniel S Berger, KV Rashmi, and Ramesh K Sitaraman. C2DN: How to harness erasure codes at the edge for efficient content delivery. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22), pages 1159–1177, 2022.
[4] Bruce M Maggs and Ramesh K Sitaraman. Algorithmic nuggets in content delivery. ACM SIGCOMM Computer Communication Review, 45(3):52–66, 2015.
[5] Ganesh Ananthanarayanan, Sameer Agarwal, Srikanth Kandula, Albert Greenberg, Ion Stoica, Duke Harlan, and Ed Harris. Scarlett: coping with skewed content popularity in MapReduce clusters. In Proceedings of the Sixth Conference on Computer Systems, pages 287–300, 2011.
[6] Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin. Erasure coding in Windows Azure Storage. In 2012 USENIX Annual Technical Conference (USENIX ATC 12), pages 15–26, 2012.
[7] Saurabh Kadekodi, Francisco Maturana, Suhas Jayaram Subramanya, Juncheng Yang, KV Rashmi, and Gregory R Ganger. PACEMAKER: Avoiding HeART attacks in storage clusters with disk-adaptive redundancy. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20), pages 369–385, 2020.
[8] Subramanian Muralidhar, Wyatt Lloyd, Sabyasachi Roy, Cory Hill, Ernest Lin, Weiwen Liu, Satadru Pan, Shiva Shankar, Viswanath Sivakumar, Linpeng Tang, et al. f4: Facebook's warm BLOB storage system. In 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), pages 383–398, 2014.
[9] Jad Darrous and Shadi Ibrahim. Understanding the performance of erasure codes in Hadoop distributed file system. In Proceedings of the Workshop on Challenges and Opportunities of Efficient and Performant Storage Systems, pages 24–32, 2022.
[10] Bernard Fong, Predrag B Rapajic, Guan Y Hong, and Alvis Cheuk M Fong. Forward error correction with Reed-Solomon codes for wearable computers. IEEE Transactions on Consumer Electronics, 49(4):917–921, 2003.
Name: Frédéric Giroire and Francesco Diana
Mail: frederic.giroire@inria.fr
Web page: https://www-sop.inria.fr/members/Frederic.Giroire/
Place of the project:
Address: Inria, 2004 route de Lucioles, SOPHIA ANTIPOLIS
Team: COATI (common project Inria/I3S)
Web page: https://team.inria.fr/coati/
Pre-requisites:
Knowledge in networking and machine learning.
Python.
Description:
Context:
The exponential advances in Machine Learning (ML) are leading to the deployment of Machine Learning models in constrained and embedded devices, to solve complex inference tasks. At the moment, to serve these tasks, there exist two main solutions: run the model on the end device, or send the request to a remote server. However, these solutions may not suit all the possible scenarios in terms of accuracy or inference time, requiring alternative solutions.
Cascade inference is an important technique for performing real-time and accurate inference given limited computing resources such as MEC servers. It combines two or more models to perform inference: typically a highly accurate but expensive model with a less accurate but fast model, and determines whether the expensive model should make a prediction or not based on the confidence score of the fast model. A large body of work has exploited this solution. The first to propose a sequential combination of models were [1], for face detection tasks; then, in the context of deep learning, cascades have been applied to numerous tasks [2,3].
Goal:
Our project is to use cascade models in the context of Edge Computing to improve the delay and reduce the resource usage of ML inference tasks at the edge. Of crucial importance for cascade models is the confidence of the fast model. Indeed, if the prediction of the first model is used but wrong, it may lead to a low accuracy of the cascade, even if the accuracy of the best model is very high. Similarly, if the confidence of the first model is too low, its predictions will never be used, the computation will be higher than using only the second model by itself, and we will in addition consume unnecessary network resources and incur higher delays than necessary. Researchers have proposed methods to calibrate such systems [4]. However, they have not explored the choice of the loss function of such systems in depth.
In this project, we will explore the impact of the choice of the loss function of the fast models of cascade networks. Indeed, such models do not have the same goal as the global system, as they should only act as a first filter.
This project will focus on classification tasks.
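Since [4] frames the problem as confidence calibration, a common baseline the project could compare against is temperature scaling of the fast model's logits; below is a minimal PyTorch sketch of that baseline (it is not the method of [4] nor the loss functions to be explored here).
```python
import torch
import torch.nn.functional as F

def fit_temperature(val_logits: torch.Tensor, val_labels: torch.Tensor) -> float:
    """Fit a single temperature T on held-out logits of the fast model so that
    softmax(logits / T) is better calibrated (standard temperature scaling)."""
    log_t = torch.zeros(1, requires_grad=True)          # optimize log(T) so that T > 0
    optimizer = torch.optim.LBFGS([log_t], lr=0.1, max_iter=100)

    def closure():
        optimizer.zero_grad()
        loss = F.cross_entropy(val_logits / log_t.exp(), val_labels)
        loss.backward()
        return loss

    optimizer.step(closure)
    return log_t.exp().item()

# At inference time, the fast model's confidences become softmax(logits / T),
# and the cascade's routing threshold is applied to these calibrated scores.
```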
Bibliography:
[1] Viola, P., & Jones, M. (2001, December). Rapid object detection using a boosted cascade of simple features. In Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition. CVPR 2001 (Vol. 1, pp. I-I). IEEE.
[2] Wang, X., Kondratyuk, D., Christiansen, E., Kitani, K. M., Alon, Y., & Eban, E. (2020). Wisdom of committees: An overlooked approach to faster and more accurate models. arXiv preprint arXiv:2012.01988.
[3] Wang, X., Luo, Y., Crankshaw, D., Tumanov, A., Yu, F., & Gonzalez, J. E. (2017). Idk cascades: Fast deep learning by learning not to overthink. arXiv preprint arXiv:1706.00885.
[4] Enomoro, S., & Eda, T. (2021, May). Learning to cascade: Confidence calibration for improving the accuracy and computational cost of cascade inference systems. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 8, pp. 7331-7339).
Title: Investigating Vulnerabilities and Opportunities of GPT‑based AI Assistants in the Peer‑Review Process
Name: Chuan Xu
Mail: chuan.xu@inria.fr
Web page: sites.google.com/view/chuanxu
Place of the project: Inria
Address: 2004, route des Lucioles, Valbonne, France
Team: COATI
Webpage: https://team.inria.fr/coati/
Pre-requisites: Experience with machine learning frameworks such as PyTorch; familiarity with Hugging Face
Description:
The machine‑learning community is increasingly adopting AI review assistants to cope with a growing volume of submissions and with reviews produced under time pressure. These systems can improve throughput and consistency, but they may also introduce new vulnerabilities (e.g., being misled by adversarial text) or hidden biases. This PFE studies both risks and potential improvements introduced by GPT‑style review aids.
Objectives
1. Map and synthesize the current AI‑assisted peer‑review workflows and the relevant literature.
2. Empirically evaluate how GPT‑based reviewers respond to changes in a paper’s content and structure.
3. Assess vulnerabilities and possible misuse (e.g., hiding inconsistencies, “hacking” paper wording to obtain better reviews).
Research Questions :
· Which sections of a paper (abstract, introduction, results, conclusion, figures) most strongly influence GPT‑based review judgments?
· To what extent can inconsistencies or flawed arguments be concealed from GPT‑based reviewers while still appearing convincing?
· Can small, targeted changes in phrasing or structure increase the likelihood of a favorable AI review for unchanged scientific content?
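To illustrate how objectives 2 and 3 could be approached empirically, a local open-weight model can be prompted as a reviewer through the Hugging Face pipeline API and queried with two phrasings of the same content; the model name below is only an example, and any instruction-tuned model could be substituted.
```python
# Sketch: prompt a local instruction-tuned model as a "reviewer" and compare
# its judgment on two phrasings of the same content (model name is only an example).
from transformers import pipeline

reviewer = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

def review(abstract: str) -> str:
    prompt = (
        "You are a peer reviewer. Give a score from 1 (reject) to 10 (accept) "
        "and a two-sentence justification for the following abstract.\n\n"
        f"Abstract: {abstract}\n\nReview:"
    )
    out = reviewer(prompt, max_new_tokens=150, do_sample=False)
    return out[0]["generated_text"][len(prompt):]

# Toy inputs for illustration: same claim, plain vs. embellished wording.
original = "We propose a method that improves accuracy by 2% on one dataset."
reworded = "We propose a novel, highly effective method with consistent gains."
print(review(original))
print(review(reworded))
```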
References:
Overview of AI Review System in AAAI 2026 https://aaai.org/wp-content/uploads/2025/08/FAQ-for-the-AI-Assisted-Peer-Review-Process-Pilot-Program.pdf
# What?
**Title:** Federated Learning with Heterogeneous Architectures
---
# Who?
**Name:** Giovanni Neglia, Francesco Diana
**Mail:** giovanni.neglia@inria.fr, francesco.diana@inria.fr
**Web page:** [http://www-sop.inria.fr/members/Giovanni.Neglia/](http://www-sop.inria.fr/members/Giovanni.Neglia/)
---
# Where?
**Place of the project:** Inria
**Address:** 2004 route des Lucioles, 06902 Sophia Antipolis
**Team:** NEO team
**Web page:** [https://team.inria.fr/neo/](https://team.inria.fr/neo/)
**Pre-requisites if any:**
The ideal candidate should like math and analytical reasoning and have strong programming skills.
A background on machine learning would be a plus.
---
# Description
Federated Learning (FL) is a machine learning paradigm where a population of clients collaboratively trains a global model without exchanging their private data. Instead, each client trains a local model on its dataset and only sends updates to a central server, which aggregates them to produce an improved global model. The original FL framework was introduced in [1].
Standard FL assumes that all clients train the **same model architecture**. In practice, however, clients are highly heterogeneous: some may be smartphones with limited memory and computation, while others may be powerful servers or desktops. This system heterogeneity makes it difficult to impose a single model architecture across all devices.
Recent approaches, such as **HeteroFL** [2] and **FjORD** [3], address this challenge by training **nested submodels** of different widths: smaller clients work on lightweight models, while larger clients train wider ones. Other methods, such as **FedProto** [4], aggregate information across heterogeneous models using prototypes rather than direct parameter averaging.
In this preparatory project, we want to:
1. **Survey** the main techniques proposed in the literature for federated learning with heterogeneous architectures.
2. **Reproduce a simple experiment** from HeteroFL or FjORD on a dataset such as CIFAR-10 or MNIST.
3. **Analyze** how accuracy varies with submodel size and resource constraints.
The goal is not to develop new methods, but to provide a first understanding of existing solutions and hands-on experience. This work will prepare the ground for a longer internship and potentially an industrial PhD thesis with Scaleway [6] on advanced algorithms for model-heterogeneous FL.
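The core idea behind the nested submodels of HeteroFL/FjORD can be illustrated by keeping only the first channels of each layer; the sketch below (not the papers' actual implementation) extracts a narrower copy of a linear layer, as a weak client would train.

```python
import torch
import torch.nn as nn

def nested_sublayer(layer: nn.Linear, width_ratio: float) -> nn.Linear:
    """Return a narrower copy of `layer` that keeps only the first output
    units, illustrating the nested-width submodels of HeteroFL/FjORD."""
    out_features = max(1, int(layer.out_features * width_ratio))
    sub = nn.Linear(layer.in_features, out_features, bias=layer.bias is not None)
    with torch.no_grad():
        sub.weight.copy_(layer.weight[:out_features, :])
        if layer.bias is not None:
            sub.bias.copy_(layer.bias[:out_features])
    return sub

full = nn.Linear(256, 128)
quarter = nested_sublayer(full, width_ratio=0.25)   # what a weak client would train
print(quarter)                                      # Linear(in_features=256, out_features=32, ...)
```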
---
# Useful Information / Bibliography
[1] H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, Blaise Aguera y Arcas. *Communication-Efficient Learning of Deep Networks from Decentralized Data.* AISTATS 2017.
[2] Enmao Diao, Jie Ding, Vahid Tarokh. *HeteroFL: Computation and Communication Efficient Federated Learning for Heterogeneous Clients.* ICLR 2021.
[3] Samuel Horváth, Lê-Nguyên Hoang, Marco Canini, Peter Richtárik. *FjORD: Fair and Accurate Federated Learning under Heterogeneous Targets with Ordered Dropout.* NeurIPS 2021.
[4] Y. Tan, G. Long, J. Jiang, T. Zhou, X. Zhang. *FedProto: Federated Prototype Learning across Heterogeneous Clients.* AAAI 2022.
[5] Mang Ye, et al. *Heterogeneous Federated Learning: State-of-the-art and Research Challenges.* ACM Computing Surveys, 2023.
[6] Scaleway, https://www.scaleway.com/
# What?
**Title:** Privacy in Split Learning
---
# Who?
**Name:** Giovanni Neglia, Jingye Wang
**Mail:** giovanni.neglia@inria.fr, jingye.wang@inria.fr
**Web page:** [http://www-sop.inria.fr/members/Giovanni.Neglia/](http://www-sop.inria.fr/members/Giovanni.Neglia/)
---
# Where?
**Place of the project:** Inria
**Address:** 2004 route des Lucioles, 06902 Sophia Antipolis
**Team:** NEO team
**Web page:** [https://team.inria.fr/neo/](https://team.inria.fr/neo/)
**Pre-requisites if any:**
The ideal candidate should have strong programming skills (Python, PyTorch/TensorFlow).
An interest in machine learning and data privacy would be a plus.
---
# Description
Split Learning (SL) is a distributed training paradigm where a model is split between a client and a server.
The client executes the first layers and sends the resulting activations (the “smashed data”) to the server,
which completes the forward pass and backpropagation, returning gradients to the client.
SL was introduced in [1] as a way to protect privacy by avoiding the transfer of raw data.
However, recent works show that intermediate activations may still leak information, enabling reconstruction of client data [2,3].
In this short preparatory project, we want to explore a simple idea: **what if the client does not reveal the weights of its local head?**
If the server cannot align with the client’s architecture, reconstruction attacks may become less effective.
The student will:
1. **Survey briefly** the main privacy attacks on split learning.
2. **Set up a simple split learning pipeline** (e.g., on MNIST or CIFAR-10).
3. **Reproduce a basic reconstruction attack** from the literature.
4. **Test a hidden-head variant** where the client’s local layers remain private, and compare reconstruction quality with the baseline.
The goal is not to provide a new defense, but to collect **initial experimental evidence** on whether keeping the client head private can reduce leakage.
This short project will prepare the ground for a longer internship and potentially an industrial thesis with Hivenet [5].
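For step 2 of the list above, a split-learning training loop can be simulated in a single process by keeping the client head and the server tail as two separate modules; the minimal sketch below keeps everything local (no actual network transfer) and uses dummy MNIST-shaped data.

```python
import torch
import torch.nn as nn

# Toy split: the client owns the first layers ("head"), the server owns the
# rest ("tail"). In a real deployment the smashed data and the returned
# gradients would cross the network; here everything stays local.
client_head = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU())
server_tail = nn.Linear(128, 10)

optimizer = torch.optim.SGD(
    list(client_head.parameters()) + list(server_tail.parameters()), lr=0.1
)
loss_fn = nn.CrossEntropyLoss()

def split_training_step(x, y):
    smashed = client_head(x)          # client forward pass, "sends" the activations
    logits = server_tail(smashed)     # server completes the forward pass
    loss = loss_fn(logits, y)
    optimizer.zero_grad()
    loss.backward()                   # gradients flow back through the smashed data
    optimizer.step()
    return loss.item()

x = torch.randn(32, 1, 28, 28)        # dummy batch standing in for MNIST images
y = torch.randint(0, 10, (32,))
print(split_training_step(x, y))
```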
---
# Useful Information / Bibliography
[1] P. Vepakomma, O. Gupta, T. Swedish, R. Raskar. *Split learning for health: Distributed deep learning without sharing raw patient data.* arXiv:1812.00564, 2018.
[2] X. Xu et al. *A Stealthy Wrongdoer: Feature-Oriented Reconstruction Attack against Split Learning (FORA).* CVPR 2024.
[3] X. Zhu et al. *Passive Inference Attacks on Split Learning via Adversarial Representation Alignment.* NDSS 2025.
[4] P. Vepakomma et al. *NoPeek: Information leakage reduction to share activations in distributed deep learning.* ICDMW 2020.
[5] Hivenet, https://www.hivenet.com
Title: Modelling the accuracy vs. power consumption tradeoff of reasoning AI models at test time
Advisor: APARICIO PARDO Ramon, VANDI Anna
Mail: raparicio@i3s.unice.fr
Web page: http://www.i3s.unice.fr/~raparicio/
Place of the project:
Address: I3S: Laboratoire d'Informatique, Signaux et Systèmes de Sophia Antipolis
2000, route des Lucioles - Les Algorithmes - bât. Euclide B, 06900 Sophia Antipolis
Team: Signet
Web page: http://signet.i3s.unice.fr
Pre-requisites if any:
- Python language (absolutely)
- Deep Learning libraries (like TensorFlow [6], Keras, rllab, OpenAI Gym) (recommended)
Theory:
- Machine Learning, Data Science, particularly Neural Networks theory (recommended)
- System administration
Description:
Novel reasoning AI models represent an evolution and a specialisation of LLMs. In a few words, a reasoning model [1] is an LLM that (i) excels at more complex reasoning tasks, such as solving puzzles, riddles, and mathematical proofs; and (ii) includes a “thought” or “thinking” process as part of its response, the so-called chain of thought (CoT): the sequence of intermediate steps required to reach the final answer. When these models make use of longer chains of thought, they are expected to provide “better solutions”: “the more you think, the better you think.” But this additional “thinking” (generally known as Test-Time Compute, TTC) comes with additional energy consumption, since more inference time in an AI model implies more energy consumption [2].
At the same time, we know [2,3] that the accuracy of computer vision AI models follows a logarithmic-like function, i.e., a function with diminishing marginal gains in accuracy as the inference time (the total number of operations) increases (see Figure 1 in [3]). This is an interesting feature, because it means that accuracy can be only slightly degraded while significantly cutting the inference time of the model. We wonder whether a similar feature is present in the accuracy-inference time trade-off of these reasoning models. If such a feature is confirmed, AI tasks could be classified into the category of compressible tasks [3].
The objective of the project is to find a mathematical equation showing how the model's accuracy increases when additional computation is allocated during inference (test-time scaling, TTS), and to confirm whether such an equation follows a logarithmic-like function (a simple curve-fitting sketch is given after the steps below). To do that, the student will need to:
(1) install different reasoning and generative AI models in our servers;
(2) evaluate their performances using standardized benchmarks;
(3) perform a campaign of experiments to measure or estimate the power usage directly on the GPU and the PDU.
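Once accuracy has been measured at several test-time compute budgets, checking the logarithmic hypothesis can be as simple as the curve fit sketched below (SciPy assumed available); the demo points are generated synthetically and would be replaced by real measurements from steps (2)-(3).
```python
import numpy as np
from scipy.optimize import curve_fit

def log_curve(compute, a, b):
    """Hypothesized diminishing-returns law: accuracy = a + b * log(compute)."""
    return a + b * np.log(compute)

# Placeholder: synthetic points standing in for (compute budget, measured accuracy)
# pairs; in the project these come from the benchmark campaign.
rng = np.random.default_rng(0)
compute = np.array([128, 256, 512, 1024, 2048, 4096], dtype=float)
accuracy = log_curve(compute, 0.2, 0.055) + rng.normal(0, 0.01, size=compute.size)

(a_hat, b_hat), _ = curve_fit(log_curve, compute, accuracy)
print(f"fitted law: accuracy ~ {a_hat:.3f} + {b_hat:.3f} * ln(compute)")
```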
This project can be pursued as an internship.
Useful Information/Bibliography:
[1] https://sebastianraschka.com/blog/2025/understanding-reasoning-llms.html
[2] Section 3, Tiago da Silva Barros, Davide Ferre, Frederic Giroire, Ramon Aparicio-Pardo, Stephane Perennes. Scheduling Machine Learning Compressible Inference Tasks with Limited Energy Budget. ICPP 2024 - 53rd International Conference on Parallel Processing, Aug 2024, Gotland, Sweden. pp.961 - 970, https://hal.science/hal-04676376
[3] Tiago da Silva Barros, Frédéric Giroire, Ramon Aparicio-Pardo, Stephane Perennes, Emanuele Natale. Scheduling with Fully Compressible Tasks: Application to Deep Learning Inference with Neural Network Compression. CCGRID 2024 - 24th IEEE/ACM international Symposium on Cluster, Cloud and Internet Computing, IEEE/ACM, May 2024, Philadelphia, United States. https://hal.science/hal-04497548
Title: Evaluating Perturbation Strategies Against Data Reconstruction Attacks in Federated Learning
Name: Francesco Diana, Chuan Xu
mail: francesco.diana@inria.fr, chuan.xu@inria.fr
Webpage: sites.google.com/view/chuanxu
Place of the project: Inria
Address: 2004, route des Lucioles, Valbonne, France
Team: COATI
Webpage: https://team.inria.fr/coati/
Pre-requisites : We are looking for a candidate with good analytical skills and coding experience in PyTorch.
Description:
Federated Learning (FL) [1] is a distributed paradigm designed to enable collaborative training of machine learning (ML) models while protecting user privacy by keeping sensitive training data locally on edge devices.
However, the assumption of strong privacy protection in FL is severely challenged by Gradient Inversion Attacks (GIAs) [2][3][4], also known as Data Reconstruction Attacks (DRAs)[5]. These attacks allow an adversary to recover sensitive training samples from the gradients or model updates exchanged by clients.
Two primary challenges arise when developing mitigation strategies:
1. Privacy Protection: ensuring that highly sensitive training samples are unrecoverable by the adversary.
2. Utility Preservation: guaranteeing that the defense mechanism does not significantly degrade model performance, increase training time, or compromise convergence.
Traditional methods, such as adding noise to gradients guided by Differential Privacy (DP) [6], have been widely explored [7]. Unfortunately, to achieve meaningful privacy guarantees, DP often requires adding a significant magnitude of noise, which severely impairs the final model accuracy, leading to an unsatisfactory trade-off between privacy and usability.
Hence, there is a pressing need for innovative defense mechanisms that actively protect data leakage without compromising model utility.
The goal of this project is to explore and evaluate alternative defense mechanisms against DRAs that rely on modifications of the data [8] or of model representations/parameters [9][10][11] while preserving model accuracy.
For this internship, we expect the student to:
- Familiarize themselves with Federated Learning, attack threat models, and Data Reconstruction Attacks.
- Implement and evaluate existing defense methods against different DRAs.
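As a reference point for the privacy-utility tradeoff discussed above, the DP-style baseline amounts to clipping and noising each client update before it leaves the device; the sketch below illustrates that baseline only (it is not one of the alternative defenses [8]-[11] the project targets).
```python
import torch

def clip_and_noise_update(update, clip_norm=1.0, noise_multiplier=0.5):
    """DP-SGD-style perturbation of a client update (list of tensors):
    clip its global L2 norm to `clip_norm`, then add Gaussian noise."""
    total_norm = torch.sqrt(sum((g ** 2).sum() for g in update))
    scale = min(1.0, clip_norm / (total_norm.item() + 1e-12))
    return [
        g * scale + noise_multiplier * clip_norm * torch.randn_like(g)
        for g in update
    ]

# Example: perturb a fake two-tensor update before sending it to the server.
update = [torch.randn(10, 5), torch.randn(5)]
protected = clip_and_noise_update(update)
```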
References:
[1] McMahan et al, Communication-Efficient Learning of Deep Networks from Decentralized Data, AISTATS 2017, pages 1273-1282
[2] Geiping et al, Inverting Gradients – How Easy is it to Break Privacy in Federated Learning?, NeurIPS 2020, pages 1421–1431
[3] Boenisch et al, When the Curious Abandon Honesty: Federated Learning Is Not Private, IEEE EuroS&P 2023, pages 175–199
[4] Carletti et al, SoK: Gradient Inversion Attacks in Federated Learning, USENIX Security 2025
[5] Diana et al, Cutting Through Privacy: A Hyperplane-Based Data Reconstruction Attack in Federated Learning, UAI 2025, pages 44–65
[6] Dwork and Roth, The Algorithmic Foundations of Differential Privacy, Foundations and Trends in Theoretical Computer Science 2014, pages 211–407
[7] Wei et al, Federated Learning With Differential Privacy: Algorithms and Performance Analysis, IEEE TIFS 2020, pages 3454–3469
[8] Sun et al, Soteria: Provable Defense against Privacy Leakage in Federated Learning from Representation Perspective, CVPR 2021, pages 9307–9315
[9] Jeter et al, OASIS: Offsetting Active Reconstruction Attacks in Federated Learning, IEEE ICDCS 2024, pages 1004–1015
[10] Scheliga et al, PRECODE: A Generic Model Extension to Prevent Deep Gradient Leakage, IEEE WACV 2022, pages 3605–3614
[11] Zhang et al, CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling, NDSS 2025
Supervisors: Emanuele Natale and Frederic Giroire
Location: COATI team (joint project-team between the Inria centre at Université Côte d’Azur and the I3S laboratory)
External collaborators: Aurora Rossi (University of Bonn, Germany)
Description:
Centrality measures quantify the importance of nodes within a network and are widely used in fields such as social network analysis, biology, transportation, and neuroscience. However, computing some of these measures, such as betweenness centrality, is computationally expensive for large graphs.
Recent works have proposed methods to approximate such ranking metrics using Graph Neural Networks (GNNs), which avoid repeated shortest-path computations and significantly improve scalability.
This project aims to investigate whether simpler and more lightweight machine learning models can achieve comparable approximations of centrality measures, or directly predict the top-k most important nodes. The student will design and evaluate these alternative approaches, comparing their accuracy, interpretability, and computational efficiency against existing GNN-based techniques.
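As a first baseline before any learning model, one can check how well a trivially cheap feature such as node degree already recovers the top-k betweenness nodes; a sketch with networkx (assumed available, on a random graph standing in for real data) follows.
```python
import networkx as nx

G = nx.erdos_renyi_graph(500, 0.02, seed=1)       # stand-in for a real network

betweenness = nx.betweenness_centrality(G)        # expensive exact computation
degree = dict(G.degree())                         # trivially cheap feature

k = 20
top_betweenness = set(sorted(betweenness, key=betweenness.get, reverse=True)[:k])
top_degree = set(sorted(degree, key=degree.get, reverse=True)[:k])
print(f"top-{k} overlap (degree vs betweenness): {len(top_betweenness & top_degree) / k:.2f}")
```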
Pre-requisites: Basic knowledge of graph theory and programming experience in Python or Julia.
References:
Maurya, S. K., Liu, X., & Murata, T. (2021). Graph Neural Networks for Fast Node Ranking Approximation (approximating betweenness and closeness centrality). ACM Transactions on Knowledge Discovery from Data, 15(5), Article 78.
Mirakyan, M. (2021). Abcde: Approximating betweenness-centrality ranking with progressive-dropedge. PeerJ Computer Science, 7, e699.
N. Meghanathan and X. He, “Correlation and Regression Analysis for Node Betweenness Centrality,” IJFCST, vol. 6, no. 6, pp. 01–20, Nov. 2016, doi: 10.5121/ijfcst.2016.6601.
C. Li, Q. Li, P. Van Mieghem, H. E. Stanley, and H. Wang, “Correlation between centrality metrics and their application to the opinion model,” Eur. Phys. J. B, vol. 88, no. 3, p. 65, Mar. 2015, doi: 10.1140/epjb/e2015-50671-y.
Advisor: APARICIO PARDO Ramon
Mail: raparicio@i3s.unice.fr
Telephone: 04 92 94 27 72
Web page: http://www.i3s.unice.fr/~raparicio/
Place of the project: I3S: Laboratoire d'Informatique, Signaux et Systèmes de Sophia Antipolis
Address: 2000 route des Lucioles - Les Algorithmes - bât. Euclide B, 06900 Sophia Antipolis
Team: SIGNET
Web page: http://signet.i3s.unice.fr
Pre-requisites if any:
Languages:
- Python language (absolutely)
- Deep Learning libraries (like TensorFlow [6], Keras) (recommended)
Theory:
- Machine Learning, Data Science, particularly Neural Networks theory (recommended)
- Classical optimisation theory (Linear Programming, Dual Optimisation, Gradient Optimisation, Combinatorial Optimization, Graph Theory) (recommended)
Technology:
- Computer networking (recommended)
- Quantum computing and networking (not necessary but convenient)
Description:
In the long term, Quantum Communications promise to connect Quantum Processors located at remote sites, giving rise to a Quantum Cloud able to perform very complex computation tasks in much shorter processing times. In the short term, Quantum Communications are applied to tasks such as cryptographic key distribution or clock synchronization [1]. In both cases, the basic “operation” to carry out is the distribution of quantum entanglement between the source and the destination of the communication. This is the equivalent of packet forwarding in classical packet-switched networks, and is referred to as quantum datagram forwarding.
In this project, we aim to study the optimization of quantum (q-)datagram forwarding in connectionless quantum networks [2], which necessitates dynamic, real-time control strategies capable of managing inherently probabilistic processes and decoherence effects. Within a connectionless quantum network, the forwarding of a quantum request is abstracted into a series of sequential decisions made by intermediate nodes (quantum repeaters). These decisions focus primarily on optimizing the management of quantum resources: when to attempt probabilistic operations such as entanglement generation, when to execute entanglement swapping to extend the path, and how to use the limited quantum memory. The control problem centers on selecting an optimal policy that maximizes an objective (e.g., fidelity or rate) under real-time constraints.
These management decisions (i.e., control policies) constitute stochastic control problems that can be modelled as Markov Decision Processes (MDPs) and solved within the Reinforcement Learning (RL) framework (a form of Machine Learning) [3-5].
In this project, we aim to find optimal control policies for q-datagram forwarding.
To do that, we will follow these steps (a small illustrative sketch is given after the list):
1. Model the q-datagram forwarding as a Markov Decision Process.
2. Develop a simulator that can serve as an RL environment, based on the MDP model from the previous step.
3. Implement a naive policy and test it in the simulator.
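As a minimal sketch of these three steps, in the spirit of the elementary-link setting of [4] (the probabilities, cutoff, and reward below are illustrative, not a validated model): the state is the age of the pair stored in quantum memory, the action decides when to attempt entanglement generation, and a naive policy is rolled out in the simulator.

import random

P_GEN = 0.3    # illustrative success probability of one entanglement-generation attempt
CUTOFF = 5     # illustrative memory cutoff (steps before the stored pair decoheres)

def step(state, action):
    """One MDP transition. state: age of the stored pair, or None if memory is empty.
    action: 'attempt' (try to generate a pair) or 'wait'. Returns (next_state, reward)."""
    if state is None:
        if action == "attempt" and random.random() < P_GEN:
            return 0, 1.0          # a fresh entangled pair is distributed and stored
        return None, 0.0
    age = state + 1
    return (None if age >= CUTOFF else age), 0.0   # the pair ages, then is discarded

def naive_policy(state):
    return "attempt" if state is None else "wait"

# Roll out the naive policy in the simulator (steps 2 and 3 above).
state, total_reward = None, 0.0
for _ in range(10_000):
    state, r = step(state, naive_policy(state))
    total_reward += r
print("pairs distributed per step:", total_reward / 10_000)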
The project can be followed by an internship.
Useful Information/Bibliography:
[1] "Quantum Networks: From a Physics Experiment to a Quantum Network System" with Stephanie Wehner
: https://www.youtube.com/watch?v=yD193ZPjMFE
[2] Bacciottini, L., De Andrade, M. G., Pouryousef, S., Van Milligen, E. A., Chandra, A., Panigrahy, N. K., ... & Towsley, D. (2024). Leveraging Internet Principles to Build a Quantum Network. arXiv preprint arXiv:2410.08980.
[3] S. Khatri, “Towards a General Framework for Practical Quantum Network Protocols,” LSU Doctoral Dissertations, Mar. 2021, [Online]. Available: https://digitalcommons.lsu.edu/gradschool_dissertations/5456
[4] S. Khatri, “Policies for elementary links in a quantum network,” Quantum, vol. 5, p. 537, Sep. 2021, doi: 10.22331/q-2021-09-07-537.
[5] Yau, G. X., Burushkina, A., da Silva, F. F., Maji, S., Thomas, P. S., & Vardoyan, G. (2025). Reinforcement Learning for Quantum Network Control with Application-Driven Objectives. arXiv preprint arXiv:2509.10634.
Name: Frédéric Giroire and Joanna Moulierac
Mail: frederic.giroire@inria.fr
Web page: https://www-sop.inria.fr/members/Frederic.Giroire/
Place of the project:
Address: Inria, 2004 route des Lucioles, Sophia Antipolis
Team: COATI (common project Inria/I3S)
Web page: https://team.inria.fr/coati/
Pre-requisites:
Knowledge in machine learning and Pytorch.
Description:
As artificial intelligence (AI) deployment scales rapidly, its energy demands and associated carbon emissions are drawing increasing scrutiny. Among AI systems, large language models (LLMs) stand out for their particularly high energy intensity in both training and inference phases. For example, training GPT-3 (175 billion parameters) has been estimated to consume ~1,287 MWh of electricity and emit ~502 metric tons of CO₂. More recent benchmarking work on inference further shows that, depending on prompt length, model architecture, and deployment environment, a single long‐prompt query may consume tens of watt-hours—leading to substantial aggregate environmental impact when scaled across millions of queries per day.
Despite this, there is currently no reliable, generalizable model that can accurately estimate an LLM’s energy consumption across architectures, hardware, and usage patterns. To fill this gap, our project proposes a bottom-up modeling approach: decompose ML architectures into their constituent blocks and layers, characterize the energy profile of each module (e.g. attention, feed-forward, embedding, normalization), and reassemble these into a compositional model that predicts total energy usage (and hence emissions) for a given architecture, input workload, and hardware configuration.
By grounding our model in empirical measurements and architectural decomposition, we aim to deliver not just coarse estimations, but precise, component-level accountability—enabling architects and users alike to trade off performance, cost, and environmental footprint in LLM design.
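The sketch below illustrates the compositional idea only: the per-layer FLOP counts are standard rough approximations for a decoder block, and the energy coefficients are hypothetical placeholders to be replaced by the per-module, per-hardware measurements the project will collect.

def transformer_layer_flops(d_model, seq_len, d_ff):
    """Rough per-token FLOP counts for one decoder layer, split by module."""
    attn_proj = 8 * d_model ** 2          # Q, K, V and output projections
    attn_mix = 4 * seq_len * d_model      # QK^T scores and attention-weighted sum
    ffn = 4 * d_model * d_ff              # the two linear maps of the MLP
    return {"attention": attn_proj + attn_mix, "feed_forward": ffn}

# Hypothetical coefficients (joules per GFLOP), to be measured empirically per module.
JOULES_PER_GFLOP = {"attention": 0.35, "feed_forward": 0.30, "lm_head": 0.25}

def energy_per_token(n_layers, d_model, seq_len, d_ff, vocab_size):
    per_layer = transformer_layer_flops(d_model, seq_len, d_ff)
    energy = sum(n_layers * f / 1e9 * JOULES_PER_GFLOP[m] for m, f in per_layer.items())
    energy += 2 * d_model * vocab_size / 1e9 * JOULES_PER_GFLOP["lm_head"]  # output projection
    return energy  # joules per generated token (very rough)

# Example with a GPT-2-large-like configuration.
print(energy_per_token(n_layers=36, d_model=1280, seq_len=1024, d_ff=5120, vocab_size=50257))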
Contact and supervision
Benoît Miramond, Robert De Simone
LEAT Lab – Université Côte d'Azur / CNRS
Inria Méditerranée
Campus SophiaTech
04.89.15.44.39.
benoit.miramond@univ-cotedazur.fr
robert.de_simone@inria.fr
Practical information
Location: LEAT Lab – INRIA / SophiaTech Campus, Sophia Antipolis
Profile: embedded programming, micro-controllers, sensors, wireless communication, machine learning
Research keywords: Embedded systems, signal processing, Edge AI, wireless sensor networks
Context
The LEAT lab has been working for several years on the design of autonomous sensor networks. These wireless sensors embed compressed artificial neural network models for pattern detection in the data extracted from the sensors. The case study of interest in this internship concerns the evaluation of biodiversity through bird song detection, for the continuous census of the species nesting in a given ecosystem. Achieving this goal requires solving several problems related to different disciplines: embedded software code compactness, ultra-low-power electronics design for wireless communication, embedded AI training and design, and formalization of the system states between energy recovery, computation, LoRa communication, standby, and wake-up conditions [4-7].
The project is funded by CERN, the European research center where the connected birdhouses are to be deployed.
The work of this internship will be organized in 3 phases.
The first phase consists in modeling and developing the embedded software that controls the current prototype of the autonomous sensor node capable of detecting bird songs in an outdoor environment. Once the different modes and states of the system are formalized, the aim will be to develop/adapt the corresponding on-board software that guarantees the functional behavior in each state of the system. The system should only wake up and communicate occasionally, to save battery power as much as possible.
The second phase of the project will consist in training an embedded AI model on the latest data collected on site and comparing the results with existing works in the literature [1-3, 7]. The model should maximize the detection accuracy while minimizing the memory footprint on the MCU.
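As a starting point for this phase, a minimal sketch is given below. It assumes a TensorFlow/Keras toolchain, which may differ from the tools used at LEAT; the spectrogram shape, class count, dataset, and hyper-parameters are placeholders. It trains a small CNN over mel-spectrogram patches and then applies post-training int8 quantization, the usual first step before deployment with TFLite Micro on an MCU [8].

import numpy as np
import tensorflow as tf

N_MELS, N_FRAMES, N_CLASSES = 40, 101, 10   # hypothetical spectrogram size and species count

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(N_MELS, N_FRAMES, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# x_train / y_train would come from the labelled recordings collected on site.
x_train = np.random.rand(256, N_MELS, N_FRAMES, 1).astype("float32")
y_train = np.random.randint(0, N_CLASSES, 256)
model.fit(x_train, y_train, epochs=1, batch_size=32)

# Post-training quantization to shrink the model for the microcontroller.
def representative_data():
    for i in range(100):
        yield [x_train[i:i + 1]]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
tflite_model = converter.convert()
print("quantized model size: %.1f kB" % (len(tflite_model) / 1024))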
Finally, the last phase will focus on the evaluation of the different metrics allowing the quality of the proposed solution to be assessed, on both the embedded part and the server part: real-time behavior, power consumption, lifetime, and quality of the detections sent back to the server.
The resulting device will be tested in real conditions, first on the SophiaTech campus, a site listed by the LPO (Ligue pour la Protection des Oiseaux), and then on the CERN campus in Geneva.
Project mission
The project mission will be organized in several periods:
• Bibliographic study on the detection of bird songs with wireless sensors [1-7]
• Introduction to the existing software and hardware solutions developed at LEAT for Edge AI applications [8]
• Modeling and development of the embedded software
• Design and training of a new Edge AI solution
• Performance evaluation
• Experimentation and validation of the system in real environments
• Publication in an international conference
References
[1] M. Kumar, A. K. Yadav, M. Kumar, and D. Yadav, “Bird species classification from images using deep learning,” in International Conference on Computer Vision and Image Processing. Springer, 2022, pp. 388–401
[2] J. S. Cole, N. L. Michel, S. A. Emerson, and R. B. Siegel, “Automated bird sound classifications of long-duration recordings produce occupancy model outputs similar to manually annotated data,” Ornithological Applications, vol. 124, no. 2, p. duac003, 2022
[3] W. K. Michener, “Ecological data sharing,” Ecological informatics, vol. 29, pp. 33–44, 2015
[4] Bird@ Edge: Bird Species Recognition at the Edge, J Höchst, H Bellafkir, on Networked Systems, 2022
[5] X Han, Bird sound detection based on sub-band features and the perceptron model, Applied Acoustics, 2024
[6] Acoupi: An Open-Source Python Framework for Deploying Bioacoustic AI Models on Edge Devices, 2025
[7] TinyChirp: bird song recognition using TinyML models on low-power wireless acoustic sensors, 2024
[8] Quantization and Deployment of Deep Neural Networks on Microcontrollers, 2022
Who?
Name: Fabrice Huet, Dino Lopez Pacheco
Mail: fabrice.huet@univ-cotedazur.fr, dino.lopez@univ-cotedazur.fr
1 Introduction
Processing network packets is a heavy task. In a standard system, packets received by a NIC must traverse the entire network protocol stack as well as multiple filtering, security, and monitoring services implemented in the server’s kernel (potentially several times when tunneling is employed) before reaching the target application [1,2]. All this processing can introduce undesirable delays and jitter for delay-sensitive applications (e.g. online gaming, live video streaming, etc.).
eBPF (extended BPF) has been introduced into the Linux kernel to run custom programs at special “hook points”, notably at the earliest stage of a packet’s path into the kernel, inside the network driver, at the XDP (eXpress Data Path) hook. The framework also enables packets to be sent directly from XDP to user space via an AF_XDP socket.
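As an illustration of where XDP sits in the packet path, here is a minimal sketch using the BCC Python toolchain (one possible toolchain, not necessarily the one used in this project; the interface name is a placeholder). It attaches a trivial program at the driver’s XDP hook that counts packets and lets them continue up the stack.

from bcc import BPF
import ctypes
import time

device = "eth0"   # placeholder interface name

prog = r"""
#include <uapi/linux/bpf.h>

BPF_ARRAY(pkt_count, u64, 1);

int xdp_count(struct xdp_md *ctx) {
    int key = 0;
    u64 *value = pkt_count.lookup(&key);
    if (value)
        __sync_fetch_and_add(value, 1);
    return XDP_PASS;   /* let the packet continue into the regular stack */
}
"""

b = BPF(text=prog)
fn = b.load_func("xdp_count", BPF.XDP)
b.attach_xdp(device, fn, 0)
try:
    while True:
        time.sleep(1)
        print("packets seen at XDP:", b["pkt_count"][ctypes.c_int(0)].value)
except KeyboardInterrupt:
    pass
finally:
    b.remove_xdp(device, 0)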
eBPF has been widely leveraged by cloud providers to enhance the performance of multiple networking services (firewalls, load balancers, etc.) and of the interconnection between users’ functions.
At the I3S laboratory, the SigNet and Scale teams have been working on the study and analysis of the benefits of XDP [3]. In our ongoing work, a new solution is under design to improve the performance of Cilium, a networking plugin for Kubernetes that relies on eBPF to increase network performance.
2 PFE objectives
In this project, the student will be in charge of building a comprehensive state of the art of the many ways the networking research community employs eBPF to improve the performance of networks, systems, and applications.
In parallel, the student will be required to deploy and test the customized Cilium plugin under multiple system configurations, and to improve its robustness and compatibility.
Last, we would like to explore the benefits of XDP, if any, from the energy point of view. Hence, an in-depth analysis of energy consumption needs to be carried out by deploying the same cloud scenario and application with the standard Kubernetes Cilium plugin, our customized Cilium plugin, and also other network plugins.
3 Required skills
· Good knowledge of C programming
· Linux administration
· Good knowledge of networking
4 Bibliography
[1]. Zhuo D et al., “Slim: OS Kernel Support for a Low-Overhead Container Overlay Network”. 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19), Boston, MA, USA.
[2]. Jiaxin Lei et al. "Tackling parallelization challenges of kernel network stack for container overlay networks”. In Proceedings of the 11th USENIX Conference on Hot Topics in Cloud Computing (HotCloud'19). USENIX Association, USA, 9.
[3]. K. C. du Perron, D. L. Pacheco and F. Huet, "Understanding Delays in AF_XDP-based Applications," ICC 2024 - IEEE International Conference on Communications, Denver, CO, USA, 2024, pp. 5497-5502, doi: 10.1109/ICC51166.2024.10622351.
What?
Title: Measuring the impact of climate change on telecommunications networks
Who?
Name: Dino Lopez-Pacheco and Guillaume Urvoy-Keller
Mail: dino.lopez@univ-cotedazur.fr and guillaume.urvoy-keller@univ-cotedazur.fr
Web page: https://webusers.i3s.unice.fr/~lopezpac/#studs and https://webusers.i3s.unice.fr/~urvoy/
Where?
Place of the project: I3S
Address: Les Algorithmes B, 2000 route des Lucioles
Team: Signet
Web page: https://webusers.i3s.unice.fr/~lopezpac/#studs and https://webusers.i3s.unice.fr/~urvoy/
Pre-requisites if any:
Description:
*Background*: Climate change both promotes extreme events such as fires and floods, and alters the average values of certain physical phenomena such as sea level rise.
Several research articles have begun to:
- Measure the current extent of these phenomena, such as [1], which measured the impact of weather conditions on the access network (mobile and FTTH/ADSL) over 8 years in the United States
- Anticipate the exposure of networks to future phenomena, such as the impact of rising sea levels on coastal network infrastructure [2]
At the I3S laboratory, we have begun extracting climate data and mobile network status to monitor the impact of climate change on communication services in the Alpes-Maritimes department and understand current and future trends.
*Project Objective*: In this project, we aim to build a web interface capable of graphically displaying the collected data in a highly flexible manner.
For example, via a web interface, it would be possible to request the creation of a graphic display showing the average, minimum, and maximum temperatures for a city or department, whether over a week, a month, or another timescale. Or, alternatively, the number of telecommunications network outages that occurred over a certain time period and for very specific locations.
It will also be necessary to show, on a map (e.g., an OpenStreetMap layer), the locations to which the displayed data corresponds.
Finally, this web interface must be able to export images in several formats (mainly PNG, SVG, and PDF).
While designing the interface, it will also be necessary to analyse the data so as to assess the degree of correlation between the two data sources.
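As an example of the expected flexibility, the sketch below (with a hypothetical CSV schema standing in for the data already collected from sources such as [3]) plots weekly min/mean/max temperatures for one city and exports the figure in the three required formats.

import pandas as pd

# Hypothetical schema: one row per (date, city) with a 'temperature' column.
df = pd.read_csv("meteo_alpes_maritimes.csv", parse_dates=["date"])
city = df[df["city"] == "Nice"].set_index("date")

# Aggregate at the requested timescale (here: weekly).
weekly = city["temperature"].resample("W").agg(["min", "mean", "max"])

ax = weekly.plot(title="Weekly temperatures - Nice")
ax.set_ylabel("Temperature (°C)")
for ext in ("png", "svg", "pdf"):
    ax.figure.savefig(f"temperatures_nice.{ext}")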
Skills:
Web development knowledge
Python programming
Bash programming
Computer networking knowledge
Useful Information/Bibliography:
[1] Padmanabhan, Ramakrishna, et al. "Residential links under the weather." Proceedings of the ACM Special Interest Group on Data Communication. 2019. 145-158.
[2] Durairajan, Ramakrishnan, Carol Barford, and Paul Barford. "Lights out: Climate change risk to internet infrastructure." Proceedings of the Applied Networking Research Workshop. 2018.
[3] Météo France: https://meteo.data.gouv.fr/
[4] RIPE Atlas: https://atlas.ripe.net/coverage/
[5] ARCEP: https://arcep-dev.github.io/siteshs/#7/45.368/2.208
Who?
Name: Frédéric MALLET, Robert De Simone
Mail: Frederic.Mallet@univ-cotedazur.fr / robert.de_simone@inria.fr
Web page: https://www-sop.inria.fr/members/Frederic.Mallet/
Where?
Place of the project: Inria Lagrange
Address: 2000 route des Lucioles
Team: Kairos (i3S/Inria)
Web page: https://team.inria.fr/kairos/
Description
The context of the work is the ANR project TAPAS (https://frederic-mallet.github.io/anr-tapas/), which aims at adding a theory of time to Event-B. As a first step, we should develop a theory for the Clock Constraint Specification Language (CCSL) and encode it in Event-B. The goal is to be able to reason formally, with a proof assistant, within the Rodin framework, on CCSL specifications. As some "unsafe" CCSL operators involve unbounded integers, we need recursion (as embedded in Event-B) to reason about them. Once the theory is embedded, we will study basic properties of CCSL specifications.
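To fix ideas, under the counter-based reading of CCSL used in the Mallet and de Simone papers cited below (notation simplified here), writing χ_c(n) for the number of ticks of clock c up to instant n, two typical constraints read roughly as:
- causality (non-strict precedence) a ≼ b: for all n in ℕ, χ_b(n) ≤ χ_a(n)
- alternation (a alternatesWith b): for all n in ℕ, χ_b(n) ≤ χ_a(n) ≤ χ_b(n) + 1
The counters χ_a, χ_b range over unbounded natural numbers, which is precisely why the Event-B theory needs unbounded integers and recursion to express and reason about such operators.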
As a state-of-the-art survey, we should investigate the timed and temporal models that are already embedded in proof assistants. The goal is to have a comprehensive view of time embeddings in proof assistants, not only in Event-B. For instance, there is some work on reasoning about a timed version of algebraic state-transition diagrams (T-ASTD).
The work plan is the following:
- State of the art of embedding temporal or timed logics in proof assistants (Event-B, Isabelle, Coq, Agda).
- Propose a theory that can encode logical clocks, and in particular those of CCSL.
- Encode this theory in Event-B.
- Implement the developed theory using the theory plugin of Rodin, and use it to establish some basic properties on CCSL specifications.
- As additional work, we can look at embedding a theory for Linear Temporal Logic (LTL). This would allow a reflection on how to mix both theories within a single model.
Bibliography :
Grygoriy Zholtkevych: Coalgebraic Semantic Model for the Clock Constraint Specification Language. FTSCS 2014: 174-188
Jean-Raymond Abrial: Modeling in Event-B - System and Software Engineering. Cambridge University Press 2010
Yamine Aït Ameur, Guillaume Dupont, Ismaïl Mendil, Dominique Méry, Marc Pantel, Peter Riviere, Neeraj Kumar Singh: Empowering the Event-B Method Using External Theories. IFM 2022: 18-35
Amir Pnueli: The Temporal Logic of Programs. FOCS 1977: 46-57
Diego de Azevedo Oliveira, Marc Frappier: TASTD: A Real-Time Extension for ASTD. ABZ 2023: 142-159
Marc Frappier, Frédéric Gervais, Régine Laleau, Benoît Fraikin, Richard St-Denis: Extending statecharts with process algebra operators. Innov. Syst. Softw. Eng. 4(3): 285-292 (2008)
Frédéric Mallet, Robert de Simone: Correctness issues on MARTE/CCSL constraints. Sci. Comput. Program. 106: 78-92 (2015)
Frédéric Mallet: Clock constraint specification language: specifying clock constraints with UML/MARTE. Innov. Syst. Softw. Eng. 4(3): 309-314 (2008)
Name: Xavier Descombes
Mail: xavier.descombes@inria.fr
Web page: https://www-sop.inria.fr/members/Xavier.Descombes/
Place of the project: I3S
Address: Sophia Antipolis
Team: Morpheme
Web page: https://team.inria.fr/morpheme/
Pre-requisites if any:
Python programming (PyTorch, TensorFlow)
Description:
Pathologists examine images with different stainings to classify a patient's disease into a specific cancer subtype, to grade the cancer, or even to infer a prognosis.
AI has proved very effective at assisting pathologists in these tasks and has shown impressive results in the mentioned domains.
However, we face two major problems. The first is linked to data variability between clinical centers; therefore, a multicentric model is still an open issue.
The second is related to the different stainings, which are not always available depending on the clinical center.
The aim of this work is thus to evaluate generative models, and in particular diffusion models, to homogenize data from different centers (color transfer) or to simulate missing stainings (domain transfer).
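As a simple point of comparison for the generative approaches, a classical colour-transfer baseline can be sketched as follows (Reinhard-style statistics matching in Lab space; the file names are placeholders and this is not the method the project will develop):

import numpy as np
from skimage import io, color

def reinhard_transfer(source_rgb, reference_rgb):
    """Match the per-channel mean/std of the source image to the reference in Lab space."""
    src = color.rgb2lab(source_rgb)
    ref = color.rgb2lab(reference_rgb)
    out = np.empty_like(src)
    for c in range(3):
        s_mean, s_std = src[..., c].mean(), src[..., c].std()
        r_mean, r_std = ref[..., c].mean(), ref[..., c].std()
        out[..., c] = (src[..., c] - s_mean) / (s_std + 1e-8) * r_std + r_mean
    return np.clip(color.lab2rgb(out), 0.0, 1.0)

source = io.imread("slide_center_A.png")[..., :3] / 255.0      # placeholder file names
reference = io.imread("slide_center_B.png")[..., :3] / 255.0
normalized = reinhard_transfer(source, reference)
io.imsave("slide_center_A_normalized.png", (normalized * 255).astype("uint8"))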