Please note that the interview process is currently paused. However, I always welcome resumes from motivated candidates, and I can accommodate exceptional applicants on a case-by-case basis (our doctoral school offers PhD grants for students with strong academic records from renowned institutions, as well as for candidates who already have significant research experience).
The topics presented below are indicative of the research currently conducted in the lab. The exact subject may vary, but it will likely be related. If you have a specific topic you would like to explore, I am open to discussion.
When contacting me, please address the following criteria, which I use to select students:
Non-coursework computer science projects. Have you worked on any projects outside formal coursework (e.g., building a website, maintaining IT infrastructure for your school, contributing to an open-source project, etc.)? The same apply if you work in a company: do you have side projects? These do not need to be large or ambitious, but if your only experience comes from school assignments or work, you are not a good fit. The projects do not need to be systems-related.
English proficiency. You should be able to sustain a technical conversation about your work. I have no formal certification requirement, but if you routinely struggle to find words or rely on a script to speak, collaboration will be difficult. As an informal benchmark: can you watch US/UK science-oriented YouTube videos without subtitles and understand them comfortably? Could you discuss them afterward?
Interest in operating systems internals. You do not need to be an expert, but you should be genuinely curious.
Obviously the list is not set in stone, but should give you a rough idea of the profiles I'm looking for. If you match these criteria, contact me!
If you have worked on some research projects before (paper submitted or accepted), please mention it, and please specify your role in the project.
Efficient Key-Value store for future hardware
The objective of the project is to design an efficient key-value store for future hardware, using fast NVMe SSDs and high speed networks.
In our modern digital life, activities from buying clothes to accessing government services depend on working with computer applications which store information in a long-lasting form, that is, they need a data store, where information (called ”values”) is found using a label or ”key”, So the key-value store is vital for the functionality and performance of all applications. Even relational databases typically contain a key-value store as a storage engine, underneath layers that support a richer query model.
The current key-value stores are designed for today’s typical hardware environment with a memory hierarchy of slow but capacious hard disk, flash-based SSD, and fast but limited and volatile RAM. Currently, no key-value store can leverage the speed of modern hardware -- existing key-value stores become CPU bound before reaching the limit of these devices.
The goal of this project is to rethink the way data is stored in memory and on disk to design a fast key-value store for modern drives and persistent memory. The project will start with a study of the characteristics of persistent memory. Then existing data-structures designed for volatile DRAM will be ported and optimized for persistent memory. Finally, new designs will be proposed.
Requirements: fluent C, C++, or Rust programming, notions of good practices in optimizations and efficient programming.
Detecting data races in the Linux kernel
The objective of the project is to design a new mechanism to specify and detect data races bugs in the Linux kernel.
The Linux kernel is a highly optimized multi-core system. As a consequence, kernel developers frequently use relaxed synchronization models: instead of protecting shared variables with locks, the code is carefully engineered so that variables can be read outside critical sections while still providing a consistent view of the data. For instance, the kernel often allocates an object, initializes it, and then inserts it in a list. The inserts to the list are protected by a lock, but the list can be read without taking any lock. The code is thus only correct if the inserted object has been fully and correctly initialized before being inserted in the list. The kernel relies on memory barriers to order the initialization and the insert.
However, memory barriers are notoriously tricky to use correctly. A quick search in kernel commits found hundreds of bugs related to missing barriers. These bugs are tricky to understand and to fix.
In this project, we want to create a tool that would help developers get formal guarantees on code that relies on ordering and barriers for correctness. The key idea of the project is to use existing comments in the kernel code to figure out ordering constraints between variables or functions. Kernel developers often comment on the intent of the code, and informally document the ordering constraints of their code. We want to use these comments to figure out if the commented constraints are respected throughout the entire kernel.
Requirements: notions of static analysis, ability to navigate in a large codebase (Linux kernel), fluent in Rust, good notions of C.