M. Sc. Thesis Projects
I am looking for self-motivated and ambitious students with a strong deadline-oriented profile and the desire to write an impactful thesis. Depending on the findings and results, you should be willing to write an academic article for top venues. If you have this desire, I will help you reach this goal. If you would like to do a thesis under my supervision, send me the following information by e-mail: i) exams taken, including those of the bachelor; ii) curriculum; iii) topic of interest; iv) preferred start and end date.
Listed below are some broad highlights of potential topic themes. To get a better idea, have a look at what current and past students did, or pass by for a chat.
Requirements. These topics are tailored to students with a strong machine learning, network theory and deep learning background. Advanced skills in Python and PyTorch/TensorFlow are a must.
References. You can consult these references as a starting point:
https://arxiv.org/pdf/2001.07620.pdf
Past MSc theses: [Möllers2023] [Naber2023] [Iancu2020] [Sipko2020]
#1: Graph Neural Networks
Interconnected data arising from networks, such as social, biological, molecular, sensor and financial networks, come in massive amounts and carry hidden information in their structure that can be key to identifying fake news, discovering new drugs, identifying malfunctioning sensors, and making useful recommendations. Successfully solving these tasks requires building learning tools capable of leveraging the underlying structure so as to map the information into meaningful embeddings.
Graph neural networks (GNNs) extend the success of deep learning from text, speech and images to graph-based data and allow extracting information in a similar layered fashion. They have shown wide success in all the aforementioned applications, but there is still a lot to do to improve them further, as well as our understanding of their inner working mechanisms.
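As a rough illustration of the layered processing mentioned above, the following minimal sketch implements one message-passing layer in plain Python: each node averages the features of its neighbours (and its own), applies a linear map, and passes the result through a ReLU. The function name gnn_layer, the toy path graph and the weight values are assumptions made for illustration only; an actual thesis would rely on PyTorch or TensorFlow and on learned weights.

```python
def gnn_layer(adj, feats, weight):
    """One message-passing layer: mean over the neighbourhood, linear map, ReLU."""
    n = len(adj)
    n_in = len(feats[0])
    n_out = len(weight[0])
    out = []
    for i in range(n):
        # Neighbourhood of node i, including i itself (self-loop).
        nbrs = [j for j in range(n) if adj[i][j] or j == i]
        # Mean aggregation of the neighbours' features.
        agg = [sum(feats[j][k] for j in nbrs) / len(nbrs) for k in range(n_in)]
        # Linear transform followed by a ReLU nonlinearity.
        out.append([max(0.0, sum(agg[k] * weight[k][m] for k in range(n_in)))
                    for m in range(n_out)])
    return out


# Toy path graph 0 - 1 - 2 with one scalar feature per node.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
feats = [[1.0], [0.0], [0.0]]   # only node 0 carries a non-zero feature
weight = [[1.0]]                # hypothetical weights; learned in practice

h1 = gnn_layer(adj, feats, weight)   # one layer: one-hop information
h2 = gnn_layer(adj, h1, weight)      # two layers: two-hop information
```

After one layer, node 2 still holds a zero feature because it only sees its direct neighbour; after two stacked layers it has received information originating at node 0, two hops away.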
MSc thesis topics: Master projects in this category will concern fundamental research into the inner working mechanisms of GNNs. Particular areas include:
Dynamic graphs: Many real-world graphs change over time, yet current solutions seldom work in this setting. In this project, you will focus on developing GNNs that can process data on graphs whose edges/nodes appear over time, but you could also focus on analysing the limitations of existing solutions.
Physics-informed GNNs: Current GNN solutions do not rely on any physical model to process their data. However, by exploiting prior physical laws on graphs, it is possible not only to build better GNN solutions but also to provide a deeper understanding of their inner workings. Topics in this thesis will center on the intersection of these two areas.
Self-supervised learning (SSL): SSL aims at leveraging unlabeled data to train a network. However, how to perform SSL on graphs is not well understood. Your task would be mainly centered on developing novel SSL approaches for graph learning.
#2: Machine Learning on Higher-Order Networks
Many real-world data have a complex structure that cannot be represented by graphs, as the latter capture only pairwise interactions. Examples are flows (water, traffic), which can be seen as data associated with edges (pairs of nodes), but the same principle also extends to triples or larger sets of nodes (a user preference for a bundle of items, or data associated with multiple brain regions). In these cases, extracting information requires leveraging this more complex structure, which is typically represented via simplicial complexes or hypergraphs. However, how to best leverage the structure to devise machine learning solutions is largely unknown.
MSc thesis topic: In this project, you will focus on fundamental aspects of building machine/deep learning solutions for higher-order networks. You will focus in particular on understanding how to propagate information between the different topological levels and how this impacts the learning capabilities of the model.
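To make the idea of moving information between topological levels more concrete, here is a minimal sketch on a toy simplicial complex with three nodes, three oriented edges and one filled triangle. It builds the two incidence matrices and uses them to map an edge flow down to the nodes (divergence) and up to the triangle (curl). The complex, the orientation convention and all function names are assumptions chosen for illustration; this is not a learning model, only the linear operators such models are commonly built on.

```python
def matvec(mat, vec):
    """Matrix-vector product for lists of lists."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def transpose(mat):
    return [list(col) for col in zip(*mat)]


def node_edge_incidence(n_nodes, edges):
    """B1[v][e] = -1 if edge e leaves node v, +1 if it enters node v."""
    b1 = [[0] * len(edges) for _ in range(n_nodes)]
    for e, (u, v) in enumerate(edges):
        b1[u][e] = -1
        b1[v][e] = 1
    return b1


def edge_triangle_incidence(edges, triangles):
    """B2[e][t] = +1 / -1 if edge e lies on triangle t with / against its orientation."""
    index = {edge: i for i, edge in enumerate(edges)}
    b2 = [[0] * len(triangles) for _ in edges]
    for t, (a, b, c) in enumerate(triangles):
        # Boundary of the triangle (a, b, c) is (a, b) + (b, c) - (a, c).
        for edge, sign in (((a, b), 1), ((b, c), 1), ((a, c), -1)):
            b2[index[edge]][t] = sign
    return b2


# Toy complex (assumed): edges oriented from the lower to the higher node index.
edges = [(0, 1), (1, 2), (0, 2)]
triangles = [(0, 1, 2)]
B1 = node_edge_incidence(3, edges)
B2 = edge_triangle_incidence(edges, triangles)

circulating = [1, 1, -1]   # flow going around the triangle 0 -> 1 -> 2 -> 0
gradient = [1, 1, 2]       # flow induced by node potentials 0, 1, 2

div_circ = matvec(B1, circulating)              # edge -> node level
curl_circ = matvec(transpose(B2), circulating)  # edge -> triangle level
div_grad = matvec(B1, gradient)
curl_grad = matvec(transpose(B2), gradient)
```

The circulating flow is invisible at the node level (zero divergence) but shows up at the triangle level, while the gradient flow behaves the other way around. This is one simple way to see why the choice of how to couple the levels matters for what a model can learn.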
Requirements. These topics are tailored to students with a strong machine learning, network theory and deep learning background. Ideally, you have followed the MSc course on Machine Learning for Graph Data and have a deep understanding of graph representation learning.
References. You can consult these references as a starting point:
Past MSc theses: [Möllers2023] [Liu2023]
Past MSc theses: [Liu2023]
Requirements. These topics are tailored to students with a strong machine learning, network theory and deep learning background. Advanced skills in Python and PyTorch/TensorFlow are a must.
References. You can consult these references as a starting point:
Past MSc theses: [Mazzola2020]
#3: Processing Networked Time Series
Time series on networks are commonly encountered in weather sensor networks, finance, and biological networks. Therefore, modeling and analyzing the effects of time-varying network signals is a topic of high importance to detect malfunctioning sensors, predict stock prices, and monitor biological activities. For example, we can represent sensors as nodes of a graph and communication links as edges. The temperature measured at each sensor is a signal residing on the nodes of this graph, where the temporal evolution of the signal is dictated by the underlying topology. However, the effects of a temperature change in adjacent nodes are difficult to model with conventional techniques. Recent techniques from graph machine learning can be an effective way to process time-varying signals on networks. These models exploit graph neural networks and the structure of the data over time to aid modeling.
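The sensor example can be sketched with a polynomial graph filter, y = sum_k h_k S^k x, applied to the temperature signal at each time step. In the toy model below, the shift operator S is the graph Laplacian L and the filter taps [1, -0.25] give the diffusion-like update x_{t+1} = x_t - 0.25 L x_t. The three-sensor line graph, the tap values and the function names are assumptions made for illustration; in a thesis, the taps (or a full graph-time model) would be learned from data.

```python
def laplacian(adj):
    """Combinatorial graph Laplacian L = D - A."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
            for i in range(n)]


def graph_filter(shift, signal, taps):
    """Apply y = sum_k taps[k] * shift^k * signal."""
    out = [0.0] * len(signal)
    power = list(signal)                      # shift^0 * signal
    for h in taps:
        out = [o + h * p for o, p in zip(out, power)]
        # Move to the next power of the shift operator applied to the signal.
        power = [sum(s * p for s, p in zip(row, power)) for row in shift]
    return out


# Three temperature sensors on a line: 0 - 1 - 2 (assumed toy network).
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
L = laplacian(adj)
taps = [1.0, -0.25]             # hypothetical filter taps: x - 0.25 * L x

x0 = [30.0, 20.0, 20.0]         # sensor 0 is warmer than the others
x1 = graph_filter(L, x0, taps)  # signal one time step later
x2 = graph_filter(L, x1, taps)  # signal two time steps later
```

With these values, the excess temperature at sensor 0 first reaches its neighbour (x1 is [27.5, 22.5, 20.0]) and only at the next step reaches sensor 2, which is how the topology shapes the temporal evolution in this simple model.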
MSc thesis topic: In this master project, your task is to model time-varying signals via graph machine learning models. Two important questions to answer are: i) How to build an effective graph structure for modeling the underlying network? ii) How to exploit this network and graph filters to model the temporal evolution of the signal? You will work on both theoretical and practical aspects of the project and will compare your algorithm with different baseline models. Topic inspiration from the research on GNNs can be adopted here, but the dynamic nature of the data makes everything more intriguing, as we now need to capture the data relations over both the graph and time.
Online learning: When the data come in a streaming fashion, we want to train our model on the fly. However, how to best incorporate the incoming data is not well understood. We also want to characterize the impact of this online learning on the overall performance.
Physics-based learning: In this project, we want to mix physical models with graph-time models to learn from the temporal evolution of the data. The physical models also allow us to obtain more interpretable and data-efficient solutions. Characterizing their role could also be an interesting thesis topic.
#4: Recommender Systems
In recommender systems (RecSys), we wish to predict whether a user is interested in a specific item in an e-commerce system. They are widely used in almost all digital aspects of our lives, with both positive outcomes (good item recommendations on an e-platform) and negative outcomes (fake news recommendations). There are two key aspects in building successful recommender systems: i) capturing the underlying dependencies between users and items in the available scarce and noisy data; ii) incorporating the diverse and dynamic user preferences into the game. Quantifying and incorporating the latter has been a long-standing problem in RecSys research.
MSc thesis topics: Master thesis topics in this category revolve around the following streams:
Graph-based RecSys: We can represent users, items and their connections by a user-item graph or a user-user social graph. Therefore, the recommendation problem can be converted into predicting user ratings in the recommender network. Graph neural networks (GNNs) have shown remarkable results in predicting user ratings in recommender systems. We want to investigate how to incorporate the dynamic user preferences in graph-based RecSys, improve different criteria beyond accuracy (diversity, novelty, etc.), and build RecSys-tailored GNNs.
Green RecSys: RecSys in e-commerce contribute substantially to the environmental impact. In this direction, you can focus on measuring this impact, providing methods to reduce it, and nudging users towards more sustainable choices.
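As a very small sketch of how the recommendation problem becomes a prediction task on a graph, the code below estimates a missing rating by averaging the ratings given to that item by the user's neighbours in a user-user social graph. The users, items, ratings and the function name predict are all hypothetical; this one-hop average is only a baseline-style illustration, not a GNN and not the method used in the past theses.

```python
# Hypothetical data: user -> {item: rating}.
ratings = {
    "ann": {"book": 5.0, "film": 1.0},
    "bob": {"book": 4.0},
    "cat": {"film": 2.0, "game": 4.0},
}

# Hypothetical user-user social graph as an adjacency list.
social = {
    "ann": ["bob"],
    "bob": ["ann", "cat"],
    "cat": ["bob"],
}


def predict(user, item):
    """Average the item's ratings over the user's neighbours.

    Returns None when no neighbour has rated the item.
    """
    votes = [ratings[v][item] for v in social[user] if item in ratings[v]]
    return sum(votes) / len(votes) if votes else None


bob_film = predict("bob", "film")   # both neighbours of bob rated "film"
ann_game = predict("ann", "game")   # ann's only neighbour did not rate "game"
```

The second call returns None because the only path from "ann" to a rating of "game" goes through two hops, which hints at why deeper graph-based models and scarce data are central issues in this stream.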
Requirements. You should have a strong machine learning and data science background. In addition, you need to have knowledge of multimedia search and recommendation.
References. You can consult these references as a starting point:
Past MSc theses: [Lodha2023] [Kalisvaart2022] [Chandrashekar2022] [Dahrs2022] [Pocchiari2020]
#5: Data-Driven Water Management System Operations
Urban water systems are facing growing pressure from climate change and increasing demographics, forcing cities to devise new approaches to ensure water supply, sanitation and flood risk management. The progressive digitisation of the water sector allows artificial intelligence to play an important role in meeting these challenges towards a sustainable future. While conventional machine and deep learning techniques are showing promise, they ignore the structure of the urban network infrastructure. Graph-based learning techniques can effectively take the complex interrelationships of water networks and urban systems into account, allowing the development of ground-breaking data-driven solutions. We would like to use these techniques to address some of the most pressing issues in water management systems. Some illustrative examples are listed below. Consult also the webpage of AidroLab@TU Delft.
Urban Drainage Modeling with Little Data
Introduction. Creating a data-driven model of a large urban drainage system is practically intractable. Machine and deep learning have reached outstanding performances in multiple areas. However, they rely on a large amount of information to be trained. Therefore, they are limited (or even restricted) when there are not many data points. Since data collection and curation is expensive, it is desirable to create a sufficiently performing model with the least amount of data possible.
The drainage system of the city of Eindhoven has few data points in comparison to its size. How can we create a data-driven model that exploits the available information? From which point on is this possible? What are the limitations of such a model?
Objective. The objective of this project is to apply data-driven methods to create a model of a large urban drainage system. That is, given a limited amount of data, the goal is to estimate the state (i.e., the physical variables) of the system.
Requirements. Strong expertise in machine learning and programming skills in Python. Ideally some knowledge or interest in urban water engineering.
Data-Driven Adaptation Techniques
Introduction. Urban drainage systems face external pressures such as climate change, urbanisation, aging infrastructure, and others. An optimal drainage system copes with both current and uncertain future needs. However, are current solutions adaptable, or do they sacrifice future performance for current gain? Previous studies have used automatic techniques such as evolutionary algorithms or graph methods to aid in the (re)design of water networks. Recently, artificial intelligence methods, particularly machine and deep learning, have also shown great promise for similar tasks in multiple fields, including water resources.
Objective. The objective of this project is to apply classical and new optimization algorithms to develop alternatives for adaptation scenarios in an urban drainage system. Given an increase in rainfall intensity or urban densification in the coming years, how can we best improve the water network so that it maintains its performance now and in the future? This can also be explored under multiple climate change scenarios. This project entails designing, re-designing, simulating, and comparing the performance of multiple solutions.
Requirements. Programming skills in Python, knowledge in machine learning and optimisation. Ideally some knowledge or interest in urban water engineering.
Other topics under this theme that can be approached via graph neural networks include:
Forecasting water demand: In water distribution networks, forecasting the water demand is crucial for assigning resources as well as for the proper functioning of the network. However, current optimization-based approaches to this task are either simple or computationally expensive. To tackle both issues, we want to use graph neural networks that learn the demand pattern from historical data and the network topology. In addition, we also want to merge these techniques with the physical equations of the water distribution network to aid learning and further restrict the parameter space.
Detecting leaking pipes: One main issue in water distribution networks (WDNs) is pipe leakage. However, because of limited resources, only a few sensors are installed to monitor the pipe activities. These sensors and the water demand provide useful data that can be used to detect anomalies in WDNs, such as leaking pipes. We want to use graph-based machine learning techniques to detect such anomalous links from the partial measurements.
References. You can check this survey by members of AIdroLab as a starting point for all these topics: https://agupubs.onlinelibrary.wiley.com/doi/pdfdirect/10.1029/2021WR031808
Past MSc theses: [Solà Roca2023]