Thesis proposals

Here is a list of thesis topics proposed by the members of the lab. These are starting ideas from which to develop a more structured project once a topic of interest is selected. Basic knowledge of deep learning and computer vision is required to engage with these topics effectively.

Inside the Machine's Vision: Mechanistic Interpretability to Enhance Deepfake Detection

Keywords: Deepfake detection, Mechanistic interpretability, Performance enhancement

Abstract: This thesis explores how to enhance the performance of deepfake detectors based on vision-language models (VLMs). The goal is to detect deepfakes produced by modern generative models such as DALL-E 3, Midjourney, and Stable Diffusion XL. To this end, mechanistic interpretability will be used to identify which components of the model are most relevant to act on.

Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Lorenzo Cirillo (cirillo@diag.uniroma1.it)

References:

Yingdong Shi, Changming Li, Yifan Wang, Yongxiang Zhao, Anqi Pang, Sibei Yang, Jingyi Yu, and Kan Ren. Dissecting and mitigating diffusion bias via mechanistic interpretability. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 8192–8202, June 2025
Sohail Ahmed Khan and Duc-Tien Dang-Nguyen. 2024. CLIPping the Deception: Adapting Vision-Language Models for Universal Deepfake Detection. In Proceedings of the 2024 International Conference on Multimedia Retrieval (ICMR '24). Association for Computing Machinery, New York, NY, USA, 1006–1015

Understanding Representations in Optimized Vision Transformers for Monocular Depth Estimation

Keywords: Information theory, Entropy estimation, Efficient architectures

Abstract: This thesis investigates how optimizing different parts of transformer-based models affects the quality of learned representations. Focusing on encoder or decoder embeddings in monocular depth estimation, the study compares encoder, decoder, and full-model optimization. Insights can be extended to different computer vision tasks.

Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Claudio Schiavella (schiavella@diag.uniroma1.it), Lorenzo Cirillo (cirillo@diag.uniroma1.it)

References:

Kingma et al., Auto-Encoding Variational Bayes, 2013.
Tishby et al., Deep Learning and the Information Bottleneck Principle, 2015.

Learning to Detect Deepfakes in the Presence of Social Network Image Degradation

Keywords: Deepfake detection, Social network compression, Forensic traces

Abstract: This thesis investigates deepfake detection under real-world social network compression, where platform-specific encoding and block artifacts significantly distort forensic traces. We will explore both model-side robustness strategies (e.g., artifact-suppression and attention guidance) and data-side emulation techniques to reproduce realistic compression pipelines, aiming to improve generalization from lab conditions to in-the-wild content.
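
As a rough illustration of what data-side emulation can look like, the sketch below applies a JPEG-like step to a grayscale image: each 8x8 block is transformed with a DCT, its coefficients are uniformly quantized, and the block is reconstructed. This is a simplified stand-in under stated assumptions, not the pipeline of any specific platform or of the cited works: real pipelines also involve resizing, chroma subsampling, and codec-specific quantization tables. The function names and the single `step` parameter are hypothetical.

```python
import math

N = 8  # JPEG-style block size

# Orthonormal DCT-II basis matrix: C[k][n]
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def compress_block(block, step):
    """DCT one 8x8 block, quantize coefficients with a uniform step, invert."""
    ct = transpose(C)
    coeffs = matmul(matmul(C, block), ct)
    quantized = [[round(c / step) * step for c in row] for row in coeffs]
    return matmul(matmul(ct, quantized), C)


def emulate_compression(image, step):
    """Apply block-wise lossy coding to a grayscale image (nested lists).

    Assumes both image dimensions are multiples of 8.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for by in range(0, h, N):
        for bx in range(0, w, N):
            block = [row[bx:bx + N] for row in image[by:by + N]]
            rec = compress_block(block, step)
            for y in range(N):
                for x in range(N):
                    # Clip to the valid 8-bit intensity range
                    out[by + y][bx + x] = min(255.0, max(0.0, rec[y][x]))
    return out
```

A larger `step` discards more high-frequency detail and produces stronger block artifacts, which is the kind of degradation that can distort forensic traces.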

Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Simone Teglia (teglia@diag.uniroma1.it)

References:

Montibeller, A., Shullani, D., Baracchi, D., Piva, A., & Boato, G. (2025, October). Bridging the Gap: A Framework for Real-World Video Deepfake Detection via Social Network Compression Emulation. Proceedings of the 1st on Deepfake Forensics Workshop: Detection, Attribution, Recognition, and Adversarial Challenges in the Era of AI-Generated Media, 29–36. doi:10.1145/3746265.3759670
Li, M., Tao, R., Liu, Y., Tan, C., Qin, H., Li, B., … Zhao, Y. (2025). Pay Less Attention to Deceptive Artifacts: Robust Detection of Compressed Deepfakes on Online Social Networks. arXiv [cs.CV]. Retrieved from http://arxiv.org/abs/2506.20548

Enhancing and Explaining Deep Learning Models for Periodontal Disease Assessment using Diffusion Models and XAI

Keywords: Diffusion Models, Explainable AI, Medical Imaging, Dental Radiography

Abstract: This thesis aims to advance deep learning for periodontal analysis by tackling data scarcity and model opacity. We will develop a generative framework to augment the training dataset, improving model robustness and accuracy. Concurrently, we will integrate explainable AI methods to support the models' diagnostic reasoning, ensuring their outputs are transparent, clinically relevant and trustworthy.

Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Gianmarco Scarano (gianmarco.scarano@uniroma1.it), Lorenzo Cirillo (cirillo@diag.uniroma1.it)

References:

Marie, H.S., Elbaz, M., Soliman, R.s. et al. DentoMorph-LDMs: diffusion models based on novel adaptive 8-connected gum tissue and deciduous teeth loss for dental image augmentation. Sci Rep 15, 27268 (2025). https://doi.org/10.1038/s41598-025-11955-2
Glick, A., Clayton, M., Angelov, N., & Chang, J. (2022). Impact of explainable artificial intelligence assistance on clinical decision-making of novice dental clinicians. JAMIA Open, 5(2). Retrieved from https://www.scopus.com/pages/publications/85134949460

Architectural vs General Optimizations in Vision Transformers

Keywords: Attention optimization, Efficient architectures, Model compression

Abstract: This thesis compares two key approaches to optimizing Vision Transformers: architectural changes (e.g., attention modifications) and model compression techniques (e.g., pruning, quantization, distillation). It evaluates their interaction, robustness to compression, and the efficiency-accuracy trade-off across different networks and tasks.
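
To make the compression side concrete, here is a minimal sketch of two of the techniques named above, applied to a flat list of weights: magnitude pruning and symmetric uniform quantization. It is only an illustration under simple assumptions (unstructured pruning, one scale per tensor); the function names are hypothetical and not taken from the cited works.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    pruned, removed = [], 0
    for w in weights:
        # Remove at most k weights, even when several share the threshold value
        if abs(w) <= threshold and removed < k:
            pruned.append(0.0)
            removed += 1
        else:
            pruned.append(w)
    return pruned


def quantize_uniform(weights, bits=8):
    """Symmetric uniform quantization to signed integers, then dequantization."""
    max_abs = max(abs(w) for w in weights) or 1.0
    levels = 2 ** (bits - 1) - 1      # e.g. 127 for 8 bits
    scale = max_abs / levels          # one scale for the whole tensor
    return [round(w / scale) * scale for w in weights]


weights = [0.5, -0.1, 0.03, -0.8, 0.2, 0.01]
sparse = magnitude_prune(weights, sparsity=0.5)
low_precision = quantize_uniform(weights, bits=8)
```

Comparing task accuracy before and after such steps, for models with and without architectural attention changes, is one way to frame the efficiency-accuracy trade-off.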

Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Claudio Schiavella (schiavella@diag.uniroma1.it), Lorenzo Cirillo (cirillo@diag.uniroma1.it)

References:

Papa et al., A Survey on Efficient Vision Transformers: Algorithms, Techniques, and Performance Benchmarking, 2023.
Schiavella et al., Optimize vision transformer architecture via efficient attention modules: a study on the monocular depth estimation task, 2024.

Simulation of a Mulberry Plantation or Silkworm Farming

Keywords: Dataset development, 3D simulation, Green AI

Abstract: This thesis proposes the development of a 3D dataset for agricultural applications, featuring both point clouds and 3D mesh models. The dataset will support use cases such as simulation, virtual and augmented reality, and machine learning in smart farming scenarios.

In collaboration with: Tecnoseta SRL (https://www.tecnoseta.com/)

Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Claudia Melis Tonti (melistonti@diag.uniroma1.it), Claudio Schiavella (schiavella@diag.uniroma1.it)

References:

Chang et al., ShapeNet: An Information-Rich 3D Model Repository, 2015.

Semi-fragile Watermarks for Generative AI Authenticity

Keywords: Watermarking, Generative AI authenticity, Multimedia forensics

Abstract: This thesis investigates how semi-fragile watermarking affects the detection of AI-generated visual media. It evaluates the effectiveness of semi-fragile watermarks for verifying the authenticity of generated content, as well as their robustness to backdoor attacks and data poisoning.
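
The semi-fragile property can be illustrated with quantization index modulation (QIM), a classical embedding scheme used here purely as an example and not necessarily the method of the cited works. Each bit is embedded by snapping a coefficient to one of two interleaved lattices; distortions smaller than a quarter of the step leave the bit readable, while larger changes can flip it. The function names and the step of 8 are assumptions made for this sketch.

```python
def embed_bit(value, bit, step=8.0):
    """Snap `value` to the nearest point of the lattice assigned to `bit`.

    Bit 0 uses multiples of `step`; bit 1 uses the same lattice shifted by step/2.
    """
    offset = bit * step / 2
    return round((value - offset) / step) * step + offset


def extract_bit(value, step=8.0):
    """Decode the bit whose lattice lies closest to `value`."""
    d0 = abs(value - embed_bit(value, 0, step))
    d1 = abs(value - embed_bit(value, 1, step))
    return 0 if d0 <= d1 else 1


bits = [1, 0, 1, 1, 0]
coefficients = [103.0, 57.0, 200.0, 12.0, 90.0]
marked = [embed_bit(c, b) for c, b in zip(coefficients, bits)]

# Mild distortion (below step/4): the watermark is still decodable
mild = [extract_bit(m + 1.5) for m in marked]

# Strong modification (step/2): the decoded bits no longer match
tampered = [extract_bit(m + 4.0) for m in marked]
```

In this toy setting `mild` matches `bits` while `tampered` does not, which is the behaviour a semi-fragile scheme relies on to separate benign processing from tampering.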

Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Giuseppe Daidone (giuseppe.daidone@uniroma1.it)

References:

Deng, J., Lin, C., Zhao, Z., Liu, S., Peng, Z., Wang, Q., & Shen, C. (2025). A Survey of Defenses Against AI-Generated Visual Media: Detection, Disruption, and Authentication. ACM Computing Surveys. https://doi.org/10.1145/3770916
Yuan, Z., Zhang, X., Wang, Z., & Yin, Z. (2024). Semi-fragile neural network watermarking for content authentication and tampering localization. Expert Systems with Applications, 236, 121315–121315. https://doi.org/10.1016/j.eswa.2023.121315

Geometric Algebra for 3D Human Face Movement Representation

Keywords: Geometric Algebra, Equivariant Transformers, 3D Face Reconstruction

Abstract: The objective of this thesis is to use Geometric Algebra (GA) to determine whether a 3D representation is accurate. GA is a mathematical framework for geometric computations. It represents data as multivectors and describes both geometric objects and their transformations in three-dimensional space.
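
For readers new to GA, the sketch below shows how a multivector in 3D Euclidean GA can be stored as 8 coefficients and how a rotation is expressed as a rotor sandwich product. It is a minimal illustration under our own conventions (basis blades indexed by bitmask, with bit 0 for e1, bit 1 for e2, bit 2 for e3), not the representation used by the cited works.

```python
import math


def blade_sign(a, b):
    """Sign produced by reordering basis vectors when multiplying blades a and b."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1.0 if swaps & 1 else 1.0


def geometric_product(x, y):
    """Geometric product of two multivectors (8 coefficients, Euclidean metric)."""
    out = [0.0] * 8
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue
        for j, yj in enumerate(y):
            # XOR of bitmasks gives the resulting blade; repeated vectors square to +1
            out[i ^ j] += blade_sign(i, j) * xi * yj
    return out


def reverse(x):
    """Reversion: bivectors (indices 3, 5, 6) and the trivector (7) change sign."""
    return [x[0], x[1], x[2], -x[3], x[4], -x[5], -x[6], -x[7]]


def rotate_in_e12_plane(v, theta):
    """Rotate the 3D vector v by angle theta in the e1-e2 plane using a rotor."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    rotor = [c, 0.0, 0.0, -s, 0.0, 0.0, 0.0, 0.0]        # c - s * e12
    vec = [0.0, v[0], v[1], 0.0, v[2], 0.0, 0.0, 0.0]     # embed v as grade-1 part
    out = geometric_product(geometric_product(rotor, vec), reverse(rotor))
    return (out[1], out[2], out[4])


rotated = rotate_in_e12_plane((1.0, 0.0, 0.0), math.pi / 2)  # approximately (0, 1, 0)
```

The same product handles points, planes, and rotations uniformly, which is what makes multivectors attractive as inputs to equivariant architectures.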

Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Lorenzo Cirillo (cirillo@diag.uniroma1.it), Claudia Melis Tonti (melistonti@diag.uniroma1.it), Claudio Schiavella (schiavella@diag.uniroma1.it)

References:

Johann Brehmer, Pim de Haan, Sönke Behrends, and Taco Cohen (2023). Geometric Algebra Transformer. https://doi.org/10.48550/arXiv.2305.18415
Evangelos Sariyanidi, Claudio Ferrari, Federico Nocentini, Stefano Berretti, Andrea Cavallaro, and Birkan Tunc (2025). 3D Face Reconstruction Error Decomposed: A Modular Benchmark for Fair and Fast Method Evaluation. https://doi.org/10.48550/arXiv.2505.18025

Thesis Collaborations with External Companies
