Artificial Intelligence (AI) has rapidly expanded into critical domains such as cybersecurity, natural language processing, and medicine. However, AI systems often prioritize predictive accuracy without accounting for adversarial manipulations or data noise, raising serious concerns about their trustworthiness in real-world applications. A growing body of research and numerous incidents of harmful or unintended AI behavior have amplified public skepticism, especially in Europe.
In response, the European Union introduced the AI Act (2024), the first comprehensive legal framework for regulating AI. Among its key principles, the Act mandates robustness and cybersecurity safeguards for high-risk AI systems, explicitly addressing emerging adversarial threats. These requirements are not only technical challenges but also legal obligations, shaping how AI systems must be designed, validated, and maintained to ensure safety, accountability, and compliance. Therefore, AI developers and researchers must understand potential vulnerabilities and the regulatory steps required for adherence.
To bridge this gap, we propose a tutorial that systematically explores security threats targeting AI, including adversarial examples that induce misclassification or indecision, data poisoning attacks that inject ethical biases or create backdoors, and jailbreak techniques that manipulate generative models into producing harmful or toxic content. We will highlight the practical and regulatory significance of AI robustness, with a particular focus on the EU AI Act and related European initiatives. The tutorial will also examine state-of-the-art adversarial defense strategies and explore emerging verification tools that can be used to assess both formal and empirical robustness guarantees in compliance with evolving regulations. The session will conclude with guidelines for future advancements in adversarial robustness, fostering the development of more secure and trustworthy AI systems.
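To make the threat concrete, the sketch below illustrates the simplest flavor of adversarial example generation, a single fast gradient sign method (FGSM) step against a generic PyTorch classifier. It is a minimal illustration only: the model, images, and labels are placeholders, and the tutorial itself covers substantially stronger attacks and evaluation protocols.

    # Minimal FGSM sketch (illustrative): perturb an input within an L-infinity
    # budget so that a trained classifier is pushed toward a wrong prediction.
    import torch
    import torch.nn as nn

    def fgsm_example(model, x, y, epsilon=8 / 255):
        """Return an adversarially perturbed copy of x under an L-infinity budget epsilon."""
        model.eval()
        x_adv = x.clone().detach().requires_grad_(True)
        loss = nn.CrossEntropyLoss()(model(x_adv), y)
        loss.backward()
        # Take one step in the direction that increases the loss, then clip to valid pixel range.
        x_adv = x_adv + epsilon * x_adv.grad.sign()
        return x_adv.clamp(0.0, 1.0).detach()

    # Usage sketch (model, images, and labels are placeholders):
    # adv_images = fgsm_example(model, images, labels)
    # print((model(adv_images).argmax(1) != labels).float().mean())  # attack success rate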
This tutorial explores robustness and adversarial threats in AI with a focus on compliance with the EU AI Act. Attendees will gain insights into cutting-edge defense strategies, verification techniques, and best practices for ensuring AI robustness and security.
The session is structured into five key parts, each designed to provide both foundational knowledge and practical guidance:
1. Opening & Motivation
Welcome and tutorial goals.
Real-world examples of AI failures in high-risk settings.
Trust, safety, and regulatory challenges.
Introduction to the EU AI Act: motivation, scope, and robustness requirements.
2. Understanding Adversarial Threats in ML
High-level taxonomy of adversarial threats.
Discussion of evaluation metrics and threat modeling.
3. Mitigation Strategies & Regulatory Alignment
Overview of defense techniques in practice.
Compliance-oriented defenses aligned with the AI Act.
Emerging tools for empirical and formal robustness verification (a minimal verification sketch follows this outline).
4. Future Directions & Best Practices
Current limitations of defenses and verification.
Research challenges in adversarial robustness.
Best practices for AI developers and policymakers.
5. Summary & Outlook
Recap of key concepts and methods.
The road ahead: Towards secure, trustworthy, and regulation-compliant AI.
Q&A and closing remarks.
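To give a flavor of the formal verification tools touched on in part 3, the sketch below applies simple interval bound propagation to a tiny, randomly initialized ReLU network and checks whether its prediction is provably stable within an L-infinity ball of radius eps. This is a toy illustration under assumed random weights; the verifiers discussed in the tutorial use far tighter relaxations and scale to realistic models.

    # Illustrative interval bound propagation (IBP): certify that no input within
    # an L-infinity ball of radius eps changes the predicted class of a tiny
    # two-layer ReLU network. Weights here are random placeholders.
    import numpy as np

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)   # hidden layer
    W2, b2 = rng.normal(size=(3, 8)), rng.normal(size=3)   # output layer (3 classes)

    def affine_bounds(lo, hi, W, b):
        # Interval arithmetic for W @ x + b when x lies element-wise in [lo, hi].
        center, radius = (lo + hi) / 2, (hi - lo) / 2
        c = W @ center + b
        r = np.abs(W) @ radius
        return c - r, c + r

    def certify(x, eps):
        lo, hi = x - eps, x + eps
        lo, hi = affine_bounds(lo, hi, W1, b1)
        lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)       # ReLU is monotone
        lo, hi = affine_bounds(lo, hi, W2, b2)
        pred = np.argmax(W2 @ np.maximum(W1 @ x + b1, 0) + b2)
        # Robust if the predicted class's lower bound beats every other class's upper bound.
        return all(lo[pred] > hi[j] for j in range(len(hi)) if j != pred)

    x = rng.normal(size=4)
    print(certify(x, eps=0.01))  # True only when robustness is formally guaranteed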
Lorenzo Cazzaro is a Postdoctoral Researcher in Computer Science at Ca’ Foscari University of Venice. He earned his Ph.D. with honors from the same university in 2021 under the supervision of Prof. Stefano Calzavara. In 2023, he interned at CISPA Helmholtz Center for Information Security under Prof. Giancarlo Pellegrino.
His research focuses on verifying security properties of machine learning models and on ML model watermarking, with broader interests in adversarial ML and AI applications in cybersecurity. His work has advanced security verification for ML models on tabular data, explored ethical aspects of AI such as fairness, and investigated ML-driven web application exploration and failure detection. He has published in top-tier venues such as IEEE S&P, ACM CCS, EDBT, and Computers & Security.
Email: lorenzo.cazzaro@unive.it
Antonio Emanuele Cinà is a Tenure-Track Researcher (RTDA) at the University of Genoa and a member of the SAIfer Lab. He was previously a Postdoctoral Researcher at CISPA Helmholtz Center for Information Security. He holds a Ph.D. with honors from Ca' Foscari University of Venice, where he also earned his Bachelor's and Master's degrees in Computer Science.
His research focuses on machine learning security and AI applications in cybersecurity, contributing to adversarial threat mitigation, robust benchmarking, and AI trustworthiness. He has also explored AI-driven techniques for detecting cybercriminals and scammers. His work has been published in top-tier venues such as ICLR, AAAI, USENIX, Pattern Recognition, and ACM Computing Surveys.
Email: antonio.cina@unige.it
Battista Biggio and Fabio Roli. Wild patterns: Ten years after the rise of adversarial machine learning. Pattern Recognition, 2018.
Antonio Emanuele Cinà, Kathrin Grosse, Ambra Demontis, Sebastiano Vascon, Werner Zellinger, Bernhard Alois Moser, Alina Oprea, Battista Biggio, Marcello Pelillo, and Fabio Roli. Wild patterns reloaded: A survey of machine learning security against training data poisoning. ACM Computing Surveys, 55(13s):294:1–294:39, 2023.
Alina Oprea and Apostol Vassilev. Adversarial machine learning: A taxonomy and terminology of attacks and mitigations. Technical report, National Institute of Standards and Technology, 2023.
Philipp Guldimann, Alexander Spiridonov, Robin Staab, Nikola Jovanovic, Mark Vero, Velko Vechev, Anna Gueorguieva, Mislav Balunovic, Nikola Konstantinov, Pavol Bielik, Petar Tsankov, and Martin T. Vechev. COMPL-AI framework: A technical interpretation and LLM benchmarking suite for the EU Artificial Intelligence Act. arXiv preprint, 2024.
Nicholas Carlini and David A. Wagner. Towards evaluating the robustness of neural networks. IEEE Symposium on Security and Privacy, 2017.
Antonio Emanuele Cinà, Jérôme Rony, Maura Pintor, Luca Demetrio, Ambra Demontis, Battista Biggio, Ismail Ben Ayed, and Fabio Roli. AttackBench: Evaluating gradient-based attacks for adversarial examples. Proceedings of the AAAI Conference on Artificial Intelligence, 39(3), 2025.
Xianmin Wang, Jing Li, Xiaohui Kuang, Yu-an Tan, and Jin Li. The security of machine learning in an adversarial setting: A survey. Journal of Parallel and Distributed Computing, 2019.
Caterina Urban and Antoine Miné. A review of formal methods applied to machine learning. arXiv preprint arXiv:2104.02466, 2021.
Stefano Calzavara, Lorenzo Cazzaro, Giulio Ermanno Pibiri, and Nicola Prezza. Verifiable learning for robust tree ensembles. ACM Conference on Computer and Communications Security, 2023.
Stefano Calzavara, Lorenzo Cazzaro, Claudio Lucchese, and Giulio Ermanno Pibiri. Verifiable boosted tree ensembles. IEEE Symposium on Security and Privacy, 2025.