Organizers: Simone Scardapane, Gabriele Tolomei
The course introduces basic concepts related to explaining and debugging neural network models. In particular, we will cover feature attribution methods, data attribution methods, and counterfactual explanations. The course includes both theory and practical sessions, along with additional seminars from guest speakers.
The course is mandatory for all students of the PhD in Data Science, and open to all students of related PhD programs (e.g., the National PhD in AI). In-person attendance is recommended. The seminars are open to everyone.
🧑🏫 Classroom: B203, DIAG Department (Via Ariosto 25).
Students are asked to select a recent paper on one of the course topics and present it in front of an audience.
Lecture: "Feature Attribution Methods", Simone Scardapane (Sapienza University).
[Slides]
Lecture: "Data Attribution Methods", Simone Scardapane (Sapienza University).
[Slides]
Seminar (11 AM-12 PM): "Interpretability for Language Models: Current Trends and Applications", Gabriele Sarti (University of Groningen).
[Slides] [Video]
Lecture: "Mechanistic Interpretability", Simone Scardapane (Sapienza University).
[Slides]
Seminar (11 AM-12 PM): "Towards Interpretability-by-Design", Pietro Barbiero (USI, Switzerland).
[Slides] [Video]
Lecture: "Counterfactual Explainability", Gabriele Tolomei (Sapienza University).
[Slides]
Seminar: "Counterfactual Explainability for Graphs: Challenges and Opportunities", Flavio Giorgi (Sapienza University).
[Slides]