Seminars are organized as part of the PhD course; attendance is open to everyone. Guest speakers will join remotely.
🧑🏫 Classroom: B203, DIAG Department (Via Ariosto 25).
November 5, 2024, 11 AM–12 PM
Abstract: In this presentation, I will provide an overview of the interpretability research landscape and describe several promising methods for exploring and controlling the inner mechanisms of generative language models. I will focus specifically on post-hoc attribution techniques and their use in identifying relevant input and model components, showcasing them with our open-source Inseq toolkit. A practical application of attribution techniques will be presented through the PECoRe data-driven framework for context usage attribution and its adaptation to produce internals-based citations for model answers in retrieval-augmented generation settings (MIRAGE).
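For readers unfamiliar with post-hoc attribution, the minimal sketch below shows what an input attribution call might look like with the Inseq toolkit mentioned in the abstract; the model (`gpt2`), the attribution method (`integrated_gradients`), and the example prompt are illustrative choices, not necessarily those used in the talk, and the exact API may vary across Inseq versions.

```python
import inseq

# Load a Hugging Face causal language model together with an attribution method
# (model and method are illustrative; any supported pair should work)
model = inseq.load_model("gpt2", "integrated_gradients")

# Attribute the model's generated continuation to the input tokens,
# producing per-token importance scores
out = model.attribute("The capital of France is")

# Visualize which input tokens most influenced each generated token
out.show()
```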
November 7, 2024, 11 AM–12 PM
Abstract: Deep learning models are inherently opaque, making it difficult to understand their decision-making processes. Post-hoc explainable AI (XAI) methods aim to offer explanations for these models, but such explanations are often brittle and do not give experts reliable ways to intervene in or adjust the trained models. Interpretability by design seeks to address this issue by building models that maintain the same predictive performance as opaque models but are directly understandable without relying on post-hoc methods.