Program

Schedule

Thursday:

  • 9am-10am: Joao Marques-Silva

  • 10am-10:15am: coffee break

  • 10:15am-11:15am: Alexandre Benoit

  • 11:15am-12:15pm: Vasileios Mezaris

  • 12:15pm-2pm: lunch break

  • 2pm-2:30pm: Sebastian Bordt

  • 2:30pm-3pm: Hugo Sénétaire

  • 3pm-3:30pm: Gianluigi Lopardo

  • 3:30pm-3:45pm: coffee break

  • 3:45pm-4:15pm: Martin Pawelczyk

  • 4:15pm-4:45pm: Tristan Gomez

  • 4:45pm-5:15pm: Gabriele Ciravegna

Friday:

  • 9am-10am: Jean-Michel Loubes

  • 10am-10:15am: coffee break

  • 10:15am-11:15am: Yann Chevaleyre

  • 11:15am-12:15pm: Jenny Benois-Pineau

  • 12:15pm-2pm: lunch break

  • 2pm-2:30pm: Salim Amoukou

  • 2:30pm-3pm: Giorgio Visani

  • 3pm-3:30pm: Hidde Fokkema

  • 3:30pm-3:45pm: coffee break

  • 3:45pm-4:15pm: Mara Graziani

  • 4:15pm-4:45pm: Pietro Barbiero

Thursday

Jenny Benois-Pineau: FEM and MLFEM post-hoc explainers for CNNs and their evaluation with reference-based and no-reference quality metrics

In this talk we will present two methods for explaining trained CNN models that we recently developed: FEM and MLFEM. These methods are based on evaluating the strength of features in the deepest convolutional layer (FEM) or in several convolutional layers (MLFEM) of a CNN performing an image classification task. Without loss of generality, the methods can be applied to any data classified with a CNN. Furthermore, we propose evaluation methods for the quality of the obtained explanation maps. We consider two general approaches to quality metric design: reference-based and no-reference. In the case of image classification, as a reference we consider Gaze Fixation Density Maps built from the gaze fixations of observers who participated in a psycho-visual experiment with the goal of recognizing a visual scene. As quality metrics, we use those proposed in vision research for the comparison of saliency maps. The no-reference method, by D. Alvarez-Melis and T. Jaakkola, is based on the computation of a Lipschitz constant. We study the behavior of this metric as a function of the strength of degradations applied to the original images. Furthermore, we explore the correlation between the reference-based and no-reference metrics. Our experimental studies show that FEM and MLFEM outperform reference explainers, such as Grad-CAM, in terms of both reference-based and no-reference metrics.
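
As a rough illustration of the no-reference idea (not the authors' code), the sketch below estimates a local Lipschitz-style stability score for an arbitrary saliency-map explainer by sampling small perturbations of the input; `explain`, the toy explainer and the perturbation radius are placeholders.

```python
# A minimal sketch, assuming any explainer that maps an input to a saliency map.
# The metric estimates a local Lipschitz constant: how much the explanation can
# change relative to a small change of the input (lower = more stable).
import numpy as np

def local_lipschitz_estimate(explain, x, radius=0.05, n_samples=50, seed=0):
    """Estimate max ||explain(x') - explain(x)|| / ||x' - x|| over a small ball around x."""
    rng = np.random.default_rng(seed)
    base = explain(x).ravel()
    worst = 0.0
    for _ in range(n_samples):
        x_pert = x + rng.uniform(-radius, radius, size=x.shape)
        ratio = np.linalg.norm(explain(x_pert).ravel() - base) / (np.linalg.norm(x_pert - x) + 1e-12)
        worst = max(worst, ratio)
    return worst

# Toy explainer (gradient-times-input of a fixed linear scorer), purely illustrative.
w = np.array([0.2, -1.0, 0.5])
toy_explainer = lambda x: w * x
print(local_lipschitz_estimate(toy_explainer, np.array([1.0, 2.0, 3.0])))
```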

Joao Marques-Silva: Logic-Based Explainability in Machine Learning

The forecast applications of machine learning (ML) in high-risk and safety-critical domains hinge on systems that are robust in their operation and whose decisions can be understood, and so trusted. Most ML models are neither robust nor understandable. This talk gives a broad overview of ongoing efforts in applying logic-enabled automated reasoning tools for explaining black-box ML models. The talk details the computation of rigorous explanations for the predictions made by black-box models, and illustrates how these serve to assess the quality of widely used heuristic explanation approaches. Finally, the talk briefly overviews a number of emerging topics of research in logic-enabled explainability.
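
To make the notion of a rigorous explanation concrete, here is a deliberately naive, brute-force sketch of a subset-minimal sufficient reason for a toy Boolean model; real logic-based explainers delegate the entailment check to SAT/SMT/MILP reasoners instead of enumerating completions, so this is only an illustration of the definition.

```python
# A hedged sketch: find a subset-minimal set of features of instance x whose
# values alone force the model's prediction, checking all completions of the
# remaining (free) features exhaustively.
from itertools import combinations, product

def is_sufficient(model, x, subset, domains):
    free = [i for i in range(len(x)) if i not in subset]
    target = model(x)
    for values in product(*[domains[i] for i in free]):
        z = list(x)
        for i, v in zip(free, values):
            z[i] = v
        if model(z) != target:
            return False
    return True

def minimal_sufficient_reason(model, x, domains):
    for size in range(len(x) + 1):
        for subset in combinations(range(len(x)), size):
            if is_sufficient(model, x, set(subset), domains):
                return subset
    return tuple(range(len(x)))

# Toy model: predicts 1 iff (x0 AND x1) OR x2, on boolean features.
model = lambda z: int((z[0] and z[1]) or z[2])
print(minimal_sufficient_reason(model, [1, 1, 0], domains=[(0, 1)] * 3))  # -> (0, 1)
```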

Vasileios Mezaris: Explaining the decisions of image/video classifiers

We will start by discussing the main classes of explainability approaches for image and video classifiers. Then, we will focus on two distinct problems: learning how to derive explanations for the decisions of a legacy (trained) image classifier, and designing a classifier for video event recognition that can also deliver explanations for its decisions. Technical details of our proposed solutions to these two problems will be presented. Besides quantitative results concerning the goodness of the derived explanations, qualitative examples will also be discussed in order to provide insight into the reasons behind classification errors, including possible dataset biases affecting the trained classifiers.

Martin Pawelczyk: On the Trade-Off between Actionable Explanations and the Right to be Forgotten

As machine learning (ML) models are increasingly being deployed in high-stakes applications, policymakers have suggested tighter data protection regulations (e.g., GDPR, CCPA). One key principle is the "right to be forgotten", which gives users the right to have their data deleted. Another key principle is the right to an actionable explanation, also known as algorithmic recourse, allowing users to reverse unfavorable decisions. To date, it is unknown whether these two principles can be operationalized simultaneously. Therefore, we introduce and study the problem of recourse invalidation in the context of data deletion requests. More specifically, we theoretically and empirically analyze the behavior of popular state-of-the-art algorithms and demonstrate that the recourses generated by these algorithms are likely to be invalidated if a small number of data deletion requests (e.g., 1 or 2) warrant updates of the predictive model. For the setting of linear models and overparameterized neural networks -- studied through the lens of neural tangent kernels (NTKs) -- we suggest a framework to identify a minimal subset of critical training points which, when removed, maximize the fraction of invalidated recourses. Using our framework, we empirically show that the removal of as few as two data instances from the training set can invalidate up to 95 percent of all recourses output by popular state-of-the-art algorithms. Thus, our work raises fundamental questions about the compatibility of the "right to an actionable explanation" with the "right to be forgotten", while also providing constructive insights on the determining factors of recourse robustness.
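
The following toy experiment (a hypothetical setup with synthetic data and a logistic regression, not the paper's algorithms) illustrates the phenomenon: compute a recourse for a rejected point, delete two training points, retrain, and check whether the recourse is still honoured.

```python
# A minimal sketch of recourse invalidation under data deletion.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def fit(X, y):
    return LogisticRegression().fit(X, y)

def recourse(model, x, step=0.1):
    # Move along the weight vector until the prediction flips to the favourable class 1.
    direction = model.coef_[0] / np.linalg.norm(model.coef_[0])
    while model.predict([x])[0] == 0:
        x = x + step * direction
    return x

model = fit(X, y)
x_rejected = np.array([-1.0, -1.0])
x_cf = recourse(model, x_rejected)

# Crude stand-in for the paper's critical-point selection: delete the two
# training points closest to the decision boundary and retrain.
margins = np.abs(model.decision_function(X))
keep = np.argsort(margins)[2:]
model_after = fit(X[keep], y[keep])
print("recourse still valid:", model_after.predict([x_cf])[0] == 1)
```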

Tristan Gomez: Metrics for saliency maps faithfulness evaluation: an application to embryo stage identification

Due to the black-box nature of deep learning models, many solutions for visually explaining CNNs have recently been developed. To evaluate the faithfulness of these explanations, various metrics have been introduced. First, we critically analyze the Deletion Area Under Curve (DAUC) and Insertion Area Under Curve (IAUC) metrics proposed by Petsiuk et al. (2018). These metrics were designed to evaluate the faithfulness of saliency maps generated by generic methods such as Grad-CAM or RISE. We show that DAUC and IAUC suffer from two issues: (1) they generate out-of-distribution samples and (2) they ignore the saliency scores. To complement DAUC/IAUC, we propose new metrics that quantify the sparsity and the calibration of explanation methods, two previously unstudied properties. Next, we study the behavior of faithfulness metrics applied to the problem of embryo stage identification. We benchmark attention models and post-hoc methods and further show empirically that (1) the metrics produce low overall agreement on the model ranking and (2) depending on the metric approach, either post-hoc methods or attention models are favored. We conclude with general remarks about the difficulty of defining faithfulness and the necessity of understanding its relationship with the type of approach that is favored.
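
For concreteness, here is a hedged sketch of a deletion-style faithfulness curve in the spirit of DAUC; `predict_proba`, `image`, `saliency` and the zero baseline are placeholders for any classifier, input and explanation map.

```python
# A minimal sketch: remove pixels in decreasing order of saliency, record the
# class score after each removal, and report the area under that curve
# (lower = the saliency map pointed at evidence the model actually used).
# Assumes predict_proba(image) returns a 1D vector of class probabilities.
import numpy as np

def deletion_auc(predict_proba, image, saliency, target, steps=50, baseline=0.0):
    order = np.argsort(saliency.ravel())[::-1]          # most salient pixels first
    flat = image.ravel().copy()
    scores = [predict_proba(flat.reshape(image.shape))[target]]
    chunk = max(1, len(order) // steps)
    for start in range(0, len(order), chunk):
        flat[order[start:start + chunk]] = baseline      # "delete" this chunk of pixels
        scores.append(predict_proba(flat.reshape(image.shape))[target])
    s = np.asarray(scores)
    return float(np.sum((s[:-1] + s[1:]) / 2) / (len(s) - 1))   # trapezoidal area on [0, 1]
```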

Sebastian Bordt: From Shapley Values to Generalized Additive Models and back

In explainable machine learning, local post-hoc explanation algorithms and inherently interpretable models are often seen as competing approaches. In this work, we offer a novel perspective on Shapley Values, a prominent post-hoc explanation technique, and show that it is strongly connected with Glassbox-GAMs, a popular class of interpretable models. We introduce n-Shapley Values, a natural extension of Shapley Values that explain individual predictions with interaction terms up to order n. As n increases, the n-Shapley Values converge towards the Shapley-GAM, a uniquely determined decomposition of the original function. From the Shapley-GAM, we can compute Shapley Values of arbitrary order, which gives precise insights into the limitations of these explanations. We then show that n-Shapley Values recover generalized additive models of order n, assuming that we allow for interaction terms up to order n in the explanations. This implies that the original Shapley Values recover Glassbox-GAMs. At the technical end, we show that there is a one-to-one correspondence between different ways to choose the value function and different functional decompositions of the original function. This provides a novel perspective on the question of how to choose the value function. We also present an empirical analysis of the degree of variable interaction that is present in various standard classifiers, and discuss the implications of our results for algorithmic explanations. A python package to compute n-Shapley Values and replicate the results in this paper is available here.
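
To fix ideas, the following is a plain brute-force computation of (order-1) Shapley Values for a toy model with an interaction term; the mean-imputation value function used here is only one possible choice, which is exactly the point of the correspondence result mentioned above.

```python
# Exact Shapley Values by enumerating all coalitions (only feasible for tiny d).
from itertools import combinations
from math import factorial
import numpy as np

def shapley_values(f, x, background):
    d = len(x)
    def value(S):                                # v(S): fix features in S, average out the rest
        z = background.copy()
        z[:, list(S)] = x[list(S)]
        return f(z).mean()
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            for S in combinations(others, k):
                w = factorial(k) * factorial(d - k - 1) / factorial(d)
                phi[i] += w * (value(S + (i,)) - value(S))
    return phi

# Toy model with an interaction term, so higher-order attributions would differ.
f = lambda Z: Z[:, 0] + 2 * Z[:, 1] * Z[:, 2]
background = np.random.default_rng(0).normal(size=(100, 3))
print(shapley_values(f, np.array([1.0, 1.0, 1.0]), background))
```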

Hugo Sénétaire: Casting explainability as statistical inference

A wide variety of model explanation approaches have been proposed recently, all guided by very different rationales and heuristics. We take a new route and cast interpretability as a statistical inference problem. A general deep probabilistic model is designed to produce interpretable predictions. The model’s parameters can be learned via maximum likelihood, and the method can be adapted to any predictor network architecture and any type of prediction problem. Our method is a case of amortized interpretability models, where a neural network is used as a selector to allow for fast interpretation at inference time. Several popular interpretability methods are shown to be cases of regularised maximum likelihood for our general model. We propose new datasets with ground truth selection which allow for evaluating the features' importance map. Using these datasets, we show experimentally that using multiple imputations provides a more reasonable interpretation.
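
A very loose sketch of the amortized-selector idea, under simplifying assumptions (flat inputs and a single Gaussian imputation instead of the multiple-imputation scheme discussed in the talk); the architecture and hyperparameters below are illustrative, not the paper's model.

```python
# One network proposes a per-feature mask, masked-out features are imputed, and
# a predictor is trained by maximum likelihood through the relaxed mask.
import torch
import torch.nn as nn

d, n_classes = 784, 10
selector = nn.Sequential(nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, d))
predictor = nn.Sequential(nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, n_classes))
opt = torch.optim.Adam(list(selector.parameters()) + list(predictor.parameters()), lr=1e-3)

def step(x, y, temperature=0.5):
    logits = selector(x)                                   # per-feature selection logits
    mask = torch.distributions.RelaxedBernoulli(torch.tensor(temperature), logits=logits).rsample()
    imputed = torch.randn_like(x)                          # crude single imputation
    x_masked = mask * x + (1 - mask) * imputed
    loss = nn.functional.cross_entropy(predictor(x_masked), y)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

loss = step(torch.randn(32, d), torch.randint(0, n_classes, (32,)))
# At inference, torch.sigmoid(selector(x)) serves as a fast per-instance importance map.
```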

Gianluigi Lopardo: A Sea of Words: An In-Depth Analysis of Anchors for Text Data

Anchors (Ribeiro et al., 2018) is a post-hoc, rule-based interpretability method. For text data, it proposes to explain a decision by highlighting a small set of words (an anchor) such that the model to explain has similar outputs when they are present in a document. We present the first theoretical analysis of Anchors, considering that the search for the best anchor is exhaustive. After formalizing the algorithm for text classification, we present explicit results on different classes of models when the preprocessing step is TF-IDF vectorization, including elementary if-then rules and linear classifiers. We then leverage this analysis to gain insights on the behavior of Anchors for any differentiable classifier. For neural networks, we empirically show that the words corresponding to the highest partial derivatives of the model with respect to the input, reweighted by the inverse document frequencies, are selected by Anchors.
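
The empirical observation in the last sentence can be illustrated with a small, hypothetical setup: a linear classifier on TF-IDF features, where the gradient reweighted by inverse document frequencies yields candidate anchor words (this is a proxy, not the exhaustive Anchors search itself).

```python
# Rank the words present in a document by gradient * inverse document frequency.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["good movie great plot", "terrible movie bad acting",
        "great acting good plot", "bad plot terrible pacing"]
labels = [1, 0, 1, 0]

vec = TfidfVectorizer()
X = vec.fit_transform(docs).toarray()
clf = LogisticRegression().fit(X, labels)

doc = "great movie bad pacing"
x = vec.transform([doc]).toarray()[0]
grad = clf.coef_[0]                          # exact gradient of the linear logit
scores = grad * vec.idf_ * (x > 0)           # reweight by IDF, restrict to words present
vocab = np.array(vec.get_feature_names_out())
print(vocab[np.argsort(scores)[::-1][:2]])   # candidate anchor words for class 1
```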

Gabriele Ciravegna: Entropy-Based Logic Explanations of Neural Networks

Explainable artificial intelligence has rapidly emerged since lawmakers have started requiring interpretable models for safety-critical domains. Concept-based neural networks have arisen as explainable-by-design methods as they leverage human-understandable symbols (i.e. concepts) to predict class memberships. However, most of these approaches focus on the identification of the most relevant concepts but do not provide concise, formal explanations of how such concepts are leveraged by the classifier to make predictions. In this paper, we propose a novel end-to-end differentiable approach enabling the extraction of logic explanations from neural networks using the formalism of First-Order Logic. The method relies on an entropy-based criterion which automatically identifies the most relevant concepts. We consider four different case studies to demonstrate that: (i) this entropy-based criterion enables the distillation of concise logic explanations in safety-critical domains, from clinical data to computer vision; (ii) the proposed approach outperforms state-of-the-art white-box models in terms of classification accuracy and matches black-box performance.
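
As a hedged illustration (not the paper's exact layer), an entropy-style gate over concept activations could look as follows: one learnable score per concept, a softmax attention over concepts, and an entropy penalty that pushes the classifier to rely on few concepts, which can then appear in a short logic rule.

```python
# Illustrative entropy-regularised concept selection, not the published architecture.
import torch
import torch.nn as nn

class EntropyConceptGate(nn.Module):
    def __init__(self, n_concepts, temperature=0.7):
        super().__init__()
        self.scores = nn.Parameter(torch.zeros(n_concepts))
        self.temperature = temperature

    def forward(self, concepts):                     # concepts: (batch, n_concepts) in [0, 1]
        alpha = torch.softmax(self.scores / self.temperature, dim=0)
        entropy = -(alpha * torch.log(alpha + 1e-12)).sum()
        return concepts * alpha, entropy             # gated concepts + entropy penalty term

gate = EntropyConceptGate(n_concepts=8)
head = nn.Linear(8, 2)
gated, entropy = gate(torch.rand(16, 8))
loss = nn.functional.cross_entropy(head(gated), torch.randint(0, 2, (16,))) + 0.1 * entropy
```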

Friday

Jean-Michel Loubes: Explainability of a Model under stress

We propose to study another type of explanation: the response of an algorithm when confronted with constraints on the test distribution. In order to avoid outliers, we consider distributions that satisfy a stress constraint while remaining as close as possible to the original distribution. We define entropic projections under constraints that satisfy such conditions and thus provide theoretical guarantees for such models. The method is analysed here.
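
As an assumed illustration of the construction (the talk may use a different formulation), the entropic projection under a moment-type stress constraint is the exponential tilting of the data distribution:

```latex
% Among all test distributions Q satisfying a stress constraint E_Q[\Phi(X)] = m,
% the one closest to the data distribution P in Kullback--Leibler divergence is
% an exponential tilting of P:
\begin{align*}
Q^\star &= \arg\min_{Q \,:\, \mathbb{E}_Q[\Phi(X)] = m} \operatorname{KL}(Q \,\|\, P),\\
\frac{dQ^\star}{dP}(x) &= \frac{\exp\!\big(\langle \lambda^\star, \Phi(x)\rangle\big)}
{\mathbb{E}_P\!\big[\exp\!\big(\langle \lambda^\star, \Phi(X)\rangle\big)\big]},
\end{align*}
% where \lambda^\star is chosen so that E_{Q^\star}[\Phi(X)] = m holds. The model's
% behaviour under stress is then read off from its predictions under Q^\star.
```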

Yann Chevaleyre: Learning interpretable scoring rules

Interpretability is a rather old topic in machine learning which has recently gained a lot of traction. In this talk, we will discuss the need for interpretability in machine learning and what is meant by interpretability. Then, we will present the problem of learning interpretable scoring rules and show how it can be relaxed into a standard convex optimization problem. Finally, we will show some applications.
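
A hedged illustration of the relaxation idea, with a simple rounding heuristic standing in for the relaxation presented in the talk: learning a small-integer scoring rule directly is combinatorial, but one can solve a convex, sparsity-regularised surrogate and round its coefficients to integer points.

```python
# Convex surrogate for an integer scoring rule: L1-regularised logistic regression,
# rounded to integer "points" per feature (illustrative only).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)

relaxed = LogisticRegression(penalty="l1", C=0.05, solver="liblinear").fit(X, y)
points = np.round(5 * relaxed.coef_[0]).astype(int)   # integer points per feature
threshold = -5 * relaxed.intercept_[0]                # keep the intercept on the same scale

accuracy = ((X @ points > threshold).astype(int) == y).mean()
print("non-zero features:", int((points != 0).sum()), "| training accuracy:", round(accuracy, 3))
```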

Alexandre Benoit: Explainable AI for Earth Observation

Earth Observation (EO), like other domains, has seen impressive advances thanks to the availability of abundant data, modern AI methods and, more specifically, deep neural networks. However, most of the available EO data is unlabelled and generally illustrates a very local context with a specific orientation, climate and so on, so that the generalization behaviour of machine learning models can be limited. In addition, model inference applied to EO may lead to costly decisions, such as infrastructure design or modification, or decisions about crop yield. Automatic decisions should therefore be justified or explained. However, in the era of deep learning-based models, opening those black boxes is a challenge in itself. In this talk, we will present a variety of activities related to EO and explainable AI at LISTIC Lab, with a focus on contributions along three complementary directions: black-box explanation, explanation by model design, and redescription mining. These contributions highlight the value of combining explanation methods in order to present more concise and focused explanations to human experts.

Salim Amoukou: Consistent Sufficient Explanations and Minimal Local Rules for explaining regression and classification models

To explain the decision of any model, we extend the notion of Probabilistic Sufficient Explanations (P-SE). For each instance, this approach selects the minimal subset of features that is sufficient to yield the same prediction with high probability, while removing other features. The crux of P-SE is to compute the conditional probability of maintaining the same prediction. Therefore, we introduce an accurate and fast estimator of this probability via Random Forests for any data (X, Y) and show its efficiency through a theoretical analysis of its consistency. As a consequence, we extend P-SE to regression problems. In addition, we deal with non-binary features, without learning the distribution of X or needing the model to make predictions. Finally, we introduce local rule-based explanations for regression/classification based on P-SE and compare our approaches with other explainable AI methods. These methods are publicly available as a Python package.
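
A brute-force sketch of the idea follows, with the conditional distribution crudely approximated by resampling the remaining features from training rows rather than the Random-Forest estimator described above; `model` and `X_train` are placeholders.

```python
# Smallest feature subset S such that keeping x_S and resampling the remaining
# features rarely changes the prediction (probabilistic sufficiency >= pi).
from itertools import combinations
import numpy as np

def p_same_prediction(model, x, S, X_train, n_samples=200, seed=0):
    rng = np.random.default_rng(seed)
    Z = X_train[rng.integers(0, len(X_train), size=n_samples)].copy()
    Z[:, list(S)] = x[list(S)]                      # keep the candidate subset fixed at x
    return (model(Z) == model(x[None, :])[0]).mean()

def probabilistic_sufficient_explanation(model, x, X_train, pi=0.9):
    d = len(x)
    for size in range(d + 1):                       # smallest subsets first
        for S in combinations(range(d), size):
            if p_same_prediction(model, x, S, X_train) >= pi:
                return S
    return tuple(range(d))
```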

Giorgio Visani: Inspecting Stability and Reliability of Explanations

Explanations of automated decision systems are extremely important in highly regulated domains. One of the most well-known solutions for obtaining model explanations is the LIME technique. In the talk, we will discuss the technique in general, with a special focus on its reliability. Ad-hoc Stability Indices will be presented as a tool to discern whether the explanations can be trusted. Building on the Stability Indices, the OptiLIME policy focuses on obtaining stable and reliable LIME explanations. Stability Indices and the OptiLIME policy represent an important step toward LIME compliance from a regulatory point of view.
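
A rough proxy for such a stability check (not the talk's indices): re-fit a LIME-like local surrogate several times around the same instance and measure how consistently the same features are selected. The toy model `f` and kernel choice are illustrative.

```python
# Repeated local surrogate fits; stability = mean pairwise Jaccard of top-k features.
import numpy as np
from itertools import combinations
from sklearn.linear_model import Ridge

def local_surrogate_top_features(f, x, k=3, n_samples=500, scale=0.3, seed=0):
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(scale=scale, size=(n_samples, len(x)))
    weights = np.exp(-np.linalg.norm(Z - x, axis=1) ** 2)        # locality kernel
    surrogate = Ridge().fit(Z, f(Z), sample_weight=weights)
    return set(np.argsort(np.abs(surrogate.coef_))[::-1][:k])

def stability_index(f, x, repetitions=10, **kwargs):
    tops = [local_surrogate_top_features(f, x, seed=s, **kwargs) for s in range(repetitions)]
    jaccards = [len(a & b) / len(a | b) for a, b in combinations(tops, 2)]
    return float(np.mean(jaccards))    # 1.0 = the same features are selected every time

f = lambda Z: Z[:, 0] * 2 - Z[:, 1] + 0.1 * Z[:, 2] ** 2
print(stability_index(f, np.zeros(5)))
```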

Hidde Fokkema: Attribution-based Explanations that Provide Recourse Cannot be Robust

When automated machine learning decisions lead to undesirable outcomes for users, recourse methods from explainable machine learning can inform users how to change the decisions. It is often argued that such explanations should be robust to small measurement errors in the users' features. We show that, unfortunately, this type of robustness is impossible to achieve for any method that also gives useful explanations whenever possible. We further discuss possible ways to work around our impossibility result, for instance by allowing the output to consist of sets with multiple attributions. Finally, we strengthen our impossibility result for the restricted case where users are only able to change a single attribute of x, by providing an exact characterization of the functions f to which impossibility applies.
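
A one-dimensional illustration of the tension (an assumed example, not the paper's construction) is sketched below for the minimal-cost recourse case.

```latex
% Take the favourable region to be two half-lines, i.e. the classifier
%   f(x) = 1 if |x| >= 1, and 0 otherwise,
% and say an attribution a(x) "provides recourse" at a rejected point x when
% following it reaches the favourable class, f(x + a(x)) = 1. The cheapest such
% recourse for small eps > 0 is
\[
a(\varepsilon) \approx 1 - \varepsilon \quad (\text{move right}),
\qquad
a(-\varepsilon) \approx -(1 - \varepsilon) \quad (\text{move left}),
\]
% so |a(eps) - a(-eps)| is close to 2 even though the two inputs differ only by
% 2*eps: an arbitrarily small measurement error near x = 0 flips the recommended
% direction, so an attribution that always provides minimal-cost recourse cannot
% be robust there.
```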

Mara Graziani: Reliable AI in healthcare: from model validation to hypothesis generation

Deep learning models in healthcare are yielding exceptional results for the characterization of cancer biomarkers in imaging and molecular data, at times even exceeding human performance. However, assessing the reliability of the predicted outcomes is still a challenge, with predictions lacking robustness to covariate shifts. Moreover, it is still unclear what informative patterns lead to the high performance gains given by deep models. In this talk, I discuss how model interpretability and reliable AI development can address the task of model validation. After briefly introducing the terminology related to reliable AI, I will provide examples of semi-transparent model designs that can be used to introduce desired inductive biases during model training. Finally, I will look at the future potential of interpretability developments for accelerating scientific discovery. In particular, I will discuss the potential of attention mechanisms for scientific hypothesis generation in histopathology.

Pietro Barbiero: Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off

Human trust in deep neural networks is currently an open problem as their decision process is opaque. Current methods such as Concept Bottleneck Models make the models more interpretable at the cost of decreasing accuracy (or vice versa). To address this issue, we propose Concept Embedding Models, a novel family of concept bottleneck models which goes beyond the current accuracy-vs-interpretability trade-off by learning interpretable high-dimensional concept representations.
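
A compact sketch of the concept-embedding idea as we read it (not the authors' implementation): each concept gets two candidate embeddings ("concept on" / "concept off"), a scorer predicts the concept probability, and the probability-weighted mixture is what the task predictor consumes, so task accuracy is not squeezed through a single scalar per concept.

```python
# Illustrative concept-embedding layer; dimensions and architecture are assumptions.
import torch
import torch.nn as nn

class ConceptEmbeddingLayer(nn.Module):
    def __init__(self, in_dim, n_concepts, emb_dim):
        super().__init__()
        self.pos = nn.Linear(in_dim, n_concepts * emb_dim)
        self.neg = nn.Linear(in_dim, n_concepts * emb_dim)
        self.score = nn.Linear(2 * emb_dim, 1)
        self.n_concepts, self.emb_dim = n_concepts, emb_dim

    def forward(self, h):
        B = h.shape[0]
        pos = self.pos(h).view(B, self.n_concepts, self.emb_dim)   # "concept active" embedding
        neg = self.neg(h).view(B, self.n_concepts, self.emb_dim)   # "concept inactive" embedding
        p = torch.sigmoid(self.score(torch.cat([pos, neg], dim=-1)))  # concept probabilities
        emb = p * pos + (1 - p) * neg                                  # mixed concept embeddings
        return emb.flatten(1), p.squeeze(-1)

layer = ConceptEmbeddingLayer(in_dim=64, n_concepts=5, emb_dim=16)
task_head = nn.Linear(5 * 16, 3)
emb, concept_probs = layer(torch.randn(8, 64))
logits = task_head(emb)   # supervise concept_probs with concept labels, logits with task labels
```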