A journey in the land of explainable AI (beautiful landscapes, horrendous pits and everywhere in between)

Detailed outline

Theoretical introduction to explainable AI (45 min)

Goal: Attendees will discover the framework of feature attribution, the theoretical grounding of post-hoc approaches. They will also discover prototype-based neural network architectures and their learning scheme. They will learn what makes an effective explanation through the lens of the social sciences.

  1. What constitutes a good explanation in the social sciences
  2. Feature attribution as the goal of the explanation: exhibiting the features relevant to the current decision (see the sketch after this outline)
  3. Post-hoc explanations: existing approaches on CUB-200.
  • gradient-based: saliency maps and their variants, SmoothGrad, Integrated Gradients
  • perturbation-based: LIME and SHAP
  • limitations
    • sanity checks using randomization
    • manipulation through adversarial attacks and geometry
  4. By-design explanations: focus on ProtoPNet and ProtoTree
  • training scheme
  • important hyperparameters: tree depth, pruning
  • limitations
    • trade-off between accuracy and explanation quality
    • discrepancies between the expected behaviour and the actual prototypes
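
To make the feature-attribution framing of item 2 concrete, here is a minimal sketch of a vanilla saliency map in plain PyTorch. The pretrained ResNet-18 and the random input are placeholder assumptions, not the tutorial's actual model or data.

```python
import torch
from torchvision import models

# Minimal gradient-based feature attribution (a vanilla saliency map).
# Placeholder assumptions: a pretrained ResNet-18 stands in for the
# tutorial model, and `image` for a normalized (1, 3, H, W) input.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
image = torch.randn(1, 3, 224, 224, requires_grad=True)

logits = model(image)
target_class = logits.argmax(dim=1).item()

# Back-propagate the score of the predicted class down to the pixels.
logits[0, target_class].backward()

# The attribution map is the per-pixel gradient magnitude,
# reduced over the colour channels.
saliency = image.grad.abs().max(dim=1).values.squeeze(0)  # (H, W)
```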

Post-hoc explanations (1h15)

Installation and setup (20 minutes)

Goal: attendees can run the tutorial code on their machine.

All the material will be presented as a Jupyter notebook, along with an installation script that downloads the necessary dependencies, including the dataset and a pretrained model.
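
As a rough illustration, the installation script could end with a check of this kind, so attendees can confirm their environment before the hands-on part; the package list is an assumption based on the tools named in this proposal.

```python
# Quick environment check after running the installation script.
# A minimal sketch; the package list is indicative, not exhaustive.
import importlib

required = {
    "torch": "PyTorch, to run the models",
    "torchvision": "pretrained backbones and image transforms",
    "captum": "post-hoc explanation methods (part 1)",
    "matplotlib": "displaying attribution maps",
}

for package, purpose in required.items():
    try:
        module = importlib.import_module(package)
        version = getattr(module, "__version__", "?")
        print(f"OK       {package} {version} ({purpose})")
    except ImportError:
        print(f"MISSING  {package}: try `pip install {package}`")
```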

Explaining the predictions (20 minutes)

Goal: attendees can see the methods in action and understand their computational requirements.

The attendees will apply the following approaches:

  • basic gradient back-propagation (vanilla saliency maps)
  • the SmoothGrad enhancement
  • Integrated Gradients

They will apply these methods to several images from the dataset, tune the parameters of each approach, and recover the corresponding feature attribution maps.
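
A sketch of what this hands-on part might look like with Captum; the ResNet-18 model and the random input are placeholder assumptions, as above.

```python
import torch
from captum.attr import IntegratedGradients, NoiseTunnel, Saliency
from torchvision import models

# Placeholder model and input standing in for the tutorial material.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
image = torch.randn(1, 3, 224, 224)  # stand-in for a CUB-200 image
label = model(image).argmax(dim=1)

# 1. Vanilla gradient back-propagation.
grads = Saliency(model).attribute(image, target=label)

# 2. SmoothGrad: average gradients over noisy copies of the input.
smooth = NoiseTunnel(Saliency(model)).attribute(
    image, nt_type="smoothgrad", nt_samples=25, stdevs=0.15, target=label
)

# 3. Integrated Gradients: accumulate gradients along a path
#    from a black-image baseline to the actual input.
ig = IntegratedGradients(model).attribute(
    image, baselines=image * 0, target=label, n_steps=50
)
```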

Limitations (25 minutes)

Goal: attendees get to see the limitations of the methods: sensitivity to modifications and limited usefulness for debugging. They are left wondering: what to do with an explanation that is simply off?

Attendees will implement basic sanity checks on post-hoc methods, such as weight randomization and simple adversarial attacks, as well as advanced explanation-manipulation techniques based on existing work. They will learn that XAI methods can be unreliable in certain settings.
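
As an illustration, the snippet below sketches the model-randomization sanity check of Adebayo et al. (2018): if an attribution map barely changes after the classifier head is re-initialized, the method cannot be faithful to what the model learned. The model and input are placeholder assumptions, as before.

```python
import copy
import torch
from captum.attr import Saliency
from torchvision import models

# Placeholder model and input standing in for the tutorial material.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
image = torch.randn(1, 3, 224, 224)
label = model(image).argmax(dim=1)

original = Saliency(model).attribute(image, target=label)

# Randomize the final fully connected layer and recompute the map.
randomized = copy.deepcopy(model)
torch.nn.init.normal_(randomized.fc.weight, std=0.01)
torch.nn.init.zeros_(randomized.fc.bias)
perturbed = Saliency(randomized).attribute(image, target=label)

# A high similarity here is a red flag for the attribution method.
cos = torch.nn.functional.cosine_similarity(
    original.flatten(), perturbed.flatten(), dim=0
)
print(f"cosine similarity after randomization: {cos.item():.3f}")
```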

Explanations by design (1h15)

ProtoPNet and ProtoTree

Goal: attendees will manipulate the ProtoPNet and ProtoTree architectures. They will understand the trade-off between accuracy and amenable explanations. They will discover how to produce a global decision process from the architecture.

Using pre-trained self-explainable networks, attendees will:

  • generate global explanations for ProtoPNet and ProtoTree architectures;
  • study the influence of the pruning step in the tractability of the global explanation;
  • generate local explanations for individual images and understand the importance of choosing faithful post-hoc XAI methods for the trustworthiness of the model (a sketch of the underlying prototype computation follows this list).
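
Since CaBRNet is not yet publicly released, here is a library-agnostic sketch of the computation at the heart of ProtoPNet: every learned prototype is compared with each spatial location of the backbone's feature map, and the closest match becomes that prototype's similarity score. All shapes and names are illustrative assumptions, not CaBRNet's actual API.

```python
import torch

# Illustrative shapes: B images, D-dimensional features on an HxW grid,
# and P learned prototypes of the same depth D.
features = torch.randn(1, 512, 7, 7)   # backbone output (B, D, H, W)
prototypes = torch.randn(200, 512)     # P prototypes of depth D

# Squared L2 distance between each prototype and each spatial location.
dists = torch.cdist(
    features.flatten(2).transpose(1, 2),  # (B, H*W, D)
    prototypes.unsqueeze(0),              # (1, P, D)
) ** 2                                    # (B, H*W, P)

# ProtoPNet keeps the best match per prototype (a max-pooling over
# spatial locations) and turns the distance into a similarity score.
min_dists = dists.min(dim=1).values       # (B, P)
similarity = torch.log((min_dists + 1) / (min_dists + 1e-4))
```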

Technical details of the tutorial

All the material will be made publicly available. This tutorial is based on an M2 (Master's-level) course the presenter gave in February 2024.

The first part of the tutorial will use the Captum library, which provides an easy interface to the various post-hoc explanation methods. The second part will use a soon-to-be-released, open-source library developed at CEA LIST: Case-Based Reasoning Networks (CaBRNet). CaBRNet provides a clean interface to various prototype-based architectures.
