ECAI 2024 wrap-up
My personal insights from ECAI 2024
This page serves as a personal reference to what I gathered from the European Conference on Artificial Intelligence (ECAI 2024).
A less structured (and incomplete) version of this page is available as a thread on Mastodon.
Tutorials
Very cool tutorials I wanted to attend:
Play and Persuade: An Interactive Exploration of Argumentation
Elfia Bezou Vrakatseli and Madeleine Waller
An excellent introduction to the theory of formal argumentation. There was a good balance between the theoretical notions (arguments as graph representations, attacking and defending, formal dialog and argumentation structure) and actual use cases. Although I was not an expert in the field, this tutorial gave me the right amount of background to start digging into it.
Explainable Constraint Solving: A Hands-on tutorial
Ignace Bleukx, Dimosthenis C. Tsouros and Tias Guns
I was not able to attend, unfortunately, but I believe there is a lot of insight to be gained from explainable AI applied to symbolic AI. Link to the tutorial.
The Symbiosis of Neural Networks and Differential Equations: From Physics-Informed Neural Networks to Neural ODEs
Cecília Coelho and Luís Ferrás
I have been interested in neural ODEs for quite some time. The latent space of a neural network can be seen through the lens of numerical analysis, which opens it up to the rich set of techniques and knowledge of that field. The implications for xAI and robustness could be huge. Link to the tutorial.
Papers
Interpretable image classification through an argumentative dialog between encoders
Dao Thauvin, Stéphane Herbin, Wassila Ouerdane, Céline Hudelot
Super cool paper. They have two encoders, $C$ (concepts) and $P$ (prototypes). The classification explanation takes the form of a dialog between $C$ and $P$, where they mutually assign scores to a classification $k$ based on the following rules: $f(x)=k$ should result in $x$ sharing the prototypes of $k$, and $f(x)=k$ should result in $x$ sharing the same concepts as $k$. Overall, one of my favourite papers at the conference. It bridges conceptual/semantic elements and logical reasoning.
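My rough reading of the scoring idea, as a toy sketch: the vectors, the dot products and the min-aggregation below are my own illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def dialog_score(concepts_x, concepts_k, proto_sim_x, proto_sim_k):
    """Toy illustration: both encoders must agree before endorsing class k.

    concepts_x / proto_sim_x: concept activations and prototype similarities of x;
    concepts_k / proto_sim_k: the typical profile expected for class k.
    All four are plain numpy vectors here, an illustrative simplification.
    """
    # C's argument: x should exhibit the concepts associated with class k
    concept_agreement = float(np.dot(concepts_x, concepts_k))
    # P's argument: x should be close to the prototypes of class k
    prototype_agreement = float(np.dot(proto_sim_x, proto_sim_k))
    # The "dialog" only endorses k when neither encoder can attack the claim
    return min(concept_agreement, prototype_agreement)
```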
Unveiling the Power of Sparse Neural Networks for Feature Selection
Atashgahi et al.
Learning logic programs by combining programs
Cropper et al.
Adversarial attacks for explanation robustness of Rationalization Models
Zhang et al.
Answerable Sociotechnical Systems
Kekulluoglu et al.
On the computation of contrastive explanations for Boosted Regression trees
Audemard et al.
Defending our privacy with backdoors
Hintersdorf et al.
Using a loss on a diffusion model to remove specific data. Works well on textual data, not so much on visual data (because triggering the backdoor is more difficult on images).
Pixel-wise reclassification with prototypes for enhancing weakly supervised semantic segmentation
Bridging formal methods and machine learning with model checking and global optimisation
Bensalem et al.
EMILY: Extracting sparse Model from ImpLicit dYnamics
A neuro-symbolic approach for faceted search in digital libraries
Automated Synthesis of Certified Neural Networks
Zavaretti et al.
Target-driven Attack for LLMs
Zhang et al.
Verification of Geometric Robustness of Neural Networks via Piecewise Linear Approximation and Lipschitz Optimisation
Backward Compatibility in Attributive Explanation and Enhanced Model Training methods
Matsuno et al.
This paper presents a metric and a loss to ensure that an updated model keeps a comparable behaviour with respect to attribution methods.
NeurCAM: Interpretable Neural Clustering via Additive Models
IndMask: Inductive Explanation for Multivariate Time Series Black-Box Models
Locally-Minimal Probabilistic Explanations
Izza et al.
A Domain Specific Language for Describing Diverse Types of Dialog
S. Wells and C. A. Reed
Something to read about how to design a language for describing a dialog protocol.
Scientific and technical insights
On the limitations of feature attribution
Compared to the field of attribution techniques applied to deep learning (gradient-based or perturbation-based approaches), the field of explainability for symbolic AI provides a broader range of techniques and theory. I learned about argumentation theory: how to structure and analyze how arguments are built, and how they can be organized into an argumentative discourse.
During my tutorial, I realized that most attribution techniques for deep learning leave a lot of mental labour to their users. A typical example would be having an attribution map $M_f(x)$ for a neural network $f$ with $f(x)=y$ that does not correspond to human expectations. There is first the cognitive and engineering work of assessing the performance of $f$ under other criteria: accuracy and recall on a test set, dataset quality, robustness against perturbations, anomaly detection… Then, provided $f$ is acceptable under those other criteria, one would still need to analyze some examples of $M_f(x)$ on different samples. In that sense, one may gain insights on the actual behaviour of $f$ on given samples (as in “$M_f(x)$ consistently outlines a particular speck of background on the image, so this data contains a lot of information”). Applying perturbation-based analysis (like I’m doing in my ECAI 2024 tutorial) may tell whether a prototype is capturing information about variations in noise, hue or other perturbations. But this still needs to be done manually, requires expertise and insight and, more importantly, offers no rigorous guarantee on the result.
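To make the manual nature of this concrete, here is a minimal sketch of the kind of perturbation check I have in mind (numpy only; `attribution_fn` is a hypothetical stand-in for whatever attribution method is used, not code from my tutorial):

```python
import numpy as np

def perturbation_sensitivity(attribution_fn, x, sigmas=(0.01, 0.05, 0.1), seed=0):
    """Measure how much an attribution map drifts when the input is perturbed with noise.

    `attribution_fn` stands for any attribution method M_f: it takes an input array
    and returns an importance map of the same shape. A large drift suggests the map
    reflects noise or low-level statistics rather than the semantic content we would
    like it to explain.
    """
    rng = np.random.default_rng(seed)
    base = attribution_fn(x)
    drifts = []
    for sigma in sigmas:
        noisy = x + rng.normal(0.0, sigma, size=x.shape)
        drift = float(np.abs(attribution_fn(noisy) - base).mean())
        drifts.append((sigma, drift))
    return drifts
```

Even with such a helper, interpreting the drift values remains a manual, expertise-heavy step, which is precisely the point above.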
Compare this, for instance, with computational argumentation theory. First, attribution is but one of the numerous kinds of arguments that exist in that theory (contrastive, drawing from other examples, …).
Second, while an attribution is provided as is, arguments can be structured in a way that they attack each other. A very basic example: if two different explanations $M_f(x)$ and $M'_f(x)$ outline the same feature, there is a good chance that an agreement can be reached on the importance of said features 1. This agreement can be presented to the user as a starting point for an argument on the content and influence of said features. Then, automated perturbation analysis can be conducted, and its results can inform the global argument.
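A toy illustration of that agreement check, under my own assumptions (comparing the top-$k$ features of two attribution maps; this is not a method presented at the conference):

```python
import numpy as np

def topk_agreement(map_a, map_b, k=100):
    """Fraction of features appearing in the top-k of both attribution maps.

    `map_a` and `map_b` play the role of M_f(x) and M'_f(x) over the same input
    (any shape). A high overlap is only a *starting point* for an argument about
    the outlined features, not a guarantee of their relevance.
    """
    a, b = np.ravel(map_a), np.ravel(map_b)
    top_a = set(np.argsort(a)[-k:].tolist())
    top_b = set(np.argsort(b)[-k:].tolist())
    return len(top_a & top_b) / k
```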
Such a discourse can have formal guarantees, such as termination. It can also integrate notions of “inconclusiveness” (something that neural networks in general and LLMs in particular are incapable of doing; to the point that the latter can be coined “Mansplaining as a Service”).
There is a lot to be said on the brittleness of attribution techniques and their potential lack of fidelity to the model (again, see my tutorial, but also plenty of existing work like Sanity Checks for Saliency Maps). Hopefully, there is a path to walk where formal methods increase our confidence in the attributions.
Integration of perceptual inputs into symbolic reasoning
I believe that attribution at the pixel level is not the best way to approach explanation, especially when compared with what symbolic AI offers.
A member of my PhD defense committee, Gilles Dowek, once made an interesting short video (in French) on the parallel between explanations and proofs of propositions. A proof is more explanatory than another if it presents more generalization capabilities (e.g. it allows a deeper understanding of a given phenomenon by exposing more of its underlying governing laws).
The main issue I see with pixel-level reasoning is that it is difficult to reason on pixels alone. I believe that humans don’t conceive the world at the pixel level 2. I recognize a cat because of a combination of its sound, its fur, its tail and its ears. I am also capable of recognizing a cat even if it lacks some of those features (a blind or furless cat remains a cat). But a change at the pixel level vastly changes the output of $M_f(x)$.
To be grasped by a human, I believe that a pixel-level explanation must convey ideas at a higher perceptual level. It must also be embedded in some kind of reasoning framework (I believe the wording for that is “neuro-symbolic reasoning”).
Reproducibility and methodology
I had several discussions with people coming from different backgrounds, from computer science to education, psychology and political science. We shared the overall opinion that a good amount of the presented papers would be difficult to reproduce in the near future. We shared personal experiences of trying to reproduce papers and, more often than not, failing to do so (drawing from what we did at CEA with the CaBRNet library). One of our PhD students sent me a small post that further illustrates this. Large Language Models are only making this worse: any change in the OpenAI API could result in the irreproducibility of results. Basing research on proprietary, non-auditable systems may be profitable in the short term (for funding or career) but, in my opinion, hurts research in the long run.
Our papers are usually quite short on the methodology section. We assume that simply linking to a GitHub repo and outlining the architecture is enough. Yet there exist financial incentives for software forges to delete unused repositories (as GitLab almost did), and the magic words “cross-validation” are used to sweep a lot of things under the carpet. Increasing the quality of those sections (and rewarding authors who do indeed fill them) would be a good start. Training in software reproducibility should be part of all PhD training programs. This may range from learning how to write a proper Dockerfile or Nix derivation, to using multiple random seeds and providing a CPU-friendly version of experiments (not everyone has a GPU farm).
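As an admittedly simplistic illustration of the “multiple seeds, CPU-friendly mode” point, here is a sketch of the kind of entry point I would like papers to ship (`run_experiment` is a hypothetical callable, not an existing API):

```python
import statistics

def report(run_experiment, seeds=(0, 1, 2, 3, 4), cpu_friendly=False):
    """Run the same (hypothetical) experiment over several seeds and report mean/std.

    `run_experiment(seed, cpu_friendly)` stands for whatever training + evaluation
    entry point a paper provides; `cpu_friendly=True` would select a reduced config
    (smaller model, fewer epochs, subsampled data) so that the pipeline can at least
    be exercised without a GPU farm.
    """
    scores = [run_experiment(seed, cpu_friendly=cpu_friendly) for seed in seeds]
    mean, std = statistics.mean(scores), statistics.stdev(scores)
    print(f"metric over {len(seeds)} seeds: {mean:.3f} +/- {std:.3f}")
```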
Panel session: AI Act
Disclaimer: I am neither trained in law nor a particularly astute political commentator. These opinions are, again, only personal.
The goal of that panel was to present the upcoming AI Act and to convey the underlying vision(s).
The participants of that panel were Kilian Gross (Head of Unit, AI Office, European Commission), Clara Neppel (Head of IEEE Europe) and Beatriz Alvargonzalez Largo (Economic Advisor for the European Commission).
Kilian presented the overall AI Act classification of AI systems, which follows a risk-based approach. Applications considered to pose an “unacceptable risk” are prohibited: those include online biometrics, untargeted data scraping on the internet, social scoring and emotion recognition. Then, the “high-risk” category basically covers most of the AI use cases I know of, and apparently also includes biometrics.
There will be “regulatory sandboxes”, which I assume is a way not to directly kill the EU market with laws that may be too constraining. One may wonder how such regulatory sandboxes will be created and monitored.
Several of us in the audience were intrigued by the definition of the “General Purpose AI Models” (GPAI). Only vaguely defined by a metric that made little sense ($10^{25}$ FLOPs) or by designation from a special AI office, they clearly target existing and future LLMs.
Overall, my impression was that there were a lot of contradictions. There’s a weird comparison with how the AI Act could draw inspiration from nuclear regulation. The big difference (at least in France) is that, for nuclear power, regulation and technological knowledge are developed together: a regulatory agency (ASN) and a technical one (IRSN) work closely together. Here, the AI Act aims to regulate things developed outside of its immediate reach (LLMs are mostly funded and developed by American or Chinese corporations). There is some wishful thinking in the envisioned model for the EU AI industry: world-class expertise in AI compliance assessment. I am not sure how this will turn out.
Misc.
The conference venue was a bit remote from the city center, so the organizers set up a shuttle from the central city. The workshops and tutorials were located on the northern campus, an easy stroll from most central hotels.
Some effort was made by the organizers on providing vegan and vegetarian food, with clear labels and locations for each meal. However, we experienced the classic issue where people with no particular dietary requirement also took the vegetarian options, resulting in some shortages. It is not an easy problem to solve, as we don’t want to discourage people from trying a vegetarian diet (and professional-quality food at a mass event is one of the best settings to do so). Maybe ordering a bit more of the special-diet meals and reserving part of them for people who are actually on those diets could be a nice first step.
As for all my travels (both personal and professional), I went to Santiago by train. It took me two days of travel in total, with a stop in Barcelona. I only encountered a 3h delay during the return trip, which resulted in a missed connection in Madrid; fortunately, there are still plenty of high-speed trains, so I had no trouble arriving.
1. The opposite is not true: two diverging explanations do not tell you that the outlined features are irrelevant. In Fairwashing Explanations with Off-Manifold Detergent, the authors invoke a geometrical argument on how the vectors driving explanations are orthogonal to those responsible for the prediction. Nevertheless, it is difficult to conclude anything from disagreeing attributions alone. ↩︎
2. Cognition theory certainly has a lot to say on that matter: I am not an expert, and thus those claims are only informed by personal experience (and my deep dissatisfaction with the field of attribution methods). ↩︎