CAV 2025 wrap-up

What I got from CAV 2025

I attended the Computer Aided Verification (CAV) Conference in Zagreb, Croatia.

My employer, CEA-List, sponsored the International Competition on the Verification of Neural Networks, in which the PyRAT analyzer developed by colleagues competed. It was also an opportunity for me to catch up with old colleagues and foster community. This post is a personal summary of the conference.

Papers

Here is a list of the papers I gathered, along with a short comment on why each deserves a spot somewhere in your reading pile.

Dynamic Algorithm Termination for Branch-and-Bound-based Neural Network Verification

A cool contribution to a currently underexplored area: the evaluation of neural network provers. The authors propose handcrafted features that provide something akin to a hardness model, and use it to cut provers off early when an instance is deemed infeasible.
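The idea can be sketched as a lightweight hardness model watching run-time features of the branch-and-bound search and cutting the prover early when the instance looks hopeless. The feature names and weights below are purely illustrative, not the paper's actual model:

```python
# Illustrative sketch of dynamic termination for branch-and-bound verification.
# A hardness model scores run-time features; above a threshold, we cut early.
# Features and weights are made up; a real model would be fit on past runs.

def hardness_score(features):
    """Combine handcrafted run-time features into a score in (0, 1)."""
    weights = {
        "open_subproblems_growth": 0.5,   # branch queue growing => harder
        "bound_improvement_rate": -0.4,   # slow bound tightening => harder
        "fraction_unstable_relus": 0.3,   # many unstable ReLUs => larger tree
    }
    z = sum(weights[k] * features[k] for k in weights)
    return 1.0 / (1.0 + 2.718281828 ** (-z))  # squash to a probability-like score

def should_terminate_early(features, threshold=0.8):
    """Cut the prover if the predicted hardness exceeds the threshold."""
    return hardness_score(features) > threshold

easy = {"open_subproblems_growth": -2.0, "bound_improvement_rate": 3.0,
        "fraction_unstable_relus": 0.1}
hard = {"open_subproblems_growth": 4.0, "bound_improvement_rate": -3.0,
        "fraction_unstable_relus": 0.9}
print(should_terminate_early(easy), should_terminate_early(hard))  # False True
```

The value of such a model is mostly economic: on large benchmark suites, reclaiming the timeout budget of doomed instances can pay for many extra solved ones.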

In general, there are some very interesting things being done by the chair of ML quality at RWTH Aachen University.

Verification of neural networks against convolutional perturbations via parameterised kernels

This work follows the line of the DeepG algorithm and the idea of encoding semantic transformations directly within the neural network to be verified.

https://github.com/sen-uni-kn/semantic-robustness-verification

Bridging neural ODE and ResNet: A formal error bound for safety verification

The authors provide a formal error bound between neural ODEs and ResNets, yielding tight robustness results for neural ODEs. One may then wonder: if the ODE is used as a controller within a closed-loop system, would that allow tighter checks than those obtained with a simpler MLP?

Revisiting differential verification: Equivalence verification with confidence

Authors propose interesting contributions:

  • a linear approximation of the softmax for class-ranking properties
  • a zonotope domain for $\delta$-equivalence

These approaches outline the possibility of formalizing and checking hyperproperties of neural networks by modifying only the network itself, and propose a sound reduction to VNN-Lib queries. Definitely suitable for the CAISAR platform, which combines a rich specification language with automated compilation to provers.
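The core idea of differential verification can be sketched in a few lines: instead of bounding $f$ and $g$ separately, analyze the *difference* network $d(x) = f(x) - g(x)$ so the shared input stays correlated. The sketch below uses plain interval arithmetic as a coarse stand-in for the paper's zonotope domain, on toy single-layer affine networks with made-up numbers:

```python
# Minimal sketch of delta-equivalence checking via interval arithmetic,
# a coarse stand-in for the zonotope domain used in the paper.
# Networks are single affine layers over a scalar input; numbers are made up.

def affine_interval(lo, hi, w, b):
    """Propagate the input interval [lo, hi] through y_i = w_i * x + b_i."""
    out_lo = [min(wi * lo, wi * hi) + bi for wi, bi in zip(w, b)]
    out_hi = [max(wi * lo, wi * hi) + bi for wi, bi in zip(w, b)]
    return out_lo, out_hi

def delta_equivalent(f, g, lo, hi, delta):
    """Soundly check max_x |f(x) - g(x)| <= delta over [lo, hi]."""
    (wf, bf), (wg, bg) = f, g
    # Key idea of differential verification: analyze the *difference* network
    # d(x) = f(x) - g(x) directly, so the shared input is not decorrelated.
    w_diff = [a - b for a, b in zip(wf, wg)]
    b_diff = [a - b for a, b in zip(bf, bg)]
    d_lo, d_hi = affine_interval(lo, hi, w_diff, b_diff)
    return all(-delta <= l and h <= delta for l, h in zip(d_lo, d_hi))

f = ([1.0, -0.5], [0.0, 0.2])      # weights and biases of a reference network
g = ([1.01, -0.49], [0.0, 0.19])   # slightly perturbed (e.g. quantised) copy
print(delta_equivalent(f, g, -1.0, 1.0, delta=0.1))  # True
```

Bounding $f$ and $g$ independently and subtracting the enclosures would lose this correlation and fail to prove the same $\delta$; the zonotope domain of the paper preserves it through nonlinear layers as well.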

Assured automatic programming via large language models

I am not an LLM enthusiast. The vision proposed by this talk, however, is interesting: the authors propose a system that "co-evolves" specification, code, and tests together, with the help of formal verification.

Neural network verification for gliding drone control: A case study

The authors attempt to verify a trajectory prediction problem for a nature-inspired gliding drone. They meet numerous limitations, each of which counts as a future prospect for the field.

From $\epsilon$-Balls to Semantics: Towards Verifiable NLP Pipelines

The authors explain the actual difficulties of local robustness and NLP verification, namely the embedding gap between continuous and discrete spaces, and propose "semantic-aware perturbations".

They argue that probabilistic guarantees are needed as well, as deterministic formal verification simply does not catch all cases.

Abstraction-Based Proof Production in Formal Verification of Neural Networks

This work combines CORA and Marabou to propose proof abstractions that make Marabou's proof certificates more tractable.

Robustness Margin: A new measure for the robustness of neural networks

(not yet published)

The authors leverage robustness distributions and the notion of "minimum acceptable quality" from statistics to compare neural networks with one another. Again in the line of interesting uses of robustness without formal verification.

GRENA: GPU-aided Abstract Refinement for Neural Network Verification

The authors propose to transform the classical MILP encoding of neural network verification into an unconstrained optimization problem.

Evaluated on CIFAR-10 with a 600 s timeout per image (still super long compared to $\alpha$-$\beta$-CROWN).
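The general trick behind such reformulations can be sketched with a penalty method: fold the constraints into the objective as penalty terms, then run plain (GPU-friendly) gradient descent. The toy problem and penalty weight below are illustrative, not GRENA's actual encoding:

```python
# Sketch of turning a constrained problem into an unconstrained surrogate via
# penalties, the kind of objective amenable to GPU gradient descent.
# Toy problem: minimize (x - 3)^2 subject to x <= 1 (numbers are made up).

def objective(x, rho=100.0):
    f = (x - 3.0) ** 2
    violation = max(0.0, x - 1.0)      # hinge penalty for the constraint x <= 1
    return f + rho * violation ** 2    # unconstrained surrogate

def minimize(x=0.0, lr=1e-3, steps=20000):
    for _ in range(steps):
        eps = 1e-6                      # finite-difference gradient for brevity
        grad = (objective(x + eps) - objective(x - eps)) / (2 * eps)
        x -= lr * grad
    return x

x_star = minimize()
print(round(x_star, 2))  # close to the constrained optimum x = 1
```

With a finite penalty weight the minimizer sits slightly past the constraint boundary (here at roughly 1.02); sound verification needs the bound to be handled rigorously on top of this, which is where the abstract-refinement part of the paper comes in.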

Floating-Point Neural Networks Are Provably Robust Universal Approximators

Policy Verification in Stochastic Dynamical Systems Using Logarithmic Neural Certificates

Tools

VERONA

VERONA is a yet-to-be-released tool that provides bindings to various neural network provers and allows benchmarks to be defined programmatically. Interestingly, it is not limited to local robustness. The currently supported input language is VNN-Lib.

SoftMatcha

SoftMatcha performs fuzzy/soft pattern matching on large corpora; that's fun.

CTRAIN

CTRAIN is a robust training library built on top of auto_LiRPA, with automated hyperparameter tuning using SMAC3.

Ultimate Program Analyzer

The Ultimate Program Analyzer is a rich program analysis platform (much like Frama-C) relying on model-checking.

Neural Network Verify.jl

A Julia toolbox to perform neural network verification.

StarV

StarV proposes the usual box-based approaches for the verification of neural networks.

NeuralSAT

NeuralSAT proposes a DPLL(T) framework for the verification of neural networks. While promising, I personally have doubts about the usefulness of DPLL(T) compared to $\alpha$-$\beta$-CROWN, as NeuralSAT uses the same backend. Nevertheless, such a formalism paves the way for including other SAT- and SMT-inspired ideas, in particular looking towards MCSAT.

Personal observations on the community

A few years ago, I started developing a frustration with my research field. I felt we were focusing on problems that were too narrow, while reproducing some of the personal issues I have with the practice of formal verification.

Attending CAV and the SAIV workshop partially alleviated those frustrations.

The field of formal verification of neural networks is definitely consolidating.

Some research groups are now to be considered important in the field. I am thinking in particular of the people at the Hebrew University of Jerusalem (Guy Katz and others), who first developed the Reluplex algorithm and are now gearing towards proof production. RWTH Aachen, whose submissions constituted a significant share of the presentations at the workshop, is another contender. Colleagues from Southampton/Edinburgh are significantly contributing to pushing the boundaries of specification languages.

Research directions

Beyond local robustness

The field is now moving away from easy-to-formalize, low-level local robustness properties. We saw numerous contributions aiming towards so-called "global robustness". The formalism of hyperproperties, borrowed from concurrent program verification, provides a fresh perspective on the formalization and encoding of interesting problems.

Closed-loop systems

Control systems have long been the subject of theoretical and practical attention in computing. Recently, the idea of using neural networks to replace parts of a closed-loop system (usually the controller) has started to raise interesting questions about their certification. One can then leverage the rich corpus of control theory (Lyapunov functions, martingales) to devise desirable properties for neural networks to meet in this context. This requires specific abstractions for neural network provers to handle such models, but it seems like a really interesting problem to tackle.
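The shape of such a property is easy to state: a candidate Lyapunov certificate $V$ must be positive and strictly decrease along the closed-loop dynamics. The sketch below only *falsifies* this condition by sampling on toy stand-in dynamics, controller, and $V$ (all made up); a verifier would prove it over the whole region:

```python
# Sketch of the Lyapunov-style condition one would ask a prover to certify for
# a controller pi in the closed loop x' = step(x, pi(x)): V must be positive
# and strictly decrease at every non-equilibrium state of the region.
# Dynamics, controller, and V are toy stand-ins; we only sample, not prove.

def step(x, u):
    return 0.9 * x + 0.1 * u           # toy stable linear plant

def controller(x):
    return -0.5 * x                    # stands in for a trained neural policy

def V(x):
    return x * x                       # quadratic Lyapunov candidate

def lyapunov_violations(region, n=1000):
    lo, hi = region
    bad = []
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        if x != 0.0:                   # skip the equilibrium itself
            nxt = step(x, controller(x))
            # Positivity and strict-decrease conditions.
            if V(x) <= 0.0 or V(nxt) >= V(x):
                bad.append(x)
    return bad

print(len(lyapunov_violations((-1.0, 1.0))))  # 0: no sampled counterexample
```

Replacing the sampling loop by a sound enclosure of `V(step(x, controller(x))) - V(x)` over the region is exactly where the specific neural network abstractions come in.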

Specification languages

The most common properties in neural network verification consist of mapping an input convex set to an output convex set. While this allows for efficient verification, the formalism prevents relational properties on the inputs and hinders the exploration of most semantic properties. The people behind Vehicle (and our own work) propose nice new specification languages that compile verification queries down to backends. I also noted the growing interest in temporal properties.
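For reference, this "convex set in, constraint out" shape is exactly what a typical VNN-Lib query expresses. Here is a hypothetical sketch (variable naming follows the VNN-Lib convention; the bounds and indices are made up):

```
; Hypothetical VNN-Lib query: an input box mapped to an output constraint.
(declare-const X_0 Real)
(declare-const X_1 Real)
(declare-const Y_0 Real)
(declare-const Y_1 Real)

; Input convex set: a box around a reference point.
(assert (>= X_0 0.4)) (assert (<= X_0 0.6))
(assert (>= X_1 0.1)) (assert (<= X_1 0.3))

; Negated property: the prover searches for a point where class 0 does NOT
; dominate class 1; an UNSAT answer means the property holds.
(assert (<= Y_0 Y_1))
```

Everything in the query refers to a single execution of a single network, which is precisely why relational and hyperproperty-style specifications need either richer languages or network-level encodings.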

The keynote from Corina Pasareanu, "Through the Looking Glass: Semantic Analysis of Neural Networks", showed in particular how one can devise a specification language for semantic properties based on concept extraction.

Integration with machine learning pipelines and usability

A crucial challenge for the field is to become palatable to machine learning practitioners. LLMs are eating software and reducing software quality globally. A potential avenue for formal verification is to be integrated tightly within AI development cycles, be it to constrain code generation or to prove meaningful properties within a traditional software development pipeline.

Our tendency for theory-rich and complex modelling should definitely not go away, as this reliance on sound mathematical principles has provided some very strong industrial successes (and, I believe, has some underlying beauty). But complex formalisms can be reduced to simple uses. Type theory is an example: any programming language with a type system benefits from decades of research, while being so tightly integrated within the language that using it has become natural for programmers.

For formal methods to be applicable to AI, they need to be integrated within machine learning practice, be it to provide sound explanations or sound translations from formal languages to informal ones (and back).

Researcher on Trustworthy Artificial Intelligence