3rd International Verification of Neural Networks Competition (VNN-COMP'22)

News / Updates


General Information

The 2022 Verification of Neural Networks Competition (VNN-COMP'22), to be held July 31-August 1, 2022 in Haifa, Israel, with the 5th Workshop on Formal Methods for ML-Enabled Autonomous Systems (FoMLAS) at CAV 2022, aims to bring together researchers interested in methods and tools providing guarantees about the behaviors of neural networks and systems built from them.

Introduction and Background

Methods based on machine learning are increasingly being deployed for a wide range of problems, including recommender systems, machine vision, autonomous driving, and beyond. While machine learning has made significant contributions to such applications, concerns remain about the lack of methods and tools to provide formal guarantees about the behaviors of the resulting systems.

In particular, for data-driven methods to be usable in safety-critical applications, including autonomous systems, robotics, cybersecurity, and cyber-physical systems, it is essential that the behaviors generated by neural networks are well understood and can be predicted at design time. In the case of systems that learn at run-time, it is desirable that any change to the underlying system respects a given safety envelope.

While the literature on verification of traditionally designed systems is broad and successful, until recently there have been few results and efforts on verifying neural networks. The competition intends to bring together researchers working on techniques for the verification of neural networks. We anticipate an organization and process similar to the International Competition on Verifying Continuous and Hybrid Systems (ARCH-COMP), where problems are categorized by system expressiveness and problem formulation. In the context of VNN, this could, for instance, mean different problem categories depending on whether a network is feedforward or recurrent (an RNN). Within these broad categorizations, further categorization may exist based on which layers a verification approach supports (e.g., many approaches allow ReLUs, but fewer allow nonlinear activations such as tanh). If you are interested, please contact the organizers. We anticipate analysis of existing benchmarks and challenge problems, such as ACAS-Xu, MNIST, and CIFAR-10, but are also open to new challenge problems and benchmarks. We will follow procedures similar to those of the 2nd VNN-COMP.

Organizers


Important Dates

How To Participate

Given the infancy of the area, the competition will be friendly in the sense that there are no rankings in this iteration; it is primarily a mechanism to share and standardize relevant benchmarks to enable easier progress within the domain, as well as to better understand which methods are most effective for which problems, along with their current limitations. Subject to sponsorship, we may offer a "best competition results" award, with a selection process that will involve community feedback.

Register by March 18, 2022 (you may continue to join after this date, but please register sooner rather than later). If you are interested in learning more, participating, and helping guide the direction of the competition, please use this Google form.

We welcome participation from all, and have particularly considered the following possible participants:

Participation will take place remotely in advance of the workshop, with summary results presented at the workshop. Attendance by VNN-COMP participants will NOT be required, but you are of course welcome to attend. Coordination mechanisms are open to discussion, but would likely be facilitated via, e.g., Git repositories, Slack, and/or forums such as Google Groups.

The mechanisms of the competition are community driven, along the lines of prior related competitions for hybrid systems verification (ARCH-COMP, https://cps-vo.org/group/ARCH/FriendlyCompetition ) and software verification (SV-COMP, https://sv-comp.sosy-lab.org/ ), and we welcome any suggestions for organization. Current plans anticipate subdividing the competition into categories, such as by which layers/activations different tools and methods support, whether they perform exact (sound and complete) or over-approximative (sound but incomplete) analysis, or training/synthesis vs. verification.
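To make the exact vs. over-approximative distinction concrete, below is a minimal sketch of interval bound propagation through a single affine + ReLU layer, a standard sound-but-incomplete technique; the layer weights and input box here are hypothetical, and real tools use considerably tighter abstractions.

```python
import numpy as np

def interval_affine_relu(W, b, lb, ub):
    """Propagate the input box [lb, ub] through y = ReLU(W x + b).

    Sound: the returned box contains every reachable output.
    Incomplete: it may also contain unreachable points, so an
    apparent property violation inside the box may be spurious.
    """
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    # Bound the affine map over the box using the sign-split weights.
    lo = W_pos @ lb + W_neg @ ub + b
    hi = W_pos @ ub + W_neg @ lb + b
    # ReLU is monotone, so the bounds pass through elementwise.
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

# Hypothetical 2-input, 2-output layer and input box.
W = np.array([[1.0, -1.0], [0.5, 2.0]])
b = np.array([0.1, -0.2])
lb, ub = np.array([-0.1, -0.1]), np.array([0.1, 0.1])
print(interval_affine_relu(W, b, lb, ub))
```

An exact (sound and complete) tool would instead decide the property precisely, e.g., by case-splitting on each ReLU's activation phase, at correspondingly higher cost.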

Anticipated benchmarks include ACAS-Xu, MNIST classifiers, CIFAR classifiers, etc., with various parameterizations (initial states, specifications, robustness bounds, etc.), and the mechanism for selection will be community driven, similar to the benchmark jury selection in related competitions. Participants are welcome to propose other benchmarks.
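As an illustration of such parameterization, the following sketch derives the input bounds for a hypothetical L-infinity local-robustness instance around a single MNIST image; the image and radius eps are placeholders.

```python
import numpy as np

def linf_input_box(x, eps, lo=0.0, hi=1.0):
    """Input bounds for an L-infinity robustness query of radius eps,
    clipped to the valid pixel range [lo, hi]."""
    return np.clip(x - eps, lo, hi), np.clip(x + eps, lo, hi)

# Hypothetical flattened 28x28 MNIST image with pixels in [0, 1].
x = np.random.rand(784)
lb, ub = linf_input_box(x, eps=0.03)
```

Varying eps, the center image, or the output specification yields the family of instances a benchmark typically comprises.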

Some goals of this initiative are to drive greater standardization of benchmarks and model formats (e.g., ONNX), which has led to advances in other areas (e.g., SMT-LIB, http://smtlib.cs.uiowa.edu/ ), in part by making progress on the VNN-LIB initiative (http://www.vnnlib.org/ ), and to gain a scientific sense of the current landscape of methods and their applicability to different problems. Eventually, we hope the initiative will lead to better comparisons among methods. Depending on levels of participation, categories, etc., the outcome of the competition may be a series of competition reports and a repeatability evaluation.
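As a taste of what such standardization looks like, the sketch below emits a small property file in the SMT-LIB-flavored syntax used by VNN-LIB; the input/output naming convention (X_i, Y_j) follows common VNN-COMP usage, while the concrete bounds, file name, and helper function are hypothetical.

```python
def write_vnnlib_box_property(path, lb, ub, target, n_out):
    """Write a VNN-LIB-style spec: for all inputs in the box [lb, ub],
    the query is satisfiable iff some non-target output can reach the
    target output's score (i.e., iff robustness is violated)."""
    lines = [f"(declare-const X_{i} Real)" for i in range(len(lb))]
    lines += [f"(declare-const Y_{j} Real)" for j in range(n_out)]
    for i, (l, u) in enumerate(zip(lb, ub)):
        lines.append(f"(assert (>= X_{i} {l}))")
        lines.append(f"(assert (<= X_{i} {u}))")
    # Violation condition: some other class scores at least as high.
    disjuncts = " ".join(
        f"(>= Y_{j} Y_{target})" for j in range(n_out) if j != target)
    lines.append(f"(assert (or {disjuncts}))")
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

# Hypothetical 2-input, 3-class robustness query.
write_vnnlib_box_property("prop.vnnlib", [0.0, 0.0], [0.1, 0.1],
                          target=0, n_out=3)
```

Pairing such a property file with an ONNX model yields a self-contained, tool-independent verification instance.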

Thank you for your consideration, and we hope you will participate. Please let us know if you have any questions, would like to discuss before making a decision, or have suggestions for the organization of this initiative, as we believe it will be most successful if driven actively by the community.

Sponsor