4th International Verification of Neural Networks Competition (VNN-COMP'23)

News / Updates

VNN-COMP'24: the website for the 5th iteration of VNN-COMP in 2024 is here: https://sites.google.com/view/vnn2024
The final report is on arXiv; if any further revisions are needed, please let us know and we can update it: https://arxiv.org/abs/2312.16760
The draft report is here; if you need edit access (for benchmarks/participants), please email Taylor: https://www.overleaf.com/read/csmzzbmswpwr#60641f
Slides from the presentation are here: https://docs.google.com/presentation/d/1MSYvXoLHjyXW_TKeXPIEZKV4674sE2dj9h9vHzs14ak/edit?usp=sharing
July 18, 2pm (Paris time): presentation at FoMLAS. There is no official Zoom/remote support at the conference, but we will try to host a Zoom session here and attempt to record it: https://vanderbilt.zoom.us/j/94887592563?pwd=bCtsRnJXWFpqc3dUZ0tPdm50bDZKZz09
- Meeting ID: 948 8759 2563
- Passcode: 2023
June 2023: benchmarks are available in the repository here: https://github.com/ChristopherBrix/vnncomp2023_benchmarks
May 2023: benchmark submission/discussion is ongoing, to be finalized at start of June: https://github.com/stanleybak/vnncomp2023/issues/2
April 2023: rules discussion is ongoing; the main repository for this year's iteration is here, with discussions on benchmarks/rules/etc. in the GitHub issues: https://github.com/stanleybak/vnncomp2023/
- David Shriver provides a Python framework to help with parsing VNN-LIB here: https://github.com/dlshriver/vnnlib
- Christopher Brix provides the execution framework here (for submitting tools, benchmarks, etc.): https://vnncomp.christopher-brix.de/
Prior reports for reference
- Summary of 2020-2022: https://arxiv.org/abs/2301.05815 and https://link.springer.com/article/10.1007/s10009-023-00703-4
- 2022: https://arxiv.org/abs/2212.10376
- 2021: https://arxiv.org/abs/2109.00498
- 2020: Overleaf only: https://www.overleaf.com/read/rbcfnbyhymmy
February 2023: created website for 4th VNN-COMP with preliminary schedule

General Information

The 2023 Verification of Neural Networks Competition (VNN-COMP'23), to be held with the 6th Workshop on Formal Methods for ML-Enabled Autonomous Systems (FoMLAS) with CAV 2023 on July 17-18, 2023, in Paris, France, aims to bring together researchers interested in methods and tools providing guarantees about the behaviors of neural networks and systems built from them.

Introduction and Background

Methods based on machine learning are increasingly being deployed for a wide range of problems, including recommender systems, machine vision, autonomous driving, and beyond. While machine learning has made significant contributions to such applications, concerns remain about the lack of methods and tools to provide formal guarantees about the behaviours of the resulting systems.

In particular, for data-driven methods to be usable in safety-critical applications, including autonomous systems, robotics, cybersecurity, and cyber-physical systems, it is essential that the behaviours generated by neural networks are well understood and can be predicted at design time. In the case of systems that learn at run-time, it is desirable that any change to the underlying system respects a given safety envelope for the system.

While the literature on verification of traditionally designed systems is wide and successful, there was, until recently, a lack of results and efforts on the verification of neural networks. The competition intends to bring together researchers working on techniques for the verification of neural networks. We anticipate an organization and process similar to the International Competition on Verifying Continuous and Hybrid Systems (ARCH-COMP), where a categorization based on system expressiveness and problem formulation exists. In the context of VNN, this could, for instance, mean different problem categories depending on whether a network is feedforward or recurrent (RNNs). Within these broad categorizations, further categorization may exist based on whether verification approaches support only certain layers (e.g., many approaches allow ReLUs, but relatively fewer allow nonlinear activations such as tanh).

If you are interested, please contact the organizers. We anticipate analysis on existing benchmarks and challenge problems, such as ACAS-Xu, MNIST, and CIFAR-10, but are also open to new challenge problems and benchmarks. We will follow procedures similar to those of the 3rd VNN-COMP.

Organizers

Stanley Bak, Stony Brook University [email]
Changliu Liu, Carnegie Mellon University [email]
Taylor T. Johnson, Vanderbilt University [email]
Christopher Brix, RWTH Aachen University [email]
David Shriver, [email]

Important Dates

Intention to participate: March 17, 2023 (you may continue to join after this date, but if you submit the form after this date, please also email the organizers so we can add you to the listserv for email updates, etc.)
Rules meeting (Zoom): April 13, 2023
Finalization of the rules: April 28, 2023
Submission of benchmarks: June 2, 2023 (extended from May 15, 2023)
Participants finalize tool scripts and organizers begin running tools: June 30, 2023
Workshop, presentation of VNN-COMP results and report: July 17/18, 2023

How To Participate / Registration

Given the infancy of the area, this friendly competition is a mechanism to share and standardize relevant benchmarks to enable easier progress within the domain, as well as to better understand which methods are most effective for which problems, along with their current limitations. Subject to sponsorship, we may offer a "best competition results" award with a process that will involve community feedback for selection.

Register by March 17, 2023 (you may continue to join after this date, but please register sooner rather than later). If you are interested in learning more, participating, and helping guide the direction of the competition, please use this Google form.

We welcome participation from all, and particularly have considered the following possible participants:

Verification / Robustness Tool Developers: you have a tool/method for proving properties of neural networks
Benchmark / Challenge Problem Proposers: you have neural networks and properties you would like to check for them, and can publicly share both
Sponsors and Others: you are interested in the area, but do not want to participate with a tool or benchmark/challenge problem, and/or would be interested in sponsoring a "best competition results" award

Participation will be done remotely in advance of the workshop, with summary results presented at the workshop. Attendance at the workshop by VNN-COMP participants will NOT be required, but of course you would be welcome to attend. Coordination mechanisms are open to discussion, but likely would be facilitated via, e.g., Git repositories, Slack, and/or forums such as Google Groups.

The mechanisms of the competition are community driven, along the lines of prior related competitions for hybrid systems verification (ARCH-COMP, https://cps-vo.org/group/ARCH/FriendlyCompetition ) and software verification (SV-COMP, https://sv-comp.sosy-lab.org/ ), and we welcome any suggestions for organization. Current plans anticipate some subdivisions of the competition into categories, such as by what layers/activations different tools and methods allow, whether they perform exact (sound and complete) or over-approximative (sound but incomplete) analysis, or training/synthesis vs. verification.
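To illustrate the distinction between exact and over-approximative analysis mentioned above, here is a minimal sketch in Python. It is not taken from any competition tool: the two-neuron ReLU network, its weights, and the helper names (`affine_bounds`, `relu_bounds`, `output_bounds`, `network`) are all hypothetical and chosen only for illustration. The sketch uses plain interval arithmetic, one simple over-approximative technique, to bound the network output over the input range [-1, 1].

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def affine_bounds(weights: List[List[float]], bias: List[float],
                  inputs: List[Interval]) -> List[Interval]:
    """Propagate input intervals through y = W x + b, one output row at a time."""
    outputs = []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, (x_lo, x_hi) in zip(row, inputs):
            if w >= 0:
                lo += w * x_lo
                hi += w * x_hi
            else:
                # A negative weight swaps which input end gives the bound.
                lo += w * x_hi
                hi += w * x_lo
        outputs.append((lo, hi))
    return outputs


def relu_bounds(inputs: List[Interval]) -> List[Interval]:
    """ReLU is monotone, so it can be applied to each interval end."""
    return [(max(0.0, lo), max(0.0, hi)) for lo, hi in inputs]


def output_bounds(x_bounds: List[Interval]) -> Interval:
    """Interval bounds for the toy network y = relu(x) - relu(x)."""
    hidden = relu_bounds(affine_bounds([[1.0], [1.0]], [0.0, 0.0], x_bounds))
    return affine_bounds([[1.0, -1.0]], [0.0], hidden)[0]


def network(x: float) -> float:
    """Concrete evaluation of the same (hypothetical) toy network."""
    h1 = max(0.0, x)
    h2 = max(0.0, x)
    return h1 - h2


bounds = output_bounds([(-1.0, 1.0)])
print("interval bounds on y:", bounds)
print("concrete y at x = 0.5:", network(0.5))
```

In this toy case the true output is always 0, so a property such as y <= 0.5 holds for every input. The interval bounds nevertheless come out as [-1, 1], because the analysis forgets that both hidden neurons depend on the same input. The bounds are sound (they contain every reachable output) but too loose to prove the property, which is what "sound but incomplete" means here. An exact method would have to track that dependency, for example by case-splitting on the ReLU phases.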

Anticipated benchmarks include ACAS-Xu, MNIST classifiers, CIFAR classifiers, etc., with various parameterizations (initial states, specifications, robustness bounds, etc.), and the mechanism for selection will be community driven, similar to the benchmark jury selection in related competitions. Participants are welcome to propose other benchmarks.

Some goals of this initiative are to lead to greater standardization of benchmarks, model formats (ONNX, etc.), etc., which has helped lead to advances in other areas (e.g., SMT-LIB, http://smtlib.cs.uiowa.edu/ ), such as by making progress on the VNN-LIB initiative (http://www.vnnlib.org/ ), as well as to get some scientific sense of the current landscape of methods and their applicability to different problems. Eventually, we hope the initiative will lead to better comparisons among methods. Depending on levels of participation, categories, etc., the outcome of the competition may be a series of competition reports and a repeatability evaluation.

Thank you for your consideration, and we hope you will participate. Please let us know of any questions, if you would like to discuss before making a decision, or any suggestions you may have for the organization of this initiative, as we believe it will be most successful if driven actively by the community.

Previous Workshops/Symposia/Competition

Sponsor

Please contact the organizers if you are interested in sponsoring.