5th International Verification of Neural Networks Competition (VNN-COMP'24)

Sponsorship

We are pleased to share VNN-COMP'24 is sponsored by CEA-List

News / Updates

2024-12-30: the 2024 report is available on arxiv here: https://arxiv.org/abs/2412.19985
2024-10-10: the website for VNN-COMP'25 is online, but not yet fully updated: https://sites.google.com/view/vnn2025
2024-09-05: thanks to those who participated and joined us for the presentation at CAV/SAIV in Montreal; our apologies for delay in posting results and slides; the report is forthcoming and we will share later in fall 2024 as the draft becomes ready
- Results (please check/audit for accuracy): https://github.com/ChristopherBrix/vnncomp2024_results
- Slides: https://docs.google.com/presentation/d/1RvZWeAdTfRC3bNtCqt84O6IIPoJBnF4jnsEvhTTxsPE
2024-07-04: extended tool submission deadline another week to July 12
2024-06-24: extended tool submission deadline a week to July 5
2024-05-28: benchmark proposals have been finalized and tool participants should submit their votes for which benchmarks should be scored by June 3 AOE; please ensure you coordinate with your team and only submit one form; only tool participants that plan to submit results should submit the form given we are using the results to create the scored tracks: https://forms.gle/PQBW7tJeYUMJ9Dm3A
2024-05-05: rules finalization and new benchmark submissions deadlines extended a little; we will need to vote for the 2 track selections by end of May at the latest, so please be looking over the benchmarks proposal issue (as well as prior VNN-COMP benchmarks if any of those to be considered again this year): https://github.com/verivital/vnncomp2024/issues/2
2024-04-03: rules discussion and benchmark proposal is ongoing at issues in this github repository: https://github.com/verivital/vnncomp2024/issues/
- Rules document draft is here, will be revised soon based on rules telecon and github issues discussion: https://docs.google.com/document/d/1bgkx5lnugrwlNzQ2MPRSd47MAkZGJfR9v2jo7oRskd0/edit?usp=sharing
- Rules telecon recording is here: https://www.dropbox.com/scl/fi/24a16mp6lblyfr54mhkcn/GMT20240403-135359_Recording_gallery_2400x1600.mp4?rlkey=f3z86wqitxybvuzj09oii4bai&dl=0
2024-04-03: kickoff and rules telecon at 9am US central time (contact Taylor if passcode needed, sent via email to all registered participants who submitted the Google form)
2024-02-07: benchmark and tool registration form is here: https://forms.gle/TDng8k7Kd2qd8Pcu8
2024-01-15: VNN-COMP'24 is again co-located with CAV for 2024, with a new conference dedicated to AI verification co-located at CAV this year, the Symposium on AI Verification (SAIV'24); we will update and add more information soon: https://www.aiverification.org/
Useful tools / frameworks
- David Shriver provides a Python framework to help with parsing VNN-LIB here: https://github.com/dlshriver/vnnlib
- CoCoNet is also useful for VNN-LIB parsing and network interchange: https://github.com/NeVerTools/CoCoNet
- Christopher Brix provides the execution framework here (for submitting tools, benchmarks, etc.): https://vnncomp.christopher-brix.de/
Prior reports for reference
- 2024: https://arxiv.org/abs/2412.19985
- 2023: https://arxiv.org/abs/2312.16760
- Summary of 2020-2022: https://arxiv.org/abs/2301.05815 and https://link.springer.com/article/10.1007/s10009-023-00703-4
- 2022: https://arxiv.org/abs/2212.10376
- 2021: https://arxiv.org/abs/2109.00498
- 2020: Overleaf only: https://www.overleaf.com/read/rbcfnbyhymmy
VNN-LIB
- https://www.vnnlib.org/

General Information

The 5th International Verification of Neural Networks Competition (VNN-COMP'24), to be held with the 7th International Symposium on AI Verification (SAIV'24) with the 36th International Conference on Computer Aided Verification (CAV'24) over July 22-23, 2024, in Montreal, Canada, aims to bring together researchers interested in formal methods and tools providing guarantees about the behaviors of neural networks and systems built from them.

Introduction and Background

Methods based on machine learning are increasingly being deployed for a wide range of problems, including recommender systems, machine vision, autonomous driving, and beyond. While machine learning has made significant contributions to such applications, concerns remain about the lack of methods and tools to provide formal guarantees about the behaviours of the resulting systems.

In particular, for data-driven methods to be usable in safety-critical applications, including autonomous systems, robotics, cybersecurity, and cyber-physical systems, it is essential that the behaviours generated by neural networks are well-understood and can be predicted at design time. In the case of systems that are learning at run-time it is desirable that any change to the underlying system respects a given safety-envelope for the system.

While the literature on verification of traditionally designed systems is wide and successful, there has been a lack of results and efforts in this area until recently. The competition intends to bring together researchers working on techniques for the verification of neural networks. We anticipate a similar organization and process to the International Competition on Verifying Continuous and Hybrid Systems (ARCH-COMP), where a categorization based on system expressiveness and problem formulation exists. In the context of VNN, this could for instance be different problem categories for whether a network is feedforward or recurrent (RNNs). Within these broad categorizations, further categorization may exist based on whether verification approaches support only certain layers (e.g., many approaches allow ReLUs, but relatively fewer allow nonlinear activations such as tanh). If you are interested, please contact the organizers. We anticipate analysis on existing benchmarks and challenge problems, such as ACAS-Xu, MNIST, CIFAR-10, but are also open to new challenge problems and benchmarks. We will follow similar procedures as in the 4th VNN-COMP.

Organizers

Stanley Bak, Stony Brook University [email]
Changliu Liu, Carnegie Mellon University [email]
Taylor T. Johnson, Vanderbilt University [email]
Christopher Brix, RWTH Aachen University [email]
David Shriver, [email]
Haoze (Andrew) Wu, Stanford University [email]

Important Dates

Intention to participate: March 22, 2024 March 18, 2024 (you may continue to join after this date, but please also email the organizers if you submit the form after this date so we can add you to listserv to get email updates, etc.): https://forms.gle/TDng8k7Kd2qd8Pcu8
Rules meeting (Zoom): April 3, 2024, 9am US central time: Zoom info sent by email to everyone who submitted registration form
Finalization of the rules: April 26, 2024 May 10, 2024
Submission of new benchmarks: May 6, 2024 May 17, 2024
Voting for selecting benchmarks in regular and extended tracks: May 31, 2024 June 3, 2024 AOE
Participants finalize tool scripts and organizers begin running tools: June 28, 2024 July 5, 2024 July 12, 2024 AOE
SAIV with presentation of VNN-COMP results and report: July 22-23, 2024

How To Participate / Registration

Given the infancy of the area, this friendly competition is a mechanism to share and standardize relevant benchmarks to enable easier progress within the domain, as well as to understand better on what methods are most effective for which problems along with current limitations. Subject to sponsorship, we may offer a "best competition results" award with a process that will involve community feedback for selection.

Register by March 18, 2024 (you may continue to join after this date, but please register sooner rather than later), if you are interested to learn more, participate, and help guide the direction of the competition, please use this Google form: https://forms.gle/TDng8k7Kd2qd8Pcu8

We welcome participation from all, and particularly have considered the following possible participants:

Verification / Robustness Tool Developers: you have a tool/method for proving properties of neural networks
Benchmark / Challenge Problem Proposers: you have neural networks and properties you would like to check for them, and can publicly share both
Sponsors and Others: you are interested in the area, but do not want to participate with a tool or benchmark/challenge problem, and/or would be interested in sponsoring a "best competition results" award

Participation will be done remotely in advance of SAIV, with summary results presented at SAIV. Attendance at SAIV by VNN-COMP participants will NOT be required, but of course you would be welcome to attend. Coordination mechanisms are open to discussion, but likely would be facilitated via, e.g., Git repositories, Slack, and/or forums such as Google Groups.

The mechanisms of the competition are community driven, along the lines of prior related competitions for hybrid systems verification (ARCH-COMP, https://cps-vo.org/group/ARCH/FriendlyCompetition ) and software verification (SV-COMP, https://sv-comp.sosy-lab.org/ ), and we welcome any suggestions for organization. Current plans anticipate some subdivisions of the competition into categories, such as by what layers/activations different tools and methods allow, whether they perform exact (sound and complete) or over-approximative (sound but incomplete) analysis, or training/synthesis vs. verification.

Anticipated benchmarks include those used in prior VNN-COMP iterations, such as ACAS-Xu, MNIST classifiers, CIFAR classifiers, etc., with various parameterizations (initial states, specifications, robustness bounds, etc.), and the mechanism for selection will be community driven, similar to the benchmark jury selection in related competitions. Participants are welcome to propose other benchmarks.

Some goals of this initiative are to lead to greater standardization of benchmarks, model formats (ONNX, etc.), etc., which has helped led to advances in other areas (e.g., SMT-LIB, http://smtlib.cs.uiowa.edu/ ), such as by making progress on the VNN-LIB initiative (http://www.vnnlib.org/ ), as well as to get some scientific sense on what the current landscape of methods and their applicability to different problems. Eventually, we hope the initiative will lead to better comparisons among methods. Depending on levels of participation, categories, etc., the outcome of the competition may be a series of competition reports and a repeatability evaluation.

Thank you for your consideration, and we hope you will participate. Please let us know of any questions, if you would like to discuss before making a decision, or any suggestions you may have for the organization of this initiative, as we believe it will be most successful if driven actively by the community.

Previous Workshops/Symposia/Competition

Sponsor

Please contact the organizers if you are interested to sponsor.

Page updated

Google Sites

Report abuse