VNN-COMP

VNN-COMP 2021

VNN-COMP is happening again in 2021, in conjunction with CAV'21: https://sites.google.com/view/vnn2021

VNN-COMP 2020 Report

A draft (read-only) version of the report is available on Overleaf here: https://www.overleaf.com/read/rbcfnbyhymmy

VNN-COMP 2020 Call for Participation

As part of the Verification of Neural Networks (VNN, https://sites.google.com/view/vnn20/ ) workshop to be held at the International Conference on Computer-Aided Verification (CAV, http://i-cav.org/2020/ ) in July 2020, we are organizing a friendly competition for neural network verification (VNN-COMP). Given the infancy of the area, the competition is friendly in the sense that there are no rankings in this iteration; it is primarily a mechanism to share and standardize relevant benchmarks to enable easier progress within the domain, as well as to better understand which methods are most effective for which problems and what their current limitations are. Subject to sponsorship, we may offer a "best competition results" award, with a selection process that will involve community feedback.

Please email the chairs (Changliu Liu and Taylor Johnson) by February 29, 2020 (you may continue to join after this date, but please do get in touch sooner rather than later) if you are interested in learning more, participating, and helping guide the direction of the competition. If there are other contacts from your group working on your tools/methods, please indicate them as well so we can keep them informed as things progress.

We welcome participation from all, and we particularly anticipate the following kinds of participants:

  • Verification / Robustness Tool Developers: you have a tool/method for proving properties of neural networks

  • Benchmark / Challenge Problem Proposers: you have neural networks and properties you would like to check for them, and can publicly share both

  • Sponsors and Others: you are interested in the area, but do not want to participate with a tool or benchmark/challenge problem, and/or would be interested in sponsoring a "best competition results" award

Participation will be done remotely in advance of the workshop, with summary results presented at the workshop. Attendance at the workshop by VNN-COMP participants will NOT be required, but of course you would be welcome to attend. Coordination mechanisms are open to discussion, but likely would be facilitated via, e.g., Git repositories, Slack, and/or forums such as Google Groups.

The mechanisms of the competition are community-driven, along the lines of prior related competitions for hybrid systems verification (ARCH-COMP, https://cps-vo.org/group/ARCH/FriendlyCompetition ) and software verification (SV-COMP, https://sv-comp.sosy-lab.org/ ), and we welcome any suggestions for organization. Current plans anticipate subdividing the competition into categories, for example by which layers/activations different tools and methods support, whether they perform exact (sound and complete) or over-approximative (sound but incomplete) analysis, or whether they target training/synthesis vs. verification.

Anticipated benchmarks include ACAS-Xu, MNIST classifiers, CIFAR classifiers, etc., with various parameterizations (initial states, specifications, robustness bounds, etc.); the selection mechanism will be community-driven, similar to the benchmark jury selection in related competitions. Participants are welcome to propose other benchmarks.
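
For illustration only (this is not an official benchmark format), the Python sketch below shows how such a benchmark instance is typically parameterized: a nominal input x0, a perturbation radius epsilon, and an expected label define a local robustness query. The function names are placeholders, and note that random sampling can only falsify the property; the verification tools in the competition aim to prove it over the entire input region.

    import numpy as np

    def local_robustness_query(network, x0, epsilon, expected_label, n_samples=1000):
        # Search the L-infinity ball of radius epsilon around the nominal input x0
        # for a misclassified point. Sampling can only falsify robustness; a
        # verifier would instead prove the property for every point in the ball.
        rng = np.random.default_rng(seed=0)
        for _ in range(n_samples):
            x = np.clip(x0 + rng.uniform(-epsilon, epsilon, size=x0.shape), 0.0, 1.0)
            if int(np.argmax(network(x))) != expected_label:
                return x   # counterexample: the property is violated
        return None        # no counterexample found (the property is not yet proven)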

Some goals of this initiative are to encourage greater standardization of benchmarks, model formats (ONNX, etc.), and related infrastructure, which has helped lead to advances in other areas (e.g., SMT-LIB, http://smtlib.cs.uiowa.edu/ ), such as by making progress on the VNN-LIB initiative (http://www.vnnlib.org/ ), as well as to get a scientific sense of the current landscape of methods and their applicability to different problems. Eventually, we hope the initiative will lead to better comparisons among methods. Depending on levels of participation, categories, etc., the outcome of the competition may be a series of competition reports and a repeatability evaluation, as done with ARCH-COMP (see, e.g., https://easychair.org/publications/volume/ARCH19 and https://gitlab.com/goranf/ARCH-COMP).
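
As a small example of what a common model format buys us, once benchmark networks are shared in ONNX, any participating tool can load and inspect the same file. The sketch below uses the onnx Python package; the file name is hypothetical.

    import onnx

    # Load a benchmark network shared in the common ONNX format and sanity-check it.
    model = onnx.load("acasxu_1_1.onnx")      # hypothetical benchmark file name
    onnx.checker.check_model(model)           # verify the model is well-formed
    print([node.op_type for node in model.graph.node])  # e.g., ['Gemm', 'Relu', ...]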

An anticipated timeline for the competition is as follows.

  • February 29, 2020: initial participants identified, begin identification/proposal of benchmarks, categories, etc.

    • Participants may continue to join after this date, but we encourage early participation, especially if you would like to help guide the direction of this initiative; please email Changliu and Taylor to express your interest

  • April 30, 2020 (revised from March 31, 2020): finalize categories, participants in categories, etc.

  • ~May 15, 2020 (revised from April 30, 2020): finalize benchmarks in categories, distribute benchmarks to all category participants, etc.

  • June 30, 2020: finalize benchmark results

    • Submission process (via pull requests to the VNN-COMP repository: https://github.com/verivital/vnn-comp/ )

      • Submit (1) a Dockerfile to set up your tool environment and (2) for each category in which you participate, a script to reproduce all benchmark results with all parameterizations considered (e.g., local robustness bounds, initial states / input images, specifications, etc.); a sketch of such a script appears after this timeline

      • This script should ideally produce a results table in a reasonable format (e.g., LaTeX, for easy inclusion in the report discussed below)

      • Further instructions for this will follow; we will essentially mirror the process used for repeatability evaluation in ARCH-COMP, see e.g.: https://gitlab.com/goranf/ARCH-COMP#repeatability-evaluation-instructions-2019

  • July 17, 2020: submit a 1-2 paragraph overview of your tool and your benchmark results to the VNN-COMP report (Overleaf shared with participants); we will take results from this report to present an overview at the VNN workshop

  • July 19, 2020 (revised from July 21, 2020): workshop, presentation of the VNN-COMP initial run, discussions on VNN-LIB, etc.
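
To make the submission expectations above more concrete, here is a minimal sketch of a per-category reproduction script that runs each benchmark instance and writes the results as a LaTeX table for the report. The instance names are placeholders and run_instance is a stub; each participant would substitute their own tool invocation and result reporting.

    import time

    # Placeholder benchmark instances (network identifier, property identifier).
    INSTANCES = [
        ("ACASXU_run2a_1_1", "property_3"),
        ("mnist_fc_64x2", "eps_0.02_image_7"),
    ]

    def run_instance(network, prop):
        # Stub standing in for the actual tool call; a real script would invoke
        # the verifier here and return its verdict (e.g., "sat"/"unsat"/"timeout").
        start = time.perf_counter()
        result = "unknown"
        return result, time.perf_counter() - start

    rows = []
    for network, prop in INSTANCES:
        result, runtime = run_instance(network, prop)
        rows.append(f"{network} & {prop} & {result} & {runtime:.1f} \\\\")

    # Write a LaTeX table that can be included directly in the competition report.
    with open("results.tex", "w") as f:
        f.write("\\begin{tabular}{lllr}\n")
        f.write("Network & Property & Result & Time (s) \\\\\n\\hline\n")
        f.write("\n".join(rows) + "\n")
        f.write("\\end{tabular}\n")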

Thank you for your consideration, and we hope you will participate. Please let us know if you have any questions, would like to discuss before making a decision, or have any suggestions for the organization of this initiative, as we believe it will be most successful if driven actively by the community.

Updates/Organizational Details

Confirmed Interested Participants

  1. NNV: https://github.com/verivital/nnv

  2. DNNV: https://github.com/dlshriver/DNNV

  3. nnenum (Stanley Bak: http://stanleybak.com/ ): https://github.com/stanleybak/nnenum

  4. Neurify: https://github.com/tcwangshiqi-columbia/Neurify

  5. ReluVal: https://github.com/tcwangshiqi-columbia/ReluVal

  6. CROWN-IBP: https://github.com/huanzhang12/CROWN-IBP

  7. auto_LiRPA: https://github.com/KaidiXu/auto_LiRPA

  8. Sherlock: https://github.com/souradeep-111/sherlock_2

  9. NPAQ: https://teobaluta.github.io/NPAQ/

  10. Branch-and-bound: https://github.com/oval-group/PLNN-verification

  11. PaRoT from FiveAI (https://five.ai/ ): based on paper https://arxiv.org/abs/2001.02152

  12. MIPVerify.jl: https://github.com/vtjeng/MIPVerify.jl

  13. ARFramework: https://github.com/formal-verification-research/ARFramework

  14. Marabou: https://github.com/NeuralNetworkVerification/Marabou

  15. Venus: https://vas.doc.ic.ac.uk/software/neural/

  16. VeriNet: https://vas.doc.ic.ac.uk/software/neural/

  17. ERAN: https://github.com/eth-sri/eran

  18. PeregriNN: https://github.com/rcpsl/PeregriNN