VNN-COMP

VNN-COMP 2021

VNN-COMP is happening again in 2021, in conjunction with CAV'21: https://sites.google.com/view/vnn2021

VNN-COMP 2020 Report

A draft (read-only) version of the report is available on Overleaf here: https://www.overleaf.com/read/rbcfnbyhymmy

VNN-COMP 2020 Call for Participation

As part of the Verification of Neural Networks (VNN, https://sites.google.com/view/vnn20/ ) workshop to be held at the International Conference on Computer-Aided Verification (CAV, http://i-cav.org/2020/ ) in July 2020, we are organizing a friendly competition for neural network verification (VNN-COMP). Given the infancy of the area, the competition is friendly in the sense that there are no rankings in this iteration; it is primarily a mechanism to share and standardize relevant benchmarks to enable easier progress within the domain, as well as to better understand which methods are most effective for which problems and what their current limitations are. Subject to sponsorship, we may offer a "best competition results" award, with a selection process that will involve community feedback.

Please email the chairs (Changliu Liu and Taylor Johnson) by February 29, 2020 (you may continue to join after this date, but please do get in touch sooner rather than later) if you are interested in learning more, participating, and helping guide the direction of the competition. If there are other contacts from your group working on your tools/methods, please indicate them as well so we can keep them informed as things progress.

We welcome participation from all, and we particularly anticipate the following kinds of participants:

  • Verification / Robustness Tool Developers: you have a tool/method for proving properties of neural networks

  • Benchmark / Challenge Problem Proposers: you have neural networks and properties you would like to check for them, and can publicly share both

  • Sponsors and Others: you are interested in the area, but do not want to participate with a tool or benchmark/challenge problem, and/or would be interested in sponsoring a "best competition results" award

Participation will be done remotely in advance of the workshop, with summary results presented at the workshop. Attendance at the workshop by VNN-COMP participants will NOT be required, but of course you would be welcome to attend. Coordination mechanisms are open to discussion, but likely would be facilitated via, e.g., Git repositories, Slack, and/or forums such as Google Groups.

The mechanisms of the competition are community-driven, along the lines of prior related competitions for hybrid systems verification (ARCH-COMP, https://cps-vo.org/group/ARCH/FriendlyCompetition ) and software verification (SV-COMP, https://sv-comp.sosy-lab.org/ ), and we welcome any suggestions for organization. Current plans anticipate subdividing the competition into categories, for example by which layers/activations different tools and methods support, whether they perform exact (sound and complete) or over-approximative (sound but incomplete) analysis, or whether they target training/synthesis vs. verification.

Anticipated benchmarks include ACAS-Xu, MNIST classifiers, CIFAR classifiers, etc., with various parameterizations (initial states, specifications, robustness bounds, etc.); the selection mechanism will be community-driven, similar to the benchmark jury selection in related competitions. Participants are welcome to propose other benchmarks.
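
For illustration only (this is not an official benchmark format), the Python sketch below shows how such a benchmark instance is typically parameterized: a nominal input x0, a perturbation radius epsilon, and an expected label define a local robustness query. The function names are placeholders, and note that random sampling can only falsify the property; the verification tools in the competition aim to prove it over the entire input region.

    import numpy as np

    def local_robustness_query(network, x0, epsilon, expected_label, n_samples=1000):
        # Search the L-infinity ball of radius epsilon around the nominal input x0
        # for a misclassified point. Sampling can only falsify robustness; a
        # verifier would instead prove the property for every point in the ball.
        rng = np.random.default_rng(seed=0)
        for _ in range(n_samples):
            x = np.clip(x0 + rng.uniform(-epsilon, epsilon, size=x0.shape), 0.0, 1.0)
            if int(np.argmax(network(x))) != expected_label:
                return x   # counterexample: the property is violated
        return None        # no counterexample found (the property is not yet proven)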

Some goals of this initiative are to encourage greater standardization of benchmarks, model formats (ONNX, etc.), and related infrastructure, which has helped lead to advances in other areas (e.g., SMT-LIB, http://smtlib.cs.uiowa.edu/ ), such as by making progress on the VNN-LIB initiative (http://www.vnnlib.org/ ), as well as to get a scientific sense of the current landscape of methods and their applicability to different problems. Eventually, we hope the initiative will lead to better comparisons among methods. Depending on levels of participation, categories, etc., the outcome of the competition may be a series of competition reports and a repeatability evaluation, as done with ARCH-COMP (see, e.g., https://easychair.org/publications/volume/ARCH19 and https://gitlab.com/goranf/ARCH-COMP).
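
As a small example of what a common model format buys us, once benchmark networks are shared in ONNX, any participating tool can load and inspect the same file. The sketch below uses the onnx Python package; the file name is hypothetical.

    import onnx

    # Load a benchmark network shared in the common ONNX format and sanity-check it.
    model = onnx.load("acasxu_1_1.onnx")      # hypothetical benchmark file name
    onnx.checker.check_model(model)           # verify the model is well-formed
    print([node.op_type for node in model.graph.node])  # e.g., ['Gemm', 'Relu', ...]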

An anticipated timeline for the competition is as follows.

  • February 29, 2020: initial participants identified, begin identification/proposal of benchmarks, categories, etc.

    • Participants may continue to join after this date, but we encourage early participation, especially if you would like to help guide the direction of this initiative; please email Changliu and Taylor to express your interest

  • April 30, 2020 (revised from March 31, 2020): finalize categories, participants in categories, etc.

  • ~May 15, 2020 (revised from April 30, 2020): finalize benchmarks in categories, distribute benchmarks to all category participants, etc.

  • June 30, 2020: finalize benchmark results

    • Submission process (via pull requests to the VNN-COMP repository: https://github.com/verivital/vnn-comp/ )

      • Submit (1) a Dockerfile to set up your tool environment and (2) for each category in which you participate, a script to reproduce all benchmark results with all parameterizations considered (e.g., local robustness bounds, initial states / input images, specifications, etc.); a sketch of such a script appears after this timeline

      • This script should ideally produce a results table in a reasonable format (e.g., LaTeX, for easy inclusion in the report discussed below)

      • Further instructions for this will follow; we will essentially mirror the process used for repeatability evaluation in ARCH-COMP, see e.g.: https://gitlab.com/goranf/ARCH-COMP#repeatability-evaluation-instructions-2019

  • July 17, 2020: submit a 1-2 paragraph overview of your tool and your benchmark results to the VNN-COMP report (Overleaf shared with participants); we will take results from this report to present an overview at the VNN workshop

  • July 19, 2020 (revised from July 21, 2020): workshop, presentation of the VNN-COMP initial run, discussions on VNN-LIB, etc.
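
To make the submission expectations above more concrete, here is a minimal sketch of a per-category reproduction script that runs each benchmark instance and writes the results as a LaTeX table for the report. The instance names are placeholders and run_instance is a stub; each participant would substitute their own tool invocation and result reporting.

    import time

    # Placeholder benchmark instances (network identifier, property identifier).
    INSTANCES = [
        ("ACASXU_run2a_1_1", "property_3"),
        ("mnist_fc_64x2", "eps_0.02_image_7"),
    ]

    def run_instance(network, prop):
        # Stub standing in for the actual tool call; a real script would invoke
        # the verifier here and return its verdict (e.g., "sat"/"unsat"/"timeout").
        start = time.perf_counter()
        result = "unknown"
        return result, time.perf_counter() - start

    rows = []
    for network, prop in INSTANCES:
        result, runtime = run_instance(network, prop)
        rows.append(f"{network} & {prop} & {result} & {runtime:.1f} \\\\")

    # Write a LaTeX table that can be included directly in the competition report.
    with open("results.tex", "w") as f:
        f.write("\\begin{tabular}{lllr}\n")
        f.write("Network & Property & Result & Time (s) \\\\\n\\hline\n")
        f.write("\n".join(rows) + "\n")
        f.write("\\end{tabular}\n")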

Thank you for your consideration, and we hope you will participate. Please let us know if you have any questions, would like to discuss before making a decision, or have any suggestions for the organization of this initiative, as we believe it will be most successful if driven actively by the community.

Updates/Organizational Details

Confirmed Interested Participants

  1. NNV: https://github.com/verivital/nnv

  2. DNNV: https://github.com/dlshriver/DNNV

  3. nnenum (Stanley Bak: http://stanleybak.com/ ): https://github.com/stanleybak/nnenum

  4. Neurify: https://github.com/tcwangshiqi-columbia/Neurify

  5. ReluVal: https://github.com/tcwangshiqi-columbia/ReluVal

  6. CROWN-IBP: https://github.com/huanzhang12/CROWN-IBP

  7. auto_LiRPA: https://github.com/KaidiXu/auto_LiRPA

  8. Sherlock: https://github.com/souradeep-111/sherlock_2

  9. NPAQ: https://teobaluta.github.io/NPAQ/

  10. Branch-and-bound: https://github.com/oval-group/PLNN-verification

  11. PaRoT from FiveAI (https://five.ai/ ): based on paper https://arxiv.org/abs/2001.02152

  12. MIPVerify.jl: https://github.com/vtjeng/MIPVerify.jl

  13. ARFramework: https://github.com/formal-verification-research/ARFramework

  14. Marabou: https://github.com/NeuralNetworkVerification/Marabou

  15. Venus: https://vas.doc.ic.ac.uk/software/neural/

  16. VeriNet: https://vas.doc.ic.ac.uk/software/neural/

  17. ERAN: https://github.com/eth-sri/eran

  18. PeregriNN: https://github.com/rcpsl/PeregriNN