AutoVerus Code: [GitHub Repo]
Verus-Bench Dataset: [Benchmark]
[2024/10/24] Our code and benchmark are released! Feel free to check them out via the links above!
[2024/9/19] Code and benchmark are under an internal review process and will be released soon.
AutoVerus consists of a network of LLM agents that are crafted and orchestrated to mimic human experts' three phases of proof construction: preliminary proof generation, proof refinement guided by generic tips, and proof debugging guided by verification errors.
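The three phases above can be sketched as a simple orchestration loop. This is an illustrative sketch only, with the LLM agents and the Verus verifier stubbed out; the function names (`generate`, `refine`, `debug`) and the control flow are assumptions for exposition, not AutoVerus's actual API:

```rust
// Stub standing in for the Verus verifier: returns None on success,
// or an error message used to guide the debugging phase.
struct Verifier;
impl Verifier {
    fn check(&self, proof: &str) -> Option<String> {
        // Toy criterion for the sketch: "verified" once an invariant is present.
        if proof.contains("invariant") { None } else { Some("missing loop invariant".into()) }
    }
}

// Phase 1: preliminary proof generation (stubbed LLM call).
fn generate(task: &str) -> String {
    format!("proof attempt for `{task}`")
}

// Phase 2: proof refinement guided by generic tips (stubbed LLM call).
fn refine(proof: &str) -> String {
    format!("{proof}, with invariant added")
}

// Phase 3: proof debugging guided by verification errors (stubbed LLM call).
fn debug(proof: &str, error: &str) -> String {
    format!("{proof} // repaired: {error}")
}

// Orchestrate the three phases, iterating on verifier feedback
// until the proof verifies or the round budget is exhausted.
fn prove(task: &str, verifier: &Verifier, max_rounds: usize) -> Option<String> {
    let mut proof = generate(task);                    // phase 1
    proof = refine(&proof);                            // phase 2
    for _ in 0..max_rounds {                           // phase 3 loop
        match verifier.check(&proof) {
            None => return Some(proof),                // verified
            Some(err) => proof = debug(&proof, &err),  // retry with error feedback
        }
    }
    None
}
```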
To thoroughly evaluate AutoVerus and to facilitate future research in AI-for-Verus, we have curated a benchmark suite of 150 non-trivial proof tasks, each a small Rust program associated with a Verus specification. This is the first benchmark suite designed for Verus proof generation, and we refer to it as Verus-Bench. These 150 proof tasks are mainly translated into Rust/Verus from other benchmark suites (Diffy, MBPP, and CloverBench).
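To give a flavor of such a proof task, here is an illustrative example in the style described above (not an actual Verus-Bench entry): the specification (`ensures` clause) is given, and the proof-generation goal is to supply annotations such as the loop invariants so that the program verifies. It requires the Verus toolchain to check, and depending on solver settings the nonlinear invariant may need extra proof hints:

```
use vstd::prelude::*;

verus! {

// Compute 0 + 1 + ... + n; the ensures clause is the specification,
// and the loop invariants are the kind of proof annotations that
// a proof-generation tool must synthesize.
fn sum_to(n: u32) -> (s: u64)
    ensures s == n as u64 * (n as u64 + 1) / 2,
{
    let mut s: u64 = 0;
    let mut i: u32 = 0;
    while i < n
        invariant
            i <= n,
            s == i as u64 * (i as u64 + 1) / 2,
    {
        i = i + 1;
        s = s + i as u64;
    }
    s
}

}
```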
AutoVerus can automatically generate correct proofs for more than 90% of the benchmark tasks, with more than half of them solved in under 30 seconds or fewer than 3 LLM calls.
Chenyuan Yang, cy54@illinois.edu
Xuheng Li, xuheng@cs.columbia.edu
Md Rakib Hossain Misu, mdrh@uci.edu
Shan Lu, shanlu@microsoft.com