# Refine4LLM
Program refinement is the verifiable transformation of an abstract (high-level) formal specification into a concrete (low-level) executable program.
Refine4LLM is a tool designed to run on Unix-based operating systems such as Ubuntu/Linux and macOS. It leverages Coq and large language models (LLMs) for formal verification and code generation. This guide will help you set up the necessary dependencies and environment to get started.
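As a toy illustration (not taken from the paper), the abstract specification "return an index `i` such that `xs[i] == target`, or `None` if no such index exists" can be refined into a concrete linear search:

```python
def find_index(xs, target):
    # Concrete refinement of the abstract specification:
    #   postcondition: the result i satisfies xs[i] == target,
    #   or the result is None when target does not occur in xs.
    for i, x in enumerate(xs):
        if x == target:
            return i
    return None
```

Refinement calculus turns each such step into a proof obligation: the concrete program must be shown to satisfy the abstract postcondition.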
---
## Installation and Setup
### Prerequisites
- **Operating System**: Ubuntu/Linux or macOS
- **Package Manager**: `opam` (version 2.1.0 or later), `conda`
- **Coq**: Version 8.20.0
- **Provers**: Vampire, CVC4, Eprover, Z3
- **Python**: Version 3.10.15
---
### Install Coq and Dependencies with Opam
1. **Check Opam Version**
Ensure that `opam` is installed and its version is 2.1.0 or later:
`opam --version`
2. **Initialize Opam** (if new to Opam)
`opam init`
`eval $(opam env)`
3. **Add Coq and Install Dependencies**
`opam repo add coq-released https://coq.inria.fr/opam/released`
`opam pin add coq 8.20.0`
`opam install coq-hammer coq-hammer-tactics`
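To confirm that the pinned Coq version is the one on your PATH, you can parse the output of `coqc --version`. This helper is a sketch for convenience, not part of Refine4LLM:

```python
import re
import subprocess

def coq_version(output=None):
    """Return the Coq version string, e.g. '8.20.0', or None if not found.

    If `output` is None, run `coqc --version` (coqc is assumed to be on
    PATH after the opam steps above); otherwise parse the given text.
    """
    if output is None:
        output = subprocess.run(
            ["coqc", "--version"], capture_output=True, text=True
        ).stdout
    m = re.search(r"version (\d+(?:\.\d+)+)", output)
    return m.group(1) if m else None
```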
---
### Install Provers
Refine4LLM integrates several provers for automatic theorem proving (ATP):
- **Vampire**: [Official Website](https://vprover.github.io)
- **CVC4**: [Official Website](http://cvc4.cs.stanford.edu)
*Requires version 1.6 or later for TPTP format support.*
- **Eprover**: [Official Website](http://www.eprover.org)
- **Z3**: [GitHub Releases](https://github.com/Z3Prover/z3/releases)
*Default Z3 versions may not support TPTP format; you need to compile the TPTP frontend.*
Install the Z3 TPTP frontend using `opam`:
`opam install z3_tptp`
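After installing the provers, you can check that each executable is visible on your PATH. The binary names below are common defaults and may differ on your system:

```python
import shutil

def check_provers(provers=("vampire", "cvc4", "eprover", "z3")):
    # Map each prover name to its resolved executable path,
    # or None if the binary is not found on PATH.
    return {name: shutil.which(name) for name in provers}

missing = [name for name, path in check_provers().items() if path is None]
if missing:
    print("Missing provers:", ", ".join(missing))
```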
---
### Install Python and Dependencies with Conda
1. Install Conda
If Conda is not installed, download the [Miniconda installer](https://docs.conda.io/en/latest/miniconda.html) suitable for your operating system.
Check if Conda is installed:
`conda --version`
---
2. Create a Conda Environment
Create a new Conda environment with Python 3.10.15:
`conda create -n refine4llm python=3.10.15`
Activate the environment:
`conda activate refine4llm`
---
3. Install Python Packages
In the activated Conda environment, install the required Python packages using pip:
`pip install openai lark sympy`
---
4. Verify Installation
Run the following command to ensure the libraries are installed correctly:
`python -c "import openai, lark, sympy; print('Libraries installed successfully!')"`
If successful, you will see the output:
`Libraries installed successfully!`
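If the one-liner fails, the following sketch reports which packages are missing without importing them, so a broken install of one package does not mask the others:

```python
import importlib.util

def missing_packages(packages=("openai", "lark", "sympy")):
    # find_spec locates a package on the import path without importing it.
    return [p for p in packages if importlib.util.find_spec(p) is None]

absent = missing_packages()
if absent:
    print("Missing packages:", ", ".join(absent))
else:
    print("Libraries installed successfully!")
```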
## LLM API
Note that the OpenAI API and LLMs are constantly evolving.
When we started this project, we used `gpt-4` (i.e., gpt-4-0613). When we conducted the experiments, we used `gpt-4-turbo` (i.e., gpt-4-0125-preview).
Users can also use the latest `GPT-4o` model with the instruct-tuning dataset stored in the `dataset` folder, using their own OpenAI API key.
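A minimal sketch of how such a query might look with the `openai` Python package (v1 interface). The model name, prompt wording, and `refine_prompt` helper are illustrative assumptions, not Refine4LLM's actual interface:

```python
def refine_prompt(spec):
    # Build the chat messages for one refinement query (illustrative).
    return [
        {"role": "system", "content": "You are a program-refinement assistant."},
        {"role": "user", "content": "Refine this specification:\n" + spec},
    ]

def query_llm(spec, model="gpt-4-turbo"):
    # Requires OPENAI_API_KEY in the environment. The import is local so
    # the prompt-building helper above works without the package installed.
    from openai import OpenAI
    client = OpenAI()
    resp = client.chat.completions.create(model=model, messages=refine_prompt(spec))
    return resp.choices[0].message.content
```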
## Pre-Built Docker Image
To simplify the setup process, we provide a pre-built Docker image with everything you need already installed.
This image was created using a dedicated Dockerfile and includes all necessary components such as benchmarks, the instruct-tuning dataset, and dependencies like OPAM, Coq, CoqHammer, Z3, CVC4, Vampire, and E Prover.
By using the Docker image, you can avoid the complexities of manual installation and get started quickly with Refine4LLM.
## Tool Usage
Because OpenAI API keys are private credentials, we do not make our keys public.
Please send an email to cai_yufan ATT u DOTT nus DOTT edu to obtain the specific API key and the Docker system for usage.
We are now training a DeepSeek-based LLM to avoid using the OpenAI API key.
## Patent and Product
## Citation
If you use Refine4LLM in your work, please cite our paper:
```bibtex
@article{10.1145/3704905,
author = {Cai, Yufan and Hou, Zhe and Sanan, David and Luan, Xiaokun and Lin, Yun and Sun, Jun and Dong, Jin Song},
title = {Automated Program Refinement: Guide and Verify Code Large Language Model with Refinement Calculus},
year = {2025},
issue_date = {January 2025},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {9},
number = {POPL},
url = {https://doi.org/10.1145/3704905},
doi = {10.1145/3704905},
journal = {Proc. ACM Program. Lang.},
month = jan,
articleno = {69},
numpages = {33},
keywords = {Large Language Model, Program Refinement, Program Synthesis}
}
```
## Credits
Refine4LLM is made thanks to the following tools and libraries:
- [Coq](https://github.com/coq/coq): A formal proof management system
- [CoqHammer](https://github.com/lukaszcz/coqhammer): An automatic theorem prover for Coq
- [SymPy](https://github.com/sympy/sympy): A Python library for symbolic mathematics
- [OpenAI](https://github.com/openai): LLM and dataset provider
- [EvalPlus](https://github.com/evalplus/evalplus): Dataset provider
- [BigCode](https://github.com/bigcode-project/bigcode-evaluation-harness): Dataset provider
## Contributing
We welcome and appreciate contributions from the community.
Please send your questions to cai_yufan ATT u DOTT nus DOTT edu for further discussion.