AlphaVerus: Bootstrapping Formally Verified Code Generation through Self-Improving Translation and Treefinement
AlphaVerus is a novel framework that bootstraps formally verified code generation by iteratively translating programs from a higher-resource language and leveraging feedback from a verifier. The system operates in three main phases:
- Exploration of Candidate Translations: Generates initial code translations
- Treefinement: A novel tree search algorithm for program refinement using verifier feedback
- Filtering: Prevents reward hacking by filtering misaligned specifications and programs
The framework teaches language models to generate verified code without human intervention or model finetuning.
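At a glance, the workflow maps onto the commands described in the rest of this README (a minimal sketch; arguments and file names are covered in detail below):

```
# Training: bootstrap the model via self-improving translation
cd training && python controller.py

# Inference step 1 -- Exploration: generate candidate programs
cd ../inference && python solve.py

# Inference step 2 -- Treefinement: refine failed candidates using verifier feedback
python rebase.py 0 <file_from_solve.py> <iteration_number>
```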
- Clone the repository:
```
git clone https://github.com/cmu-l3/alphaverus.git
cd alphaverus
```
- Install dependencies:
```
pip install -r requirements.txt
```
- Set up environment variables:
Create a `.env` file in both the `training/` and `inference/` directories with:
```
OPENAI_API_KEY=your_api_key_here
VERUS_PATH=path_to_verus_binary
```
Make sure your model is hosted on localhost at port 30000. We recommend hosting your model with SGLang's OpenAI-compatible server.
You can also use existing hosted models by setting the URL in the `run_llm_api.py` files in the `training/` and `inference/` folders. No other changes are required.
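For example, with SGLang installed, a model can be hosted on port 30000 as follows (the model path is a placeholder; substitute your own):

```
python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct --port 30000
```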
The training phase bootstraps the model with verified code translations:
- [Optional] Configure training parameters in `training/config.yaml`
- Run the training controller:
```
cd training
python controller.py
```
You can run multiple instances of `controller.py` in parallel to speed up the training process (see the example after the list below).
This will:
- Manage training iterations and experiment state
- Generate and verify code translations
- Apply Treefinement for error correction
- Filter misaligned specifications, preventing reward hacking attempts
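As a simple illustration (not a repository script), several controller instances can be launched in parallel from a shell; pick the instance count to match your hardware and API rate limits:

```
cd training
# Launch 4 controller instances in the background, then wait for all of them
for i in 1 2 3 4; do
  python controller.py &
done
wait
```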
Generating new formally verified code at inference time involves two steps:
- Exploration: handled by `solve.py`
- Treefinement: handled by `rebase.py`
Run the exploration step:
```
cd inference
python solve.py [arguments]
```
Available arguments for `solve.py`:
- `--SAVE_DIR`: Directory to save generated programs (default: `solved_programs_hev`)
- `--PROMPT_RANDOM_EXAMPLES`: Path to examples file for prompting (default: `../training/iter5.json`)
- `--BASE_PATH`: Base path for saved exemplars (default: `.../training/`)
- `--PROGRAMS_FILE`: Input programs file (default: `../datasets/hev_4nov.jsonl`)
- `--model`: Model to use (default: `default`)
- `--temperature`: Sampling temperature (default: 0.7)
- `--batch_size`: Batch size for processing (default: 32)
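For example, an explicit invocation that spells out the common options (all flags as documented above; the values shown are just the defaults):

```
python solve.py --SAVE_DIR solved_programs_hev --PROGRAMS_FILE ../datasets/hev_4nov.jsonl --model default --temperature 0.7 --batch_size 32
```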
Once the exploration step is complete, you can run the Treefinement step to refine the incorrect solutions:
```
python rebase.py [mode] [file_name] [iteration_number]
```
Arguments for `rebase.py`:
- `mode`: Integer selecting the Treefinement variant; 0 specifies REBASE (default: 0)
- `file_name`: Path to the input file to apply Treefinement to, i.e., the file generated by `solve.py` (required)
- `iteration_number`: Iteration number for the training phase (required)
Example:
```
python rebase.py 0 "solved_programs_hev/program_0.rs" 5
```
```
.
├── training/                  # Training phase implementation
│   ├── controller.py          # Main orchestration of the training pipeline; manages parallelization, iterations, and experiment state
│   ├── convert_single.py      # Handles individual program translations
│   ├── rebase_error_fix.py    # Implements the Treefinement algorithm for error correction
│   ├── verify_translations.py # Verification and filtering of generated translations
│   ├── run_llm_api.py         # Interface for LLM API interactions
│   ├── verus_utils.py         # Utilities for Verus verification
│   ├── verus_error_utils.py   # Error parsing and handling for Verus
│   ├── exploit_model.py       # Exploit Model for detecting specification exploits (reward hacking)
│   ├── exploit_prompt.py      # Prompt template for the Exploit Model
│   ├── utils.py               # General utility functions
│   ├── check_lock_files.py    # Manages the experiment locking mechanism in case of failures
│   └── config.yaml            # Training configuration
│
├── inference/                 # Inference phase implementation
│   ├── solve.py               # Main inference pipeline for generating verified code
│   ├── rebase.py              # Tree search implementation for refining generated code
│   ├── run_llm_api.py         # LLM API interface for inference
│   ├── verus_utils.py         # Verus verification utilities
│   ├── verus_error_utils.py   # Error handling for Verus verification
│   └── extras/                # Additional utilities
│       ├── exploit_autoverus.py
│       ├── openai_contam_test.py
│       └── critic_model.py
├── datasets/
│   ├── hev.jsonl
│   └── mbpp.jsonl
```
If you use AlphaVerus in your research, please cite:
```
@misc{aggarwal2024alphaverusbootstrappingformallyverified,
      title={AlphaVerus: Bootstrapping Formally Verified Code Generation through Self-Improving Translation and Treefinement},
      author={Pranjal Aggarwal and Bryan Parno and Sean Welleck},
      year={2024},
      eprint={2412.06176},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2412.06176},
}
```