
AlphaVerus

AlphaVerus: Bootstrapping Formally Verified Code Generation through Self-Improving Translation and Treefinement



Overview

AlphaVerus is a novel framework that bootstraps formally verified code generation by iteratively translating programs from a higher-resource language and leveraging feedback from a verifier. The system operates in three main phases:

  1. Exploration of Candidate Translations: Generates initial code translations
  2. Treefinement: A novel tree search algorithm for program refinement using verifier feedback
  3. Filtering: Prevents reward hacking by filtering misaligned specifications and programs

The framework teaches language models to generate verified code without human intervention or model finetuning.
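The three phases above form an iterative loop, which could be sketched roughly as follows. This is a minimal illustration only: every helper here (`translate`, `verify`, `treefine`, `is_misaligned`) is a hypothetical placeholder, not the actual `controller.py` API.

```python
# Minimal sketch of the AlphaVerus loop. All helpers are
# hypothetical placeholders, not the repository's real code.

def translate(program, exemplars):
    # Placeholder: the real system samples candidate Verus
    # translations from an LLM prompted with exemplars.
    return [f"verus({program})"]

def verify(candidate):
    # Placeholder: the real system invokes the Verus verifier
    # and returns (success, error messages).
    return True, []

def treefine(candidate, errors):
    # Placeholder: the real system runs a tree search over
    # LLM-proposed repairs guided by verifier feedback.
    return candidate, True

def is_misaligned(program, candidate):
    # Placeholder: the real system filters specifications that
    # verify for the wrong reasons (reward hacking).
    return False

def alphaverus_iteration(source_programs, exemplars):
    """One self-improvement iteration: explore, refine, filter."""
    verified = []
    for program in source_programs:
        for cand in translate(program, exemplars):       # Phase 1
            ok, errors = verify(cand)
            if not ok:
                cand, ok = treefine(cand, errors)        # Phase 2
            if ok and not is_misaligned(program, cand):  # Phase 3
                verified.append(cand)
    return exemplars + verified  # verified code seeds the next round
```

Verified translations from one iteration become exemplars for the next, which is what allows the loop to bootstrap without finetuning.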

Quick Start

Setup

  1. Clone the repository:

     git clone https://github.com/cmu-l3/alphaverus.git
     cd alphaverus

  2. Install dependencies:

     pip install -r requirements.txt

  3. Set up environment variables: create a .env file in both the training/ and inference/ directories with:

     OPENAI_API_KEY=your_api_key_here
     VERUS_PATH=path_to_verus_binary
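A script reading these settings might look like the following sketch (plain `os.environ` access; the repository's actual loading code, e.g. via `python-dotenv`, may differ):

```python
import os

def load_settings(env=os.environ):
    # Sketch of reading the two required settings; the repo's
    # actual loading code may differ.
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY missing; add it to .env")
    # VERUS_PATH should point at the verus binary; fall back to
    # a PATH lookup if it is unset.
    return {"api_key": key, "verus_path": env.get("VERUS_PATH", "verus")}
```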

Usage

Hosting the LLM API

Make sure your model is hosted on localhost at port 30000. We recommend setting up an OpenAI-compatible server with SGLang to host your model.

You can also use existing hosted models by setting the URL path in the run_llm_api.py files in the training and inference folders. No other changes are required.
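The local server speaks the standard OpenAI chat-completions protocol. As a hedged sketch of the request shape such an endpoint expects (URL and JSON payload only; the actual request construction in run_llm_api.py may differ):

```python
import json

# SGLang's OpenAI-compatible server exposes a /v1 prefix by default.
LOCAL_BASE_URL = "http://localhost:30000/v1"

def chat_request(prompt, model="default", temperature=0.7):
    # Sketch of the JSON body an OpenAI-compatible
    # /chat/completions endpoint expects.
    return {
        "url": f"{LOCAL_BASE_URL}/chat/completions",
        "body": json.dumps({
            "model": model,
            "temperature": temperature,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }
```

To point at a different hosted model, only the base URL (and model name) would need to change.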

Training Phase

The training phase bootstraps the model with verified code translations:

  1. [Optional] Configure training parameters in training/config.yaml

  2. Run the training controller:

cd training
python controller.py

You can run multiple instances of controller.py in parallel to speed up the training process.

This will:

  • Manage training iterations and experiment state
  • Generate and verify code translations
  • Apply Treefinement for error correction
  • Filter misaligned specifications, preventing reward hacking attempts
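Running multiple controller instances in parallel requires them to avoid duplicating work. One common way to do that is an atomic lock-file scheme, sketched below; this is an illustration only, and the repository's check_lock_files.py may work differently.

```python
import os

def try_claim(task_id, lock_dir="locks"):
    # Sketch of a lock-file scheme letting parallel controller.py
    # instances split work without a central coordinator.
    os.makedirs(lock_dir, exist_ok=True)
    path = os.path.join(lock_dir, f"{task_id}.lock")
    try:
        # O_CREAT | O_EXCL makes creation atomic: exactly one
        # process succeeds in claiming each task.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return True
    except FileExistsError:
        return False  # another instance already owns this task
```

A controller instance would call try_claim before processing a program and skip tasks that are already claimed; stale locks left by crashed runs would then need cleanup, which is the failure-recovery role the project structure attributes to check_lock_files.py.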

Inference Phase

Generating new formally verified code involves two steps:

  1. Exploration: handled by solve.py

  2. Treefinement: handled by rebase.py

Run the exploration step:

cd inference
python solve.py [arguments]

Available arguments for solve.py:

  • --SAVE_DIR: Directory to save generated programs (default: 'solved_programs_hev')
  • --PROMPT_RANDOM_EXAMPLES: Path to examples file for prompting (default: '../training/iter5.json')
  • --BASE_PATH: Base path for saved exemplars (default: '.../training/')
  • --PROGRAMS_FILE: Input programs file (default: '../datasets/hev_4nov.jsonl')
  • --model: Model to use (default: 'default')
  • --temperature: Sampling temperature (default: 0.7)
  • --batch_size: Batch size for processing (default: 32)
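For reference, the documented flags correspond to an argparse setup along these lines (a sketch built from the defaults listed above; the actual parser in solve.py may differ, and --BASE_PATH is omitted here because its default is elided):

```python
import argparse

def build_solve_parser():
    # Sketch of solve.py's documented flags, with the defaults
    # listed in this README; the real parser may differ.
    p = argparse.ArgumentParser(description="Exploration step")
    p.add_argument("--SAVE_DIR", default="solved_programs_hev")
    p.add_argument("--PROMPT_RANDOM_EXAMPLES", default="../training/iter5.json")
    p.add_argument("--PROGRAMS_FILE", default="../datasets/hev_4nov.jsonl")
    p.add_argument("--model", default="default")
    p.add_argument("--temperature", type=float, default=0.7)
    p.add_argument("--batch_size", type=int, default=32)
    return p
```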

Once the exploration step is complete, you can run the treefinement step to refine the incorrect solutions:

python rebase.py [mode] [file_name] [iteration_number]

Arguments for rebase.py:

  • mode: Integer selecting the Treefinement variant; the default, 0, specifies REBASE (default: 0)
  • file_name: Path to the input file to apply Treefinement to, i.e. the file generated by solve.py (required)
  • iteration_number: Iteration number from the training phase (required)

Example:

python rebase.py 0 "solved_programs_hev/program_0.rs" 5
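Conceptually, the REBASE mode performs a best-first tree search over verifier-guided repairs. The following is a minimal sketch of that idea with hypothetical callables (`verify`, `propose_fixes`, `score`), not the actual rebase.py implementation:

```python
import heapq

def treefinement(root_program, verify, propose_fixes, score,
                 max_expansions=16, width=4):
    # Sketch of a REBASE-style best-first search: keep a frontier
    # of candidate programs, repeatedly expand the highest-scoring
    # one with proposed fixes, and stop once a candidate verifies.
    # verify/propose_fixes/score are hypothetical callables.
    frontier = [(-score(root_program), 0, root_program)]
    counter = 1  # tie-breaker so the heap never compares programs
    for _ in range(max_expansions):
        if not frontier:
            break
        _, _, program = heapq.heappop(frontier)
        ok, errors = verify(program)
        if ok:
            return program  # verified refinement found
        # Expand: keep up to `width` repairs derived from the
        # verifier's error feedback.
        for child in propose_fixes(program, errors)[:width]:
            heapq.heappush(frontier, (-score(child), counter, child))
            counter += 1
    return None  # no verified program within the budget
```

The score function is what distinguishes search variants: prioritizing candidates the verifier is closer to accepting focuses the expansion budget on the most promising repairs.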

Project Structure

.
├── training/                  # Training phase implementation
│   ├── controller.py          # Main orchestration of training pipeline, manages parallelization, iterations and experiment state
│   ├── convert_single.py      # Handles individual program translations
│   ├── rebase_error_fix.py    # Implements Treefinement algorithm for error correction
│   ├── verify_translations.py # Verification and filtering of generated translations
│   ├── run_llm_api.py         # Interface for LLM API interactions
│   ├── verus_utils.py         # Utilities for Verus verification
│   ├── verus_error_utils.py   # Error parsing and handling for Verus
│   ├── exploit_model.py       # Code for Exploit Model exploitation detection
│   ├── exploit_prompt.py      # Prompt template for Exploit Model
│   ├── utils.py               # General utility functions
│   ├── check_lock_files.py    # Manages the experiment locking mechanism in case of failures
│   └── config.yaml            # Training configuration
│
├── inference/               # Inference phase implementation
│   ├── solve.py             # Main inference pipeline for generating verified code
│   ├── rebase.py            # Tree search implementation for refining generated code
│   ├── run_llm_api.py       # LLM API interface for inference
│   ├── verus_utils.py       # Verus verification utilities
│   ├── verus_error_utils.py # Error handling for Verus verification
│   └── extras/              # Additional utilities
│       ├── exploit_autoverus.py
│       ├── openai_contam_test.py
│       └── critic_model.py
├── datasets/
│   ├── hev.jsonl
│   └── mbpp.jsonl
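The dataset files under datasets/ are JSON Lines, one record per line. A sketch of loading them (the exact field names inside each record are not documented here):

```python
import json

def load_programs(path):
    # Sketch: each line of a .jsonl dataset file is one JSON
    # record; skip blank lines defensively.
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]
```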

Citation

If you use AlphaVerus in your research, please cite:

@misc{aggarwal2024alphaverusbootstrappingformallyverified,
      title={AlphaVerus: Bootstrapping Formally Verified Code Generation through Self-Improving Translation and Treefinement}, 
      author={Pranjal Aggarwal and Bryan Parno and Sean Welleck},
      year={2024},
      eprint={2412.06176},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2412.06176}, 
}
