This repo contains simple Python code (built on HuggingFace) for instruction tuning common LLMs with LoRA/QLoRA. It includes the training code, as well as several scripts for evaluating model generations.
Setup • Details • Usage • Future Work
Install the necessary dependencies as follows:
> conda create -n lora_tuning python=3.11 anaconda
> conda activate lora_tuning
> pip install -r requirements.txt
The repo supports instruction tuning with LoRA and QLoRA, built on top of HuggingFace's PEFT library and bitsandbytes.
Currently, the example scripts instruction tune the Mistral-7B model, though other models can be specified via the --model_name_or_path argument.
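For a sense of what this looks like in code, here is a minimal sketch of preparing a model for QLoRA-style tuning with PEFT and bitsandbytes. The quantization settings and LoRA hyperparameters below are illustrative, not necessarily the repo's defaults; see setup.py and args.py for the actual configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization for QLoRA (illustrative settings)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "mistralai/Mistral-7B-v0.1"  # overridable via --model_name_or_path
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)

# Wrap the quantized base model with trainable low-rank adapters
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```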
A breakdown of the main files within the repository is as follows...
| File | Description |
| --- | --- |
| `train.py` | Main training code |
| `generate.py` | Script for examining model output |
| `setup.py` | Functions for downloading and configuring models/tokenizers |
| `data.py` | Code for configuring datasets |
| `./scripts` | Scripts for training/evaluation (`train.sh`: run instruction tuning on 2x3090 GPUs; `generate.sh`: examine model outputs) |
| `./data` | Supplemental data files (`vicuna_questions.json`: evaluation questions from Vicuna) |
The training process supports either the Alpaca or Assistant Chatbot dataset.
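As a rough illustration, Alpaca-style records are typically flattened into a single prompt/response string before tokenization, along the lines of the sketch below. The template here is the standard Alpaca one and may differ from the repo's actual formatting in data.py.

```python
def format_alpaca_example(example):
    """Flatten one Alpaca-style record into a single training string.

    Illustrative template only; the repo's actual formatting lives in data.py.
    """
    if example.get("input"):
        prompt = (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n### Response:\n"
        )
    else:
        prompt = (
            "Below is an instruction that describes a task. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n### Response:\n"
        )
    return prompt + example["output"]
```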
Evaluation is performed using the set of questions proposed for evaluating Vicuna (see here).
However, model outputs can be inspected on arbitrary datasets via the generate.py script.
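In spirit, generating outputs from a tuned adapter looks roughly like the following. The base model name, adapter path, prompt, and generation settings are placeholders; the real logic is in generate.py.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_name = "mistralai/Mistral-7B-v0.1"  # placeholder base model
adapter_dir = "./output/lora_adapter"    # placeholder path to trained LoRA weights

tokenizer = AutoTokenizer.from_pretrained(base_name)
base_model = AutoModelForCausalLM.from_pretrained(
    base_name, torch_dtype=torch.bfloat16, device_map="auto"
)

# Attach the trained LoRA adapter on top of the frozen base model
model = PeftModel.from_pretrained(base_model, adapter_dir)
model.eval()

prompt = "### Instruction:\nExplain LoRA in one sentence.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```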
The training process logs all metrics to wandb (assuming --report_to wandb
is specified in the arguments) and, at the end of training, generates model outputs for the Vicuna evaluation set, which are also logged to wandb.
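Logging the Vicuna-set generations amounts to something like a wandb table at the end of training. The sketch below uses placeholder data and a hypothetical project name; see train.py for how the repo actually logs generations.

```python
import wandb

# Placeholder data; in the repo the questions come from ./data/vicuna_questions.json
# and the answers are produced by the tuned model at the end of training.
questions = ["How do neural networks learn?"]
answers = ["(model output here)"]

wandb.init(project="lora_tuning")  # hypothetical project name
table = wandb.Table(columns=["question", "model_output"])
for question, answer in zip(questions, answers):
    table.add_data(question, answer)
wandb.log({"vicuna_eval": table})
```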
Example scripts are located in the ./scripts
folder and can be run as follows:
> bash ./scripts/train.sh
> bash ./scripts/generate.sh
These scripts can also be customized by tweaking their arguments. See args.py for a full list of arguments for the model, training, data, and generation.
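The arguments follow the usual HuggingFace dataclass pattern. A trimmed, hypothetical example of what a model argument group might look like is shown below; the field names are illustrative, and the real argument set is defined in args.py.

```python
from dataclasses import dataclass, field
from transformers import HfArgumentParser, TrainingArguments

@dataclass
class ModelArguments:
    # Illustrative fields only; see args.py for the repo's actual arguments
    model_name_or_path: str = field(default="mistralai/Mistral-7B-v0.1")
    lora_r: int = field(default=16, metadata={"help": "LoRA rank"})

parser = HfArgumentParser((ModelArguments, TrainingArguments))
model_args, training_args = parser.parse_args_into_dataclasses()
```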
This repository is intentionally minimal for now. Future efforts will likely include:
- Expansion to more datasets (for training and evaluation)
- Implementing an LLM-as-a-judge style evaluation pipeline
- Adding evaluation on MMLU
- Trying out LoRA+ with different learning rates for the A and B matrices (a rough sketch follows below)
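On the last item: LoRA+ boils down to giving the adapter's B matrices a larger learning rate than the A matrices. One way this could be wired up with standard PyTorch parameter groups is sketched below; the ratio and function name are illustrative, and this is not yet implemented in the repo.

```python
import torch

def build_lora_plus_optimizer(model, lr=2e-4, b_lr_ratio=16.0):
    """Group LoRA A and B parameters so B gets a larger learning rate (LoRA+ style)."""
    a_params, b_params = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        if "lora_A" in name:
            a_params.append(param)
        elif "lora_B" in name:
            b_params.append(param)
    return torch.optim.AdamW([
        {"params": a_params, "lr": lr},
        {"params": b_params, "lr": lr * b_lr_ratio},
    ])
```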