We provide the full scripts to run RLQuant. We use LLaMA-7B as an example here:
- Obtain the channel-wise scales and shifts required for initialization:
```shell
python generate_act_scale_shift.py --model /PATH/TO/llama/llama-7b
```
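For reference, the statistics gathered here are per-channel activation scales and shifts collected over a small calibration set. The snippet below is a minimal conceptual sketch of that collection step, not the actual `generate_act_scale_shift.py`; the function name `collect_act_stats` and the calibration-batch iterable are hypothetical.

```python
# Conceptual sketch only: gather per-channel activation statistics with
# forward hooks over calibration data. Not the repository's implementation.
import torch
import torch.nn as nn

@torch.no_grad()
def collect_act_stats(model, calib_batches, device="cuda"):
    scales, shifts = {}, {}

    def make_hook(name):
        def hook(module, inputs, output):
            x = inputs[0].detach().float()                  # (..., hidden_dim)
            x = x.reshape(-1, x.shape[-1])
            cmax = x.max(dim=0).values
            cmin = x.min(dim=0).values
            scale = torch.maximum(cmax.abs(), cmin.abs())   # per-channel abs-max
            shift = (cmax + cmin) / 2                       # per-channel midpoint
            if name in scales:
                scales[name] = torch.maximum(scales[name], scale)
                shifts[name] = (shifts[name] + shift) / 2
            else:
                scales[name], shifts[name] = scale, shift
        return hook

    handles = [m.register_forward_hook(make_hook(n))
               for n, m in model.named_modules() if isinstance(m, nn.Linear)]
    for batch in calib_batches:                             # e.g. token-id tensors
        model(batch.to(device))
    for h in handles:
        h.remove()
    return scales, shifts
```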
- Weight-activation quantization
```shell
# W4A4 ppl
CUDA_VISIBLE_DEVICES=0 python main.py \
--model /PATH/TO/llama/llama-7b \
--epochs 20 --output_dir ./log/llama-7b-w4a4 \
--eval_ppl --wbits 4 --abits 4 --lwc --let

# W4A4 zero-shot
CUDA_VISIBLE_DEVICES=0 python main.py \
--model /PATH/TO/llama/llama-7b \
--epochs 20 --output_dir ./log/llama-7b-w4a4 \
--wbits 4 --abits 4 --lwc --let \
--tasks piqa,arc_easy,arc_challenge,boolq,hellaswag,winogrande
```
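As a rough illustration of what W4A4 weight quantization with learnable weight clipping (`--lwc`) does conceptually, here is a minimal fake-quantization sketch. It is not the repository's implementation; `fake_quant_weight` is a hypothetical helper, and in training the clipping factor would be an `nn.Parameter` optimized over the epochs above.

```python
# Simplified illustration of learnable-weight-clipping style 4-bit fake
# quantization; a sketch of the idea, not the repository's code.
import torch

def fake_quant_weight(w: torch.Tensor, n_bits: int = 4, clip_factor=None):
    """Per-output-channel asymmetric fake quantization; a learnable clipping
    factor (passed through sigmoid) shrinks the per-channel min/max range."""
    if clip_factor is None:
        clip_factor = torch.ones(w.shape[0], 1)             # would be learnable
    wmax = w.amax(dim=1, keepdim=True) * torch.sigmoid(clip_factor)
    wmin = w.amin(dim=1, keepdim=True) * torch.sigmoid(clip_factor)
    qmax = 2 ** n_bits - 1
    scale = (wmax - wmin).clamp(min=1e-5) / qmax
    zero_point = (-wmin / scale).round()
    w_q = ((w / scale).round() + zero_point).clamp(0, qmax)
    return (w_q - zero_point) * scale                        # dequantized weight

# Example: quantize a random weight matrix to 4 bits and check the error
w = torch.randn(128, 512)
w_dq = fake_quant_weight(w, n_bits=4)
print((w - w_dq).abs().mean())
```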
- Weight-activation quantization with `--lr_plus`
```shell
# W4A4 ppl
CUDA_VISIBLE_DEVICES=0 python main.py \
--model /PATH/TO/llama/llama-7b \
--epochs 20 --output_dir ./log/llama-7b-w4a4 \
--eval_ppl --wbits 4 --abits 4 --lwc --let --lr_plus

# W4A4 zero-shot
CUDA_VISIBLE_DEVICES=0 python main.py \
--model /PATH/TO/llama/llama-7b \
--epochs 20 --output_dir ./log/llama-7b-w4a4 \
--wbits 4 --abits 4 --lwc --let --lr_plus \
--tasks piqa,arc_easy,arc_challenge,boolq,hellaswag,winogrande
```
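If you want to launch both W4A4 perplexity runs (with and without `--lr_plus`) back to back, a small driver like the one below works; it only reuses the flags from the commands above, and the model path is a placeholder.

```python
# Convenience sketch: run both W4A4 configurations sequentially via subprocess.
# Flags mirror the shell commands above; /PATH/TO/llama/llama-7b is a placeholder.
import os
import subprocess

MODEL = "/PATH/TO/llama/llama-7b"
BASE = ["python", "main.py", "--model", MODEL,
        "--epochs", "20", "--eval_ppl",
        "--wbits", "4", "--abits", "4", "--lwc", "--let"]

env = {**os.environ, "CUDA_VISIBLE_DEVICES": "0"}
for tag, extra in [("w4a4", []), ("w4a4-lr_plus", ["--lr_plus"])]:
    cmd = BASE + ["--output_dir", f"./log/llama-7b-{tag}"] + extra
    subprocess.run(cmd, check=True, env=env)
```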
Related projects:
- SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
- OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models