how to enable llama3-8b int4 awq models #90

Open
FlexLaughing opened this issue Aug 29, 2024 · 0 comments

Comments

@FlexLaughing
Hi,
I have an AutoAWQ-quantized model (--wbits=4 --groupsize=128) and am running the perplexity evaluation on a GPU with:
--model /home/ubuntu/qllm_v0.2.0_Llama3-8B-Chinese-Chat_q4 --epochs 0 --eval_ppl --wbits 4 --abits 16 --lwc --net llama-7b
I hit an error while the model is being parsed in https://github.com/OpenGVLab/OmniQuant/blob/main/quantize/int_linear.py#L26
It seems the QuantLinear definition does not support the packed qweight tensors produced by AutoAWQ. Please check the arguments. Thanks!
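
For reference, a minimal sketch of the mismatch, assuming the usual AutoAWQ export layout (packed qweight / qzeros / scales per linear layer) and a single pytorch_model.bin file inside the model directory; the file name and key prefix below are assumptions, not taken from the repo:

```python
import torch

# A minimal sketch, assuming the quantized checkpoint is a single
# pytorch_model.bin (adjust the file name or use safetensors loading
# if the export differs).
ckpt = "/home/ubuntu/qllm_v0.2.0_Llama3-8B-Chinese-Chat_q4/pytorch_model.bin"
state_dict = torch.load(ckpt, map_location="cpu")

# Inspect one projection layer: an AutoAWQ export typically carries packed
# int tensors (qweight / qzeros / scales) instead of a float "weight" tensor.
for name, t in state_dict.items():
    if "layers.0.self_attn.q_proj" in name:
        print(name, tuple(t.shape), t.dtype)

# OmniQuant's QuantLinear wraps an ordinary float nn.Linear and quantizes its
# .weight on the fly, so there is no float weight here for it to consume.
# The packed qweight/qzeros/scales would need to be dequantized back to fp16
# (for example with AutoAWQ's own unpacking utilities) before the --eval_ppl
# path can run on this checkpoint.
```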
