
[Quantization] Support loading AWQ, GPTQ, GGUF/GGML quantized models #85

Open
xwu99 opened this issue Jan 26, 2024 · 0 comments
@xwu99 (Contributor)

xwu99 commented Jan 26, 2024

No description provided.

@xwu99 xwu99 changed the title [Quantization] Support loading AWQ and GPTQ quantized models [Quantization] Support loading AWQ, GPTQ, GGUF/GGML quantized models Jan 26, 2024
zhangjian94cn pushed a commit to zhangjian94cn/llm-on-ray that referenced this issue Feb 4, 2024

* enhance streaming output

* update UI related code

---------

Signed-off-by: jiafuzha <[email protected]>
Co-authored-by: KepingYan <[email protected]>
Labels: none yet
Projects: none yet
Development: no branches or pull requests
Participants: 1