Popular repositories
-
Awesome-Efficient-LLM (Public, forked from horseee/Awesome-Efficient-LLM)
A curated list for Efficient Large Language Models
Python
-
auto-round (Public, forked from intel/auto-round)
SOTA Weight-only Quantization Algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
Python
-
EfficientDM (Public, forked from ThisisBillhe/EfficientDM)
[ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models"
Jupyter Notebook
-
Quest (Public, forked from mit-han-lab/Quest)
[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
Cuda
-
llmc (Public, forked from ModelTC/llmc)
This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
Python