Stars
LLM
7 repositories
Awesome-LLM: a curated list of Large Language Models
A curated list of Awesome LLM/VLM Inference Papers with code: WINT8/4, Flash-Attention, Paged-Attention, Parallelism, etc.
A high-throughput and memory-efficient inference and serving engine for LLMs
Fast and memory-efficient exact attention
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality
SGLang is a fast serving framework for large language models and vision language models.