Issues: vllm-project/vllm
#11781 [Bug]: base64 string leads to gibberish with latest vLLM server and pixtral-12b (bug; opened Jan 6, 2025 by michael-brunzel)
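For context, the usual way to send a base64-encoded image to a vLLM OpenAI-compatible server is as a data URI in the chat completions payload. A minimal sketch, assuming a server started with `vllm serve mistralai/Pixtral-12B-2409`; the image path is a hypothetical placeholder, not taken from the report.

```python
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Hypothetical local image; encode it as base64 for the data URI.
with open("example.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode("utf-8")

resp = client.chat.completions.create(
    model="mistralai/Pixtral-12B-2409",  # model assumed from the issue title
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```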
#11778 [Bug]: prompt logprobs are different with batch_size > 1 compared to batch_size=1 (bug; opened Jan 6, 2025 by rizar)
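A report like this is typically reproduced offline by requesting prompt logprobs for the same prompts once individually and once batched. A minimal sketch using vLLM's offline `LLM` API; the model and prompts are illustrative assumptions:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # illustrative small model
params = SamplingParams(max_tokens=1, prompt_logprobs=1)

prompts = ["The capital of France is", "The capital of Spain is"]

# batch_size=1: one generate() call per prompt.
solo = [llm.generate([p], params)[0].prompt_logprobs for p in prompts]

# batch_size>1: all prompts in a single generate() call.
batched = [out.prompt_logprobs for out in llm.generate(prompts, params)]

# The reported bug is that these do not match token-for-token.
print(solo[0] == batched[0])
```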
#11774 [Usage]: Running OpenAI Swarm with vLLM-hosted models (usage; opened Jan 6, 2025 by ArturDev42)
#11765 [Misc]: Fine-tuned Llama 3.2 Vision Instruct model fails during vLLM weight_loader (misc; opened Jan 6, 2025 by nkumar15)
#11763 [Bug]: Cutlass 2:4 Sparsity + FP8/Int8 Quant RuntimeError: Error Internal (bug; opened Jan 6, 2025 by leoyuppieqnew)
#11761 [Bug]: After successfully loading the LoRA module with load_lora_adapter, the result returned by v1/models does not include this LoRA module (bug; opened Jan 6, 2025 by Excuses123)
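For reference, this is the dynamic-loading flow the report exercises: vLLM exposes a runtime LoRA endpoint when the server runs with `--enable-lora` and `VLLM_ALLOW_RUNTIME_LORA_UPDATING=True`. A minimal sketch; the adapter name and path are hypothetical placeholders:

```python
import requests

base = "http://localhost:8000"

# Load an adapter at runtime (hypothetical name and path).
r = requests.post(f"{base}/v1/load_lora_adapter", json={
    "lora_name": "my_adapter",
    "lora_path": "/path/to/my_adapter",
})
r.raise_for_status()

# Per the report, the freshly loaded adapter is missing from this listing.
models = requests.get(f"{base}/v1/models").json()
print([m["id"] for m in models["data"]])
```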
#11758 [Doc]: Why NGramWorker does not support cache operations (documentation; opened Jan 6, 2025 by kuangdao)
#11748 [Bug]: CPU Offload fails when enable_lora=True (bug; opened Jan 5, 2025 by Neko-nos)
#11747 [Misc]: Why are there two multi_gpu_barrier calls in cross_device_reduce_1stage? (misc; opened Jan 5, 2025 by leizhao1234)
#11745 [Performance]: Context length problem with vLLM (performance; opened Jan 5, 2025 by MotorBottle)
#11734 [Installation]: XPU dependencies not built against most recent oneAPI (installation; opened Jan 4, 2025 by janimo)
#11731 [Usage]: Serving 'LLaVA-Next-Video-7B-Qwen2' (usage; opened Jan 4, 2025 by Noctis-SC)
#11729 [Feature]: Does vLLM plan to support hosting multiple LLM base models inside one server? (feature request; opened Jan 4, 2025 by ynwang007)
#11728 [Feature]: Publish an Arm image for GH200 (feature request; opened Jan 3, 2025 by samos123)
#11726 [Bug]: PixtralHF inference broken since #11396 (bug; opened Jan 3, 2025 by mgoin)
#11725 [New Model]: unsloth/Llama-3.3-70B-Instruct-bnb-4bit (new model; opened Jan 3, 2025 by Hyfred)
#11720 [Feature]: membind all NUMA nodes for all CPUs in list (feature request; opened Jan 3, 2025 by hpcpony)
#11715 [Bug]: ValueError: Model architectures ['LlamaForCausalLM'] failed to be inspected (bug; opened Jan 3, 2025 by npanpaliya)
#11705 [Bug]: Extremely slow inference when deploying DeepSeek-V3 with vLLM on 16 H100 GPUs per its instructions (bug; opened Jan 3, 2025 by yonghenglh6)
#11703 [Bug]: 0.6.6.post1 crash in marlin_utils.py (bug; opened Jan 3, 2025 by Flynn-Zh)
#11702 [Bug]: vLLM LoRA crash when using dynamic loading (bug; opened Jan 3, 2025 by haitwang-cloud)
#11700 [Feature]: The tool_choice option "required" is not yet supported but is on the roadmap (feature request; opened Jan 3, 2025 by yumc2573)
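For context, `tool_choice="required"` in the OpenAI API forces the model to call at least one tool rather than reply in plain text. A minimal sketch of the call vLLM is being asked to support; the tool schema and model name are illustrative assumptions:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # illustrative model
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="required",  # the option the issue requests
)
print(resp.choices[0].message.tool_calls)
```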