Issues: vllm-project/vllm
#11781 [Bug]: base64 string leads to gibberish with latest vLLM server and pixtral-12b (bug; opened Jan 6, 2025 by michael-brunzel)
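For context, the usual way to send a base64-encoded image to a vLLM OpenAI-compatible server is as a data URI in the chat completions payload. A minimal sketch, assuming a server started with `vllm serve mistralai/Pixtral-12B-2409`; the image path is a hypothetical placeholder, not taken from the report.

```python
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Hypothetical local image; encode it as base64 for the data URI.
with open("example.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode("utf-8")

resp = client.chat.completions.create(
    model="mistralai/Pixtral-12B-2409",  # model assumed from the issue title
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```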
#11778 [Bug]: prompt logprobs are different with batch_size > 1 compared to batch_size=1 (bug; opened Jan 6, 2025 by rizar)
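A report like this is typically reproduced offline by requesting prompt logprobs for the same prompts once individually and once batched. A minimal sketch using vLLM's offline `LLM` API; the model and prompts are illustrative assumptions:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # illustrative small model
params = SamplingParams(max_tokens=1, prompt_logprobs=1)

prompts = ["The capital of France is", "The capital of Spain is"]

# batch_size=1: one generate() call per prompt.
solo = [llm.generate([p], params)[0].prompt_logprobs for p in prompts]

# batch_size>1: all prompts in a single generate() call.
batched = [out.prompt_logprobs for out in llm.generate(prompts, params)]

# The reported bug is that these do not match token-for-token.
print(solo[0] == batched[0])
```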
#11774 [Usage]: Running OpenAI Swarm with vLLM-hosted models (usage; opened Jan 6, 2025 by ArturDev42)
#11765 [Misc]: Fine-tuned Llama 3.2 Vision Instruct model fails during vLLM weight_loader (misc; opened Jan 6, 2025 by nkumar15)
#11763 [Bug]: Cutlass 2:4 Sparsity + FP8/Int8 Quant RuntimeError: Error Internal (bug; opened Jan 6, 2025 by leoyuppieqnew)
#11761 [Bug]: After successfully loading the LoRA module with load_lora_adapter, the result returned by v1/models does not include this LoRA module (bug; opened Jan 6, 2025 by Excuses123)
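For reference, this is the dynamic-loading flow the report exercises: vLLM exposes a runtime LoRA endpoint when the server runs with `--enable-lora` and `VLLM_ALLOW_RUNTIME_LORA_UPDATING=True`. A minimal sketch; the adapter name and path are hypothetical placeholders:

```python
import requests

base = "http://localhost:8000"

# Load an adapter at runtime (hypothetical name and path).
r = requests.post(f"{base}/v1/load_lora_adapter", json={
    "lora_name": "my_adapter",
    "lora_path": "/path/to/my_adapter",
})
r.raise_for_status()

# Per the report, the freshly loaded adapter is missing from this listing.
models = requests.get(f"{base}/v1/models").json()
print([m["id"] for m in models["data"]])
```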
#11758 [Doc]: Why NGramWorker does not support cache operations (documentation; opened Jan 6, 2025 by kuangdao)
#11748 [Bug]: CPU Offload fails when enable_lora=True (bug; opened Jan 5, 2025 by Neko-nos)
#11747 [Misc]: Why are there two multi_gpu_barrier calls in cross_device_reduce_1stage? (misc; opened Jan 5, 2025 by leizhao1234)
#11745 [Performance]: Context length problem with vLLM (performance; opened Jan 5, 2025 by MotorBottle)
#11734 [Installation]: XPU dependencies not built against most recent oneAPI (installation; opened Jan 4, 2025 by janimo)
#11731 [Usage]: Serving 'LLaVA-Next-Video-7B-Qwen2' (usage; opened Jan 4, 2025 by Noctis-SC)
#11729 [Feature]: Does vLLM plan to support hosting multiple LLM base models inside one server? (feature request; opened Jan 4, 2025 by ynwang007)
#11728 [Feature]: Publish an Arm image for GH200 (feature request; opened Jan 3, 2025 by samos123)
#11726 [Bug]: PixtralHF inference broken since #11396 (bug; opened Jan 3, 2025 by mgoin)
#11725 [New Model]: unsloth/Llama-3.3-70B-Instruct-bnb-4bit (new model; opened Jan 3, 2025 by Hyfred)
#11720 [Feature]: membind all NUMA nodes for all CPUs in list (feature request; opened Jan 3, 2025 by hpcpony)
#11715 [Bug]: ValueError: Model architectures ['LlamaForCausalLM'] failed to be inspected (bug; opened Jan 3, 2025 by npanpaliya)
#11705 [Bug]: Extremely slow inference when deploying DeepSeek-V3 with vLLM on 16 H100 GPUs per its instructions (bug; opened Jan 3, 2025 by yonghenglh6)
#11703 [Bug]: 0.6.6.post1 crash in marlin_utils.py (bug; opened Jan 3, 2025 by Flynn-Zh)
#11702 [Bug]: vLLM LoRA crash when using dynamic loading (bug; opened Jan 3, 2025 by haitwang-cloud)
#11700 [Feature]: The tool_choice option "required" is not yet supported but is on the roadmap (feature request; opened Jan 3, 2025 by yumc2573)
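For context, `tool_choice="required"` in the OpenAI API forces the model to call at least one tool rather than reply in plain text. A minimal sketch of the call vLLM is being asked to support; the tool schema and model name are illustrative assumptions:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # illustrative model
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="required",  # the option the issue requests
)
print(resp.choices[0].message.tool_calls)
```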