Bugs when using vLLM with Qwen2.5-VL #68
The script is here:
# cd src/open-r1-multimodal
export DEBUG_MODE="true"
export LOG_PATH="./debug_log_2b.txt"
export WANDB_PROJECT=vision-reasoning
export WANDB_RUN_NAME=Qwen-VL-2B-GRPO-CLEVR-70k-$(date +%Y-%m-%d-%H-%M-%S)
torchrun --nproc_per_node="7" \
--nnodes="1" \
--node_rank="0" \
--master_addr="127.0.0.1" \
--master_port="12345" \
./grpo.py \
--output_dir checkpoints/${WANDB_RUN_NAME} \
--model_name_or_path ./models/Qwen2.5-VL-3B-Instruct \
--dataset_name ./datasets/clevr_cogen_a_train \
--max_prompt_length 1024 \
--per_device_train_batch_size 1 \
--gradient_accumulation_steps 2 \
--logging_steps 1 \
--bf16 \
--use_vllm \
--num_generations 7 \
--report_to wandb \
--gradient_checkpointing false \
--attn_implementation flash_attention_2 \
--max_pixels 401408 \
--num_train_epochs 2 \
--run_name $WANDB_RUN_NAME \
--save_steps 100 \
--save_only_model true
It seems that vllm-grpo-trainer doesn't support Qwen2.5-VL. The trainer should be changed to use the correct Qwen2VLForConditionalGeneration class. For example:
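A minimal sketch of what such a change might look like, assuming the trainer chooses the model class from the checkpoint name. The helper names `pick_model_class_name` and `load_model_class` are hypothetical and not taken from the R1-V code. The one fact relied on is that recent transformers releases expose Qwen2.5-VL as `Qwen2_5_VLForConditionalGeneration`, separate from `Qwen2VLForConditionalGeneration`.

```python
def pick_model_class_name(model_name_or_path: str) -> str:
    """Return the transformers class name matching a Qwen VL checkpoint name.

    Hypothetical helper: matching on the checkpoint name is an assumption,
    not how the R1-V trainer is known to work.
    """
    # Normalize separators so "Qwen2.5-VL", "qwen2_5_vl" and "Qwen2.5_VL" all match.
    name = model_name_or_path.lower().replace("_", "-").replace("2-5", "2.5")
    # Check the 2.5 variant first; it needs its own model class.
    if "qwen2.5-vl" in name:
        return "Qwen2_5_VLForConditionalGeneration"
    if "qwen2-vl" in name:
        return "Qwen2VLForConditionalGeneration"
    raise ValueError(f"Unrecognized Qwen VL checkpoint: {model_name_or_path!r}")


def load_model_class(model_name_or_path: str):
    """Look up the chosen class on the installed transformers package."""
    # Imported lazily so the name lookup above can be used without transformers.
    import transformers

    return getattr(transformers, pick_model_class_name(model_name_or_path))
```

With the path from the script above, `pick_model_class_name("./models/Qwen2.5-VL-3B-Instruct")` selects the Qwen2.5-VL class rather than the Qwen2-VL one.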
I found that the
# Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
absl-py 2.1.0 pypi_0 pypi
accelerate 1.3.0 pypi_0 pypi
aenum 3.1.15 pypi_0 pypi
aiohappyeyeballs 2.4.6 pypi_0 pypi
aiohttp 3.11.12 pypi_0 pypi
aiohttp-cors 0.7.0 pypi_0 pypi
aiosignal 1.3.2 pypi_0 pypi
airportsdata 20241001 pypi_0 pypi
annotated-types 0.7.0 pypi_0 pypi
antlr4-python3-runtime 4.13.2 pypi_0 pypi
anyio 4.8.0 pypi_0 pypi
astor 0.8.1 pypi_0 pypi
attrs 25.1.0 pypi_0 pypi
av 14.1.0 pypi_0 pypi
bitsandbytes 0.45.2 pypi_0 pypi
black 25.1.0 pypi_0 pypi
blake3 1.0.4 pypi_0 pypi
blis 0.7.11 pypi_0 pypi
bzip2 1.0.8 h5eee18b_6
ca-certificates 2024.12.31 h06a4308_0
cachetools 5.5.1 pypi_0 pypi
catalogue 2.0.10 pypi_0 pypi
certifi 2025.1.31 pypi_0 pypi
chardet 5.2.0 pypi_0 pypi
charset-normalizer 3.4.1 pypi_0 pypi
click 8.1.8 pypi_0 pypi
cloudpathlib 0.16.0 pypi_0 pypi
cloudpickle 3.1.1 pypi_0 pypi
colorama 0.4.6 pypi_0 pypi
colorful 0.5.6 pypi_0 pypi
colorlog 6.9.0 pypi_0 pypi
compressed-tensors 0.9.1 pypi_0 pypi
confection 0.1.5 pypi_0 pypi
cymem 2.0.11 pypi_0 pypi
dataproperty 1.1.0 pypi_0 pypi
datasets 3.2.0 pypi_0 pypi
deepspeed 0.15.4 pypi_0 pypi
depyf 0.18.0 pypi_0 pypi
dill 0.3.8 pypi_0 pypi
diskcache 5.6.3 pypi_0 pypi
distlib 0.3.9 pypi_0 pypi
distro 1.9.0 pypi_0 pypi
docker-pycreds 0.4.0 pypi_0 pypi
einops 0.8.1 pypi_0 pypi
fastapi 0.115.8 pypi_0 pypi
filelock 3.17.0 pypi_0 pypi
flake8 7.1.1 pypi_0 pypi
flash-attn 2.5.8 pypi_0 pypi
frozenlist 1.5.0 pypi_0 pypi
fsspec 2024.9.0 pypi_0 pypi
gguf 0.10.0 pypi_0 pypi
gitdb 4.0.12 pypi_0 pypi
gitpython 3.1.44 pypi_0 pypi
google-api-core 2.24.1 pypi_0 pypi
google-auth 2.38.0 pypi_0 pypi
googleapis-common-protos 1.67.0rc1 pypi_0 pypi
grpcio 1.70.0 pypi_0 pypi
h11 0.14.0 pypi_0 pypi
hf-transfer 0.1.9 pypi_0 pypi
hjson 3.1.0 pypi_0 pypi
httpcore 1.0.7 pypi_0 pypi
httptools 0.6.4 pypi_0 pypi
httpx 0.28.1 pypi_0 pypi
huggingface-hub 0.28.1 pypi_0 pypi
idna 3.10 pypi_0 pypi
importlib-metadata 8.6.1 pypi_0 pypi
iniconfig 2.0.0 pypi_0 pypi
inquirerpy 0.3.4 pypi_0 pypi
interegular 0.3.3 pypi_0 pypi
isort 6.0.0 pypi_0 pypi
jinja2 3.1.5 pypi_0 pypi
jiter 0.8.2 pypi_0 pypi
joblib 1.4.2 pypi_0 pypi
jsonschema 4.23.0 pypi_0 pypi
jsonschema-specifications 2024.10.1 pypi_0 pypi
langcodes 3.5.0 pypi_0 pypi
language-data 1.3.0 pypi_0 pypi
lark 1.2.2 pypi_0 pypi
latex2sympy2-extended 1.0.6 pypi_0 pypi
ld_impl_linux-64 2.40 h12ee557_0
libffi 3.4.4 h6a678d5_1
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libstdcxx-ng 11.2.0 h1234567_1
libuuid 1.41.5 h5eee18b_0
liger-kernel 0.5.2 pypi_0 pypi
lighteval 0.7.0 pypi_0 pypi
lm-format-enforcer 0.10.9 pypi_0 pypi
lxml 5.3.1 pypi_0 pypi
marisa-trie 1.2.1 pypi_0 pypi
markdown-it-py 3.0.0 pypi_0 pypi
markupsafe 3.0.2 pypi_0 pypi
math-verify 0.5.2 pypi_0 pypi
mbstrdecoder 1.1.4 pypi_0 pypi
mccabe 0.7.0 pypi_0 pypi
mdurl 0.1.2 pypi_0 pypi
mistral-common 1.5.3 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
msgpack 1.1.0 pypi_0 pypi
msgspec 0.19.0 pypi_0 pypi
multidict 6.1.0 pypi_0 pypi
multiprocess 0.70.16 pypi_0 pypi
murmurhash 1.0.12 pypi_0 pypi
mypy-extensions 1.0.0 pypi_0 pypi
ncurses 6.4 h6a678d5_0
nest-asyncio 1.6.0 pypi_0 pypi
networkx 3.4.2 pypi_0 pypi
ninja 1.11.1.3 pypi_0 pypi
nltk 3.9.1 pypi_0 pypi
numpy 1.26.4 pypi_0 pypi
nvidia-cublas-cu12 12.4.5.8 pypi_0 pypi
nvidia-cuda-cupti-cu12 12.4.127 pypi_0 pypi
nvidia-cuda-nvrtc-cu12 12.4.127 pypi_0 pypi
nvidia-cuda-runtime-cu12 12.4.127 pypi_0 pypi
nvidia-cudnn-cu12 9.1.0.70 pypi_0 pypi
nvidia-cufft-cu12 11.2.1.3 pypi_0 pypi
nvidia-curand-cu12 10.3.5.147 pypi_0 pypi
nvidia-cusolver-cu12 11.6.1.9 pypi_0 pypi
nvidia-cusparse-cu12 12.3.1.170 pypi_0 pypi
nvidia-cusparselt-cu12 0.6.2 pypi_0 pypi
nvidia-ml-py 12.570.86 pypi_0 pypi
nvidia-nccl-cu12 2.21.5 pypi_0 pypi
nvidia-nvjitlink-cu12 12.4.127 pypi_0 pypi
nvidia-nvtx-cu12 12.4.127 pypi_0 pypi
openai 1.61.1 pypi_0 pypi
opencensus 0.11.4 pypi_0 pypi
opencensus-context 0.1.3 pypi_0 pypi
opencv-python-headless 4.11.0.86 pypi_0 pypi
openssl 3.0.15 h5eee18b_0
outlines 0.1.11 pypi_0 pypi
outlines-core 0.1.26 pypi_0 pypi
packaging 24.2 pypi_0 pypi
pandas 2.2.3 pypi_0 pypi
parameterized 0.9.0 pypi_0 pypi
partial-json-parser 0.2.1.1.post5 pypi_0 pypi
pathspec 0.12.1 pypi_0 pypi
pathvalidate 3.2.3 pypi_0 pypi
pfzy 0.3.4 pypi_0 pypi
pillow 11.1.0 pypi_0 pypi
pip 25.0 py311h06a4308_0
platformdirs 4.3.6 pypi_0 pypi
pluggy 1.5.0 pypi_0 pypi
portalocker 3.1.1 pypi_0 pypi
preshed 3.0.9 pypi_0 pypi
prometheus-client 0.21.1 pypi_0 pypi
prometheus-fastapi-instrumentator 7.0.2 pypi_0 pypi
prompt-toolkit 3.0.50 pypi_0 pypi
propcache 0.2.1 pypi_0 pypi
proto-plus 1.26.0 pypi_0 pypi
protobuf 3.20.3 pypi_0 pypi
psutil 6.1.1 pypi_0 pypi
py-cpuinfo 9.0.0 pypi_0 pypi
py-spy 0.4.0 pypi_0 pypi
pyarrow 19.0.0 pypi_0 pypi
pyasn1 0.6.1 pypi_0 pypi
pyasn1-modules 0.4.1 pypi_0 pypi
pybind11 2.13.6 pypi_0 pypi
pycodestyle 2.12.1 pypi_0 pypi
pycountry 24.6.1 pypi_0 pypi
pydantic 2.10.6 pypi_0 pypi
pydantic-core 2.27.2 pypi_0 pypi
pyflakes 3.2.0 pypi_0 pypi
pygments 2.19.1 pypi_0 pypi
pytablewriter 1.2.1 pypi_0 pypi
pytest 8.3.4 pypi_0 pypi
python 3.11.11 he870216_0
python-dateutil 2.9.0.post0 pypi_0 pypi
python-dotenv 1.0.1 pypi_0 pypi
pytz 2025.1 pypi_0 pypi
pyyaml 6.0.2 pypi_0 pypi
pyzmq 26.2.1 pypi_0 pypi
qwen-vl-utils 0.0.10 pypi_0 pypi
ray 2.42.1 pypi_0 pypi
readline 8.2 h5eee18b_0
referencing 0.36.2 pypi_0 pypi
regex 2024.11.6 pypi_0 pypi
requests 2.32.3 pypi_0 pypi
rich 13.9.4 pypi_0 pypi
rouge-score 0.1.2 pypi_0 pypi
rpds-py 0.22.3 pypi_0 pypi
rsa 4.9 pypi_0 pypi
sacrebleu 2.5.1 pypi_0 pypi
safetensors 0.5.2 pypi_0 pypi
scikit-learn 1.6.1 pypi_0 pypi
scipy 1.15.1 pypi_0 pypi
sentencepiece 0.2.0 pypi_0 pypi
sentry-sdk 2.20.0 pypi_0 pypi
setproctitle 1.3.4 pypi_0 pypi
setuptools 75.8.0 py311h06a4308_0
six 1.17.0 pypi_0 pypi
smart-open 6.4.0 pypi_0 pypi
smmap 5.0.2 pypi_0 pypi
sniffio 1.3.1 pypi_0 pypi
spacy 3.7.2 pypi_0 pypi
spacy-legacy 3.0.12 pypi_0 pypi
spacy-loggers 1.0.5 pypi_0 pypi
sqlite 3.45.3 h5eee18b_0
srsly 2.5.1 pypi_0 pypi
starlette 0.45.3 pypi_0 pypi
sympy 1.13.1 pypi_0 pypi
tabledata 1.3.4 pypi_0 pypi
tabulate 0.9.0 pypi_0 pypi
tcolorpy 0.1.7 pypi_0 pypi
tensorboardx 2.6.2.2 pypi_0 pypi
termcolor 2.3.0 pypi_0 pypi
thinc 8.2.5 pypi_0 pypi
threadpoolctl 3.5.0 pypi_0 pypi
tiktoken 0.8.0 pypi_0 pypi
tk 8.6.14 h39e8969_0
tokenizers 0.21.0 pypi_0 pypi
torch 2.5.1 pypi_0 pypi
torchaudio 2.5.1 pypi_0 pypi
torchvision 0.20.1 pypi_0 pypi
tqdm 4.67.1 pypi_0 pypi
transformers 4.49.0.dev0 pypi_0 pypi
triton 3.1.0 pypi_0 pypi
trl 0.15.0.dev0 pypi_0 pypi
typepy 1.3.4 pypi_0 pypi
typer 0.9.4 pypi_0 pypi
typing-extensions 4.12.2 pypi_0 pypi
tzdata 2025.1 pypi_0 pypi
urllib3 2.3.0 pypi_0 pypi
uvicorn 0.34.0 pypi_0 pypi
uvloop 0.21.0 pypi_0 pypi
virtualenv 20.29.2 pypi_0 pypi
vllm 0.7.2 pypi_0 pypi
wandb 0.18.3 pypi_0 pypi
wasabi 1.1.3 pypi_0 pypi
watchfiles 1.0.4 pypi_0 pypi
wcwidth 0.2.13 pypi_0 pypi
weasel 0.3.4 pypi_0 pypi
websockets 14.2 pypi_0 pypi
wheel 0.45.1 py311h06a4308_0
xformers 0.0.28.post3 pypi_0 pypi
xgrammar 0.1.11 pypi_0 pypi
xxhash 3.5.0 pypi_0 pypi
xz 5.4.6 h5eee18b_1
yarl 1.18.3 pypi_0 pypi
zipp 3.21.0 pypi_0 pypi
zlib 1.2.13 h5eee18b_1
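For comparing environments, it may be easier to print only the packages most relevant to this issue than to read the full list. The sketch below uses the standard-library `importlib.metadata`. The helper name `installed_versions` and the choice of packages are assumptions made for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Package selection is an assumption: the ones this thread revolves around.
    relevant = ["vllm", "transformers", "trl", "torch", "flash-attn", "qwen-vl-utils"]
    for name, ver in installed_versions(relevant).items():
        print(f"{name:16s} {ver or 'not installed'}")
```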
Hi, the current vLLM trainer does NOT support Qwen2.5-VL for now. We will integrate it soon (hopefully tonight 😄). A PR is also welcome if you can help!!
Nice work, and thank you for your reply!!! I'll try to add support for it in vllm-grpo-trainer if I can make it work 😄.
Hi, is there any progress on this issue?
No 😿, I'm trying to get it running with Qwen2-VL. I found some hidden problems when using Qwen2-VL; see issue #73.
@TobiasLee Hi Lei, thanks for the awesome work! Just curious, are there any updates on support for Qwen2.5-VL in the vLLM trainer (https://github.com/Deep-Agent/R1-V/blob/main/src/scripts/run_grpo_vllm.sh)?
@weizhepei Hi, I created a PR to support Qwen2.5-VL in the vLLM trainer; you can see the code in #104.
Can you use your code to train Qwen2.5-VL?
Yes, I am currently using Qwen2.5-VL in my experiment.
Have you encountered an OOM issue with Qwen2.5-VL-3B? Even after setting the
I modified it the same way you did, but I still can't use Qwen2.5.
Qwen2.5-VL is now supported through #136.