
[Misc] add mm_processor_kwargs to extra_body for Qwen2.5-VL #13533

Merged · 6 commits · Feb 20, 2025

Conversation

@wulipc (Contributor) commented Feb 19, 2025

In Qwen2.5-VL online inference, the fps parameter in mm_processor_kwargs is essential for accurately calculating the second_per_grid_t value. However, the OpenAI-compatible interface currently does not support passing mm_processor_kwargs via extra_body. This PR fixes that (see the earlier issue #11652).

You can now interact with the vLLM server using the following example, which I have self-tested and verified to work correctly. If additional test cases are required, please let me know where they should be added. @DarkLight1337

FIX #11652

import base64
import numpy as np
from PIL import Image
from io import BytesIO
from openai import OpenAI
from qwen_vl_utils import process_vision_info


# Set OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8899/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)


video_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": [
        {"type": "text", "text": "请用表格总结一下视频中的商品特点"},
        {
            "type": "video",
            "video": "https://duguang-labelling.oss-cn-shanghai.aliyuncs.com/qiansun/video_ocr/videos/50221078283.mp4",
            "total_pixels": 20480 * 28 * 28, "min_pixels": 16 * 28 * 2, 
            'fps': 3.0  # The default value is 2.0, but for demonstration purposes, we set it to 3.0.
        }]
    },
]


def prepare_message_for_vllm(content_messages):
    """
    The frame extraction logic for videos in `vLLM` differs from that of `qwen_vl_utils`.
    Here, we use `qwen_vl_utils` to extract video frames, with the video's `media_type`
    explicitly set to `video/jpeg`. This way, vLLM will not attempt to re-extract frames
    from the input base64-encoded images.
    """
    vllm_messages, fps_list = [], []
    for message in content_messages:
        message_content_list = message["content"]
        if not isinstance(message_content_list, list):
            vllm_messages.append(message)
            continue

        new_content_list = []
        for part_message in message_content_list:
            if 'video' in part_message:
                video_message = [{'content': [part_message]}]
                image_inputs, video_inputs, video_kwargs = process_vision_info(video_message, return_video_kwargs=True)
                assert video_inputs is not None, "video_inputs should not be None"
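                # qwen_vl_utils returns each video as a (T, C, H, W) tensor;
                # reorder to (T, H, W, C) and cast to uint8 so PIL can read each frame.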
                video_input = (video_inputs.pop()).permute(0, 2, 3, 1).numpy().astype(np.uint8)
                print("video_kwargs", video_kwargs, video_input.shape)
                fps_list.extend(video_kwargs.get('fps', []))

                # encode image with base64
                base64_frames = []
                for frame in video_input:
                    img = Image.fromarray(frame)
                    output_buffer = BytesIO()
                    img.save(output_buffer, format="jpeg")
                    byte_data = output_buffer.getvalue()
                    base64_str = base64.b64encode(byte_data).decode("utf-8")
                    base64_frames.append(base64_str)

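                # vLLM treats a comma-joined list of base64 JPEG frames under the
                # video/jpeg media type as pre-extracted frames (see docstring above).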
                part_message = {
                    "type": "video_url",
                    "video_url": {"url": f"data:video/jpeg;base64,{','.join(base64_frames)}"}
                }
            new_content_list.append(part_message)
        message["content"] = new_content_list
        vllm_messages.append(message)
    return vllm_messages, {'fps': fps_list}


video_messages, video_kwargs = prepare_message_for_vllm(video_messages)
chat_response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    messages=video_messages,
    extra_body={
        "mm_processor_kwargs": video_kwargs
    }
)
print("Chat response:", chat_response)


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, covering a small, essential subset of tests to catch errors quickly. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added the frontend label Feb 19, 2025
@DarkLight1337 (Member)

Thanks, can you update the docs with this example? We don't have to test this as the code is straightforward enough.

@wulipc (Contributor, Author) commented Feb 19, 2025

> Thanks, can you update the docs with this example? We don't have to test this as the code is straightforward enough.

OK, which docs in vLLM should be updated? This example has already been added to our Qwen repo.

@DarkLight1337 (Member)

You can add it under the Online Serving section of this page.

@DarkLight1337 (Member)

Also, it would be best to verify that your PR works with #13516 once it's merged.

@wulipc (Contributor, Author) commented Feb 19, 2025

> You can add it under the Online Serving section of this page.

Currently, only the Qwen2.5-VL model requires passing the mm_processor_kwargs parameter, and the example above is a bit cumbersome. Given how specific this use case is, and to keep the vLLM documentation simple, I would prefer not to include it there. Users with related needs can refer to this issue or the official Qwen documentation for details.

@ywang96 (Member) commented Feb 19, 2025

> > You can add it under the Online Serving section of this page.
>
> Currently, only the Qwen2.5-VL model requires passing the mm_processor_kwargs parameter, and the example above is a bit cumbersome. Given how specific this use case is, and to keep the vLLM documentation simple, I would prefer not to include it there. Users with related needs can refer to this issue or the official Qwen documentation for details.

I think it's okay to include this example in https://github.com/vllm-project/vllm/blob/main/examples/online_serving/openai_chat_completion_client_for_multimodal.py (maybe as another chat type, video_with_kwargs), but as you mentioned, it's probably a better idea to document this in the README of https://github.com/QwenLM/Qwen2.5-VL, since it is only relevant to Qwen2.5-VL.

@wulipc (Contributor, Author) commented Feb 19, 2025

> update this in the README of https://github.com/QwenLM/Qwen2.5-VL since this is only relevant to Qwen2.5-VL

Let's keep it simple and only update this in the Qwen2.5-VL README: https://github.com/QwenLM/Qwen2.5-VL.

@DarkLight1337 (Member) commented Feb 19, 2025

I just realized that fps can be a list of floats. Can you update the model file with the correct type annotation? Otherwise LGTM.
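For illustration, the widened hint could look roughly like this (a sketch only; the attribute name and default are assumptions, not the actual vLLM model file):

from typing import Union

# Hypothetical sketch: fps may be a single rate for all videos or one rate per video.
fps: Union[float, list[float]] = 2.0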

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) February 20, 2025 02:14
@github-actions github-actions bot added the ready label Feb 20, 2025
@wulipc (Contributor, Author) commented Feb 20, 2025

> Also, it would be best to verify that your PR works with #13516 once it's merged.

@DarkLight1337 I found that after the fix was merged into the main branch, passing the fps parameter through mm_processor_kwargs (mm_processor_kwargs: {'fps': []}) results in the following error. I traced it to the fps parameter being of list type.

[screenshot of the error traceback]
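The likely root cause (a general Python fact, not spelled out in the thread): lists are unhashable, so a list-valued fps cannot be part of a hash-based cache key, which is what the tuple/HashableList fixes below address.

# Minimal reproduction of the underlying Python behavior:
hash((3.0,))  # fine: a tuple of hashable elements is hashable
hash([3.0])   # raises TypeError: unhashable type: 'list'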

@DarkLight1337 (Member) commented Feb 20, 2025

Can you try converting it into a tuple inside the processor?
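For illustration only, such a conversion might look roughly like the sketch below (the helper name freeze_mm_kwargs is hypothetical; per the discussion further down, the actual change was made in the merge_mm_kwargs function):

# Hypothetical helper: freeze list-valued mm_processor_kwargs into tuples
# so they can participate in hash-based cache keys.
def freeze_mm_kwargs(mm_kwargs: dict) -> dict:
    return {
        key: tuple(value) if isinstance(value, list) else value
        for key, value in mm_kwargs.items()
    }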

@DarkLight1337 (Member)

Another way would be to construct a HashableList class that uses the tuple of its elements as its hash (similar to HashableDict).
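A minimal sketch of that idea, assuming it mirrors the HashableDict pattern (the actual vLLM implementation may differ):

# Sketch of the suggested HashableList: hash as the tuple of its elements,
# so list-typed HF kwargs stay lists while becoming usable as cache keys.
class HashableList(list):
    def __hash__(self) -> int:  # plain lists are unhashable
        return hash(tuple(self))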

@wulipc (Contributor, Author) commented Feb 20, 2025

> HashableList

@DarkLight1337 I changed the list to a tuple in the merge_mm_kwargs function. Using a tuple instead of HashableList should also be fine, right? Everything seems to be working so far, and after merging, the PR works with #13516.

@DarkLight1337 (Member)

To follow HF's type hints, let's use HashableList

@wulipc (Contributor, Author) commented Feb 20, 2025

> To follow HF's type hints, let's use HashableList

Done.

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) February 20, 2025 03:40
@ywang96 (Member) left a comment

LGTM - thanks for the continued contributions to vLLM!

Labels: frontend, ready