
Can the vLLM proc part be placed on two GPUs? #138

Open
rrustlee opened this issue Feb 25, 2025 · 0 comments

Comments

@rrustlee

In the code below, torch.distributed.get_world_size is patched to return 1. However, a single 80 GB GPU cannot hold a relatively long prompt_length, so I would like to run this part on two GPUs. If I directly change the "torch.distributed.get_world_size", return_value=1 patch to return 2 and set the corresponding environment variables, it causes communication problems and simply hangs. Is there a way to solve this?

world_size_patch = patch(
    "torch.distributed.get_world_size", return_value=1
)
profiling_patch = patch(
    "vllm.worker.worker.Worker._assert_memory_footprint_increased_during_profiling",
    return_value=None,
)
with world_size_patch, profiling_patch:
    print("vllm is running on: ", vllm_device)
    self.llm = LLM(
        model=model.name_or_path,
        device=vllm_device,
        gpu_memory_utilization=self.args.vllm_gpu_memory_utilization,
        dtype=torch.bfloat16,
        # Automatic Prefix Caching caches the KV cache of existing queries, so that a new query can
        # directly reuse the KV cache if it shares the same prefix with one of the existing queries.
        # This is particularly useful here because we generate completions from the same prompts.
        enable_prefix_caching=True,
        enforce_eager=True,
        # Ensure that training and inference use the same processor for images.
        mm_processor_kwargs=(
            {
                "max_pixels": max_pixels,
                "min_pixels": min_pixels,
            }
            if "Qwen2-VL" in model_id or "Qwen2.5-VL" in model_id
            else None
        ),
        max_model_len=args.max_completion_length,
    )
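
For reference, a minimal sketch of the change described above that leads to the hang: the same patch with return_value=2 plus the usual torch.distributed environment variables. The specific address, port, and rank values below are assumptions for illustration only, not values from the original setup.

import os
from unittest.mock import patch

# The change described above: report a world size of 2 instead of 1.
world_size_patch = patch("torch.distributed.get_world_size", return_value=2)

# Typical torch.distributed environment variables; the values below are
# assumptions for illustration, not taken from the original setup.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
os.environ.setdefault("WORLD_SIZE", "2")
os.environ.setdefault("RANK", "0")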
@rrustlee reopened this Feb 26, 2025