Pin triton to v3.1.0 for HPU #795

iboiko-habana · 2025-02-07T10:29:49Z

This is a cherry-pick of #728.

Resolves the import failure caused by the release of triton v3.2.0 (January 23rd, 2025). This is a workaround; a proper fix to support triton v3.2.0 may still be required.
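As a rough illustration of what the pin guards against, the sketch below checks whether the installed triton matches the pinned release. It assumes the distribution is published under the name `triton` and that only the numeric release part of the version matters; `PINNED_TRITON`, `parse_release`, `matches_pin`, and `installed_triton_matches_pin` are hypothetical helper names, not part of vLLM or triton.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

# Assumption: the release pinned by this PR.
PINNED_TRITON = (3, 1, 0)


def parse_release(version_string: str) -> Tuple[int, ...]:
    """Extract the leading numeric release, e.g. '3.2.0+git1' -> (3, 2, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def matches_pin(version_string: str) -> bool:
    """Return True if the version's major.minor.patch equals the pin."""
    release = parse_release(version_string)
    # Pad short versions such as '3.1' with zeros before comparing.
    release = release + (0,) * (3 - len(release))
    return release[:3] == PINNED_TRITON


def installed_triton_matches_pin() -> Optional[bool]:
    """Check the installed triton; return None if triton is not installed."""
    try:
        return matches_pin(version("triton"))
    except PackageNotFoundError:
        return None
```

A startup script could call `installed_triton_matches_pin()` and log a warning when it returns `False`, rather than failing later during `import vllm`.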
The error raised when triton v3.2.0 is used is shown below.

Traceback (most recent call last):
File "/workspace/vllm/test_evaluation.py", line 15, in <module>
from vllm import LLM, SamplingParams
File "/workspace/vllm/vllm/__init__.py", line 7, in <module>
from vllm.engine.arg_utils import AsyncEngineArgs, EngineArgs
File "/workspace/vllm/vllm/engine/arg_utils.py", line 11, in <module>
from vllm.config import (CacheConfig, ConfigFormat, DecodingConfig,
File "/workspace/vllm/vllm/config.py", line 16, in <module>
from vllm.model_executor.layers.quantization import QUANTIZATION_METHODS
File "/workspace/vllm/vllm/model_executor/layers/quantization/__init__.py", line 6, in <module>
from vllm.model_executor.layers.quantization.awq_marlin import AWQMarlinConfig
File "/workspace/vllm/vllm/model_executor/layers/quantization/awq_marlin.py", line 6, in <module>
import vllm.model_executor.layers.fused_moe # noqa
File "/workspace/vllm/vllm/model_executor/layers/fused_moe/__init__.py", line 34, in <module>
import vllm.model_executor.layers.fused_moe.fused_marlin_moe # noqa
File "/workspace/vllm/vllm/model_executor/layers/fused_moe/fused_marlin_moe.py", line 8, in <module>
from vllm.model_executor.layers.fused_moe.fused_moe import (
File "/workspace/vllm/vllm/model_executor/layers/fused_moe/fused_moe.py", line 18, in <module>
from vllm_hpu_extension.ops import scaled_fp8_quant
File "/usr/local/lib/python3.10/dist-packages/vllm_hpu_extension/ops.py", line 9, in <module>
import habana_frameworks.torch as htorch
File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/__init__.py", line 54, in <module>
import habana_frameworks.torch.core
File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/core/__init__.py", line 114, in <module>
import_compilers()
File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/dynamo/compile_backend/backends.py", line 39, in import_compilers
from .compilers import hpu_inference_compiler, hpu_training_compiler_bw, hpu_training_compiler_fw
File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/dynamo/compile_backend/compilers.py", line 27, in <module>
from .freezing_passes import freeze
File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/dynamo/compile_backend/freezing_passes.py", line 28, in <module>
from torch._inductor.freezing import discard_traced_gm_params, invalidate_eager_modules, replace_params_with_constants
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/freezing.py", line 15, in <module>
from torch._inductor.fx_passes.freezing_patterns import freezing_passes
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/fx_passes/freezing_patterns.py", line 5, in <module>
from torch._inductor.compile_fx import fake_tensor_prop
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compile_fx.py", line 49, in <module>
from torch._inductor.debug import save_args_for_compile_fx_inner
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/debug.py", line 26, in <module>
from . import config, ir # noqa: F811, this is needed
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/ir.py", line 77, in <module>
from .runtime.hints import ReductionHint
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/runtime/hints.py", line 36, in <module>
attr_desc_fields = {f.name for f in fields(AttrsDescriptor)}
File "/usr/lib/python3.10/dataclasses.py", line 1198, in fields
raise TypeError('must be called with a dataclass type or instance') from None
TypeError: must be called with a dataclass type or instance
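
The last two frames of the traceback show `dataclasses.fields()` rejecting `AttrsDescriptor`. One plausible reading, inferred from the traceback and not verified here, is that the `AttrsDescriptor` resolved under triton v3.2.0 is no longer a dataclass. The sketch below reproduces the same `TypeError` with stand-in classes; `DataclassDescriptor`, `PlainDescriptor`, and `field_names` are hypothetical names and do not mirror triton's actual definitions.

```python
from dataclasses import dataclass, fields


@dataclass
class DataclassDescriptor:
    """Stand-in for a descriptor that is declared as a dataclass."""
    first: tuple = ()
    second: tuple = ()


class PlainDescriptor:
    """Stand-in for a descriptor that is an ordinary class."""
    def __init__(self):
        self.first = ()
        self.second = ()


def field_names(cls):
    """Same pattern as the failing line in torch/_inductor/runtime/hints.py."""
    return {f.name for f in fields(cls)}


# Works: fields() accepts a dataclass type.
print(field_names(DataclassDescriptor))

# Fails: fields() raises TypeError for a class that is not a dataclass.
try:
    field_names(PlainDescriptor)
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Because the failing expression runs at module import time, any code path that imports `torch._inductor` (here, via `habana_frameworks.torch`) hits the error before vLLM itself does any work.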

Pin triton to v3.1.0 for HPU

2c8f8c3

iboiko-habana requested review from kzawora-intel, madamczykhabana, michalkuligowski, mgawarkiewicz, vivekgoe and piotrbocian as code owners February 7, 2025 10:29

mgawarkiewicz approved these changes Feb 7, 2025
