Pin triton to v3.1.0 for HPU #795

Open: wants to merge 1 commit into base v1.19.1

@iboiko-habana iboiko-habana commented Feb 7, 2025

This is a cherry-pick of #728.

It pins triton to v3.1.0 for HPU to work around a breakage that appeared with the release of triton v3.2.0 (January 23rd, 2025). This is a workaround only; a proper fix to support triton v3.2.0 may still be required.
With triton v3.2.0 installed, importing vllm fails with the error shown below.

Traceback (most recent call last):
  File "/workspace/vllm/test_evaluation.py", line 15, in <module>
    from vllm import LLM, SamplingParams
  File "/workspace/vllm/vllm/__init__.py", line 7, in <module>
    from vllm.engine.arg_utils import AsyncEngineArgs, EngineArgs
  File "/workspace/vllm/vllm/engine/arg_utils.py", line 11, in <module>
    from vllm.config import (CacheConfig, ConfigFormat, DecodingConfig,
  File "/workspace/vllm/vllm/config.py", line 16, in <module>
    from vllm.model_executor.layers.quantization import QUANTIZATION_METHODS
  File "/workspace/vllm/vllm/model_executor/layers/quantization/__init__.py", line 6, in <module>
    from vllm.model_executor.layers.quantization.awq_marlin import AWQMarlinConfig
  File "/workspace/vllm/vllm/model_executor/layers/quantization/awq_marlin.py", line 6, in <module>
    import vllm.model_executor.layers.fused_moe # noqa
  File "/workspace/vllm/vllm/model_executor/layers/fused_moe/__init__.py", line 34, in <module>
    import vllm.model_executor.layers.fused_moe.fused_marlin_moe # noqa
  File "/workspace/vllm/vllm/model_executor/layers/fused_moe/fused_marlin_moe.py", line 8, in <module>
    from vllm.model_executor.layers.fused_moe.fused_moe import (
  File "/workspace/vllm/vllm/model_executor/layers/fused_moe/fused_moe.py", line 18, in <module>
    from vllm_hpu_extension.ops import scaled_fp8_quant
  File "/usr/local/lib/python3.10/dist-packages/vllm_hpu_extension/ops.py", line 9, in <module>
    import habana_frameworks.torch as htorch
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/__init__.py", line 54, in <module>
    import habana_frameworks.torch.core
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/core/__init__.py", line 114, in <module>
    import_compilers()
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/dynamo/compile_backend/backends.py", line 39, in import_compilers
    from .compilers import hpu_inference_compiler, hpu_training_compiler_bw, hpu_training_compiler_fw
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/dynamo/compile_backend/compilers.py", line 27, in <module>
    from .freezing_passes import freeze
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/dynamo/compile_backend/freezing_passes.py", line 28, in <module>
    from torch._inductor.freezing import discard_traced_gm_params, invalidate_eager_modules, replace_params_with_constants
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/freezing.py", line 15, in <module>
    from torch._inductor.fx_passes.freezing_patterns import freezing_passes
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/fx_passes/freezing_patterns.py", line 5, in <module>
    from torch._inductor.compile_fx import fake_tensor_prop
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compile_fx.py", line 49, in <module>
    from torch._inductor.debug import save_args_for_compile_fx_inner
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/debug.py", line 26, in <module>
    from . import config, ir # noqa: F811, this is needed
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/ir.py", line 77, in <module>
    from .runtime.hints import ReductionHint
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/runtime/hints.py", line 36, in <module>
    attr_desc_fields = {f.name for f in fields(AttrsDescriptor)}
  File "/usr/lib/python3.10/dataclasses.py", line 1198, in fields
    raise TypeError('must be called with a dataclass type or instance') from None
TypeError: must be called with a dataclass type or instance
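
The last two frames indicate that `dataclasses.fields()` received something that is not a dataclass. A plausible reading, which this PR does not confirm, is that `AttrsDescriptor` stopped being a dataclass in triton v3.2.0. The sketch below reproduces the same `TypeError` with standard-library code only. `DataclassDescriptor`, `PlainDescriptor`, `field_names`, and the field names are hypothetical stand-ins and are not part of triton or torch.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class DataclassDescriptor:
    """Hypothetical stand-in: a descriptor declared as a dataclass."""
    field_a: tuple = ()
    field_b: tuple = ()


class PlainDescriptor:
    """Hypothetical stand-in: the same data held by a plain class."""

    def __init__(self):
        self.field_a = ()
        self.field_b = ()


def field_names(cls):
    """Return dataclass field names, or an empty set for a non-dataclass."""
    if not is_dataclass(cls):
        return set()
    return {f.name for f in fields(cls)}


# Same expression shape as the failing line in the traceback; works for a dataclass.
names = {f.name for f in fields(DataclassDescriptor)}

# The same call on a plain class raises the TypeError seen in the last frame.
try:
    fields(PlainDescriptor)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(names)
print(error_message)
print(field_names(PlainDescriptor))
```

The `field_names` helper only illustrates how such a call could be guarded. It is not the fix applied here, which pins the triton version instead.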
