
[BLOCKER] Fix in v1.19.1 for dataclass error due to triton package update #727

Merged (1 commit) · Jan 24, 2025

Conversation

@MohitIntel commented on Jan 23, 2025

The triton package in requirements-hpu.txt was updated to version 3.2.0 today (https://pypi.org/project/triton/#history).
This breaks our current 1.19-based Docker images on Gaudi 2 (and likely Gaudi 3 as well) and prevents vLLM from running any example, offline or online.
Pinning the package to version 3.1.0 resolves the issue.
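A minimal sketch of the proposed pin in requirements-hpu.txt; only the triton line is taken from this report, and its exact position in the file is an assumption:

```
# requirements-hpu.txt (excerpt)
triton==3.1.0  # pinned: triton 3.2.0 breaks the habana_frameworks torch._inductor import path
```

An exact `==` pin (rather than `<3.2.0`) matches the known-good version; it trades automatic patch updates for reproducibility.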

A user performing a fresh install of vLLM would otherwise see the following error when launching the offline_inference.py example on 1.19 Docker images (observed on the vllm-fork v1.19.0/1/2 branches):

Traceback (most recent call last):
  File "/root/vllm-fork/examples/offline_inference.py", line 3, in <module>
    from vllm import LLM, SamplingParams
  File "/root/vllm-fork/vllm/__init__.py", line 7, in <module>
    from vllm.engine.arg_utils import AsyncEngineArgs, EngineArgs
  File "/root/vllm-fork/vllm/engine/arg_utils.py", line 11, in <module>
    from vllm.config import (CacheConfig, ConfigFormat, DecodingConfig,
  File "/root/vllm-fork/vllm/config.py", line 16, in <module>
    from vllm.model_executor.layers.quantization import QUANTIZATION_METHODS
  File "/root/vllm-fork/vllm/model_executor/layers/quantization/__init__.py", line 5, in <module>
    from vllm.model_executor.layers.quantization.awq_marlin import AWQMarlinConfig
  File "/root/vllm-fork/vllm/model_executor/layers/quantization/awq_marlin.py", line 6, in <module>
    import vllm.model_executor.layers.fused_moe  # noqa
  File "/root/vllm-fork/vllm/model_executor/layers/fused_moe/__init__.py", line 34, in <module>
    import vllm.model_executor.layers.fused_moe.fused_marlin_moe  # noqa
  File "/root/vllm-fork/vllm/model_executor/layers/fused_moe/fused_marlin_moe.py", line 8, in <module>
    from vllm.model_executor.layers.fused_moe.fused_moe import (
  File "/root/vllm-fork/vllm/model_executor/layers/fused_moe/fused_moe.py", line 18, in <module>
    from vllm_hpu_extension.ops import scaled_fp8_quant
  File "/usr/local/lib/python3.10/dist-packages/vllm_hpu_extension/ops.py", line 9, in <module>
    import habana_frameworks.torch as htorch
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/__init__.py", line 54, in <module>
    import habana_frameworks.torch.core
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/core/__init__.py", line 114, in <module>
    import_compilers()
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/dynamo/compile_backend/backends.py", line 39, in import_compilers
    from .compilers import hpu_inference_compiler, hpu_training_compiler_bw, hpu_training_compiler_fw
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/dynamo/compile_backend/compilers.py", line 27, in <module>
    from .freezing_passes import freeze
  File "/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/dynamo/compile_backend/freezing_passes.py", line 28, in <module>
    from torch._inductor.freezing import discard_traced_gm_params, invalidate_eager_modules, replace_params_with_constants
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/freezing.py", line 15, in <module>
    from torch._inductor.fx_passes.freezing_patterns import freezing_passes
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/fx_passes/freezing_patterns.py", line 5, in <module>
    from torch._inductor.compile_fx import fake_tensor_prop
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compile_fx.py", line 49, in <module>
    from torch._inductor.debug import save_args_for_compile_fx_inner
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/debug.py", line 26, in <module>
    from . import config, ir  # noqa: F811, this is needed
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/ir.py", line 77, in <module>
    from .runtime.hints import ReductionHint
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/runtime/hints.py", line 36, in <module>
    attr_desc_fields = {f.name for f in fields(AttrsDescriptor)}
  File "/usr/lib/python3.10/dataclasses.py", line 1198, in fields
    raise TypeError('must be called with a dataclass type or instance') from None
TypeError: must be called with a dataclass type or instance
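The root cause is that `torch._inductor.runtime.hints` calls `dataclasses.fields()` on triton's `AttrsDescriptor`, which stopped being a dataclass in triton 3.2.0. The sketch below reproduces the failure mode with a hypothetical stand-in class (the real `AttrsDescriptor` definitions in triton are not shown here):

```python
from dataclasses import dataclass, fields

# Hypothetical stand-in for triton 3.2.0's AttrsDescriptor: a plain class,
# not a dataclass, so dataclasses.fields() rejects it.
class PlainDescriptor:
    divisible_by_16 = ()

try:
    fields(PlainDescriptor)
except TypeError as exc:
    err = exc
    print(exc)  # must be called with a dataclass type or instance

# Hypothetical stand-in for the 3.1.0-era shape: a real dataclass,
# so the same fields() call used by torch._inductor succeeds.
@dataclass
class DataclassDescriptor:
    divisible_by_16: tuple = ()

print({f.name for f in fields(DataclassDescriptor)})  # {'divisible_by_16'}
```

This is why pinning back to 3.1.0 is sufficient: it restores the dataclass shape that the installed torch build expects.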

@michalkuligowski michalkuligowski merged commit 1ea378e into v1.19.1 Jan 24, 2025
7 of 12 checks passed