[Coverage] Value Error for aten_ops.scatter.src #3184

Closed
Tracked by #3179
chohk88 opened this issue Sep 26, 2024 · 0 comments · Fixed by #3251
chohk88 (Collaborator) commented Sep 26, 2024

2024-08-31 09:14:02.849 | ERROR    | MainProcess | /usr/local/lib/python3.10/dist-packages/torch_tensorrt/logging.py:24 - ITensor::getDimensions: Error Code 4: API Usage Error ([SCATTER]-[aten_ops.scatter.src]-[scatter_1_scatter_layer]: ScatterLayer in elements mode all inputs tensors rank must be same. Input 0 rank is 2, input 1 rank is 2, and input 2 rank is 1.)
2024-08-31 09:14:02.865 | ERROR    | MainProcess | /usr/local/lib/python3.10/dist-packages/torch_tensorrt/logging.py:24 - ITensor::getDimensions: Error Code 4: API Usage Error (Output shape can not be computed for node [SCATTER]-[aten_ops.scatter.src]-[scatter_1_scatter_layer].)
2024-08-31 09:14:02.880 | ERROR    | MainProcess | /usr/local/lib/python3.10/dist-packages/torch_tensorrt/logging.py:24 - ITensor::getDimensions: Error Code 4: API Usage Error (Output shape can not be computed for node [SCATTER]-[aten_ops.scatter.src]-[scatter_1_scatter_layer].)
2024-08-31 09:14:04.368 | INFO     | MainProcess | /usr/local/lib/python3.10/dist-packages/model_navigator/pipelines/pipeline.py:128 - backend='torch_tensorrt' raised:
ValueError: __len__() should return >= 0
While executing %eq : [num_users=1] = call_function[target=torch.ops.aten.eq.Scalar](args = (%scatter_1, -100), kwargs = {_itensor_to_tensor_meta: {<tensorrt.tensorrt.ITensor object at 0x7f44432187b0>: ((1, 1024), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f44488bb5b0>: ((1, 1024), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f44432119f0>: ((1, 1023), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f444329afb0>: ((1, 1023), torch.int64, False, (1023, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4443213fb0>: ((1, 1023), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4442f4e5f0>: ((1, 1024), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4442f80f30>: ((1, 1024), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4442f8a9f0>: ((1,), torch.int64, False, (1024,), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4442f0a870>: ((1,), torch.int64, False, (1024,), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4442f0b7f0>: ((1, 1024), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4442f35c30>: ((1, 1), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f44430e90f0>: ((1, 1024), torch.int64, False, (1024, 1), torch.contiguous_format, False, {})}})
Original traceback:
None
Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True
2024-08-31 09:14:04.371 | WARNING  | MainProcess | /usr/local/lib/python3.10/dist-packages/model_navigator/pipelines/pipeline.py:131 - Command finished with ModelNavigatorUserInputError. The error is considered as external error. Usually caused by incompatibilities between the model and the target formats and/or runtimes. Please review the command output.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/model_navigator/commands/execution_context.py", line 156, in _execute_function
    fire.Fire(func, unwrapped_args)
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/model_navigator/commands/correctness/correctness_script.py", line 93, in correctness
    comp_output = runner.infer(sample)
  File "/usr/local/lib/python3.10/dist-packages/model_navigator/runners/base.py", line 325, in infer
    output = self.infer_impl(feed_dict, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/model_navigator/runners/torch.py", line 94, in infer_impl
    outputs = self._infer(feed_dict=feed_dict)
  File "/usr/local/lib/python3.10/dist-packages/model_navigator/runners/torch.py", line 135, in _infer_v1
    outputs = self._loaded_model(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1714, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1725, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 434, in _fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1714, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1725, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 1121, in __call__
    return self._torchdynamo_orig_callable(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 948, in __call__
    result = self._inner_convert(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 472, in __call__
    return _compile(
  File "/usr/local/lib/python3.10/dist-packages/torch/_utils_internal.py", line 85, in wrapper_function
    return StrobelightCompileTimeProfiler.profile_compile_time(
  File "/usr/local/lib/python3.10/dist-packages/torch/_strobelight/compile_time_profiler.py", line 129, in profile_compile_time
    return func(*args, **kwargs)
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 817, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 233, in time_wrapper
    r = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 636, in compile_inner
    out_code = transform_code_object(code, transform)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/bytecode_transformation.py", line 1270, in transform_code_object
    transformations(instructions, code_options)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 178, in _fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 582, in transform
    tracer.run()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 2476, in run
    super().run()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 904, in run
    while self.step():
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 816, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 2667, in RETURN_VALUE
    self._return(inst)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 2652, in _return
    self.output.compile_subgraph(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 1127, in compile_subgraph
    self.compile_and_call_fx_graph(tx, pass2.graph_output_vars(), root)
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 1324, in compile_and_call_fx_graph
    compiled_fn = self.call_user_compiler(gm)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 233, in time_wrapper
    r = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 1415, in call_user_compiler
    raise BackendCompilerFailed(self.compiler_fn, e).with_traceback(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 1396, in call_user_compiler
    compiled_fn = compiler_fn(gm, self.example_inputs())
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/repro/after_dynamo.py", line 129, in __call__
    compiled_gm = compiler_fn(gm, example_inputs)
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 2223, in __call__
    return self.compiler_fn(model_, inputs_, **self.kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/backend/backends.py", line 44, in torch_tensorrt_backend
    return DEFAULT_BACKEND(gm, sample_inputs, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/backend/backends.py", line 52, in aot_torch_tensorrt_aten_backend
    return _pretraced_backend(gm, sample_inputs, settings)
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/backend/backends.py", line 108, in _pretraced_backend
    trt_compiled = compile_module(
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/_compiler.py", line 431, in compile_module
    trt_module = convert_module(
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/_conversion.py", line 107, in convert_module
    interpreter_result = interpret_module_to_result(module, inputs, settings)
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/_conversion.py", line 88, in interpret_module_to_result
    interpreter_result = interpreter.run()
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 336, in run
    self._construct_trt_network_def()
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 317, in _construct_trt_network_def
    super().run()
  File "/usr/local/lib/python3.10/dist-packages/torch/fx/interpreter.py", line 146, in run
    self.env[node] = self.run_node(node)
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 378, in run_node
    trt_node: torch.fx.Node = super().run_node(n)
  File "/usr/local/lib/python3.10/dist-packages/torch/fx/interpreter.py", line 203, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 493, in call_function
    return converter(self.ctx, target, args, kwargs, self._cur_node_name)
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/converter_utils.py", line 529, in convert_with_type_enforcement
    return func(ctx, target, new_args, new_kwargs, name)
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/aten_ops_converters.py", line 2319, in aten_ops_eq
    return impl.elementwise.eq(
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/impl/elementwise/ops.py", line 674, in eq
    return convert_binary_elementwise(
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/impl/elementwise/base.py", line 158, in convert_binary_elementwise
    lhs_val, rhs_val = broadcast_to_same_shape(
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/converter_utils.py", line 249, in broadcast_to_same_shape
    lhs_val, rhs_val = broadcast(ctx, lhs_val, rhs_val, f"{name}_lhs", f"{name}_rhs")
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/converter_utils.py", line 785, in broadcast
    a_shape = tuple(a.shape)
torch._dynamo.exc.BackendCompilerFailed: backend='torch_tensorrt' raised:
ValueError: __len__() should return >= 0
While executing %eq : [num_users=1] = call_function[target=torch.ops.aten.eq.Scalar](args = (%scatter_1, -100), kwargs = {_itensor_to_tensor_meta: {<tensorrt.tensorrt.ITensor object at 0x7f44432187b0>: ((1, 1024), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f44488bb5b0>: ((1, 1024), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f44432119f0>: ((1, 1023), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f444329afb0>: ((1, 1023), torch.int64, False, (1023, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4443213fb0>: ((1, 1023), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4442f4e5f0>: ((1, 1024), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4442f80f30>: ((1, 1024), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4442f8a9f0>: ((1,), torch.int64, False, (1024,), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4442f0a870>: ((1,), torch.int64, False, (1024,), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4442f0b7f0>: ((1, 1024), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f4442f35c30>: ((1, 1), torch.int64, False, (1024, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0x7f44430e90f0>: ((1, 1024), torch.int64, False, (1024, 1), torch.contiguous_format, False, {})}})
Original traceback:
None
Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/model_navigator/pipelines/pipeline.py", line 121, in _execute_unit
    command_output = execution_unit.command().run(
  File "/usr/local/lib/python3.10/dist-packages/model_navigator/commands/base.py", line 127, in run
    output = self._run(*args, **_filter_dict_for_func(kwargs, self._run))
  File "/usr/local/lib/python3.10/dist-packages/model_navigator/commands/correctness/correctness.py", line 150, in _run
    context.execute_python_script(
  File "/usr/local/lib/python3.10/dist-packages/model_navigator/commands/execution_context.py", line 142, in execute_python_script
    self._execute_function(func, unwrapped_args, allow_failure, cmd)
  File "/usr/local/lib/python3.10/dist-packages/model_navigator/commands/execution_context.py", line 168, in _execute_function
    raise ModelNavigatorUserInputError(cmd_to_reproduce_error) from e
model_navigator.exceptions.ModelNavigatorUserInputError: Command to reproduce error: /bin/bash torch/reproduce_correctness-torchtensorrtcompilerunner.sh
transformers.models.bart.modeling_bart.BartForConditionalGeneration: Validating model torch on TorchTensorRTCompile backend FAIL
2024-08-31 09:14:04.373 | INFO     | MainProcess | /usr/local/lib/python3.10/dist-packages/model_navigator/pipelines/pipeline.py:148 - Execution time: 12.18[s]
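
For triage, here is a rough sketch of the two failures the log appears to show, written in plain Python with no torch or TensorRT dependency. The first ScatterLayer message says all three inputs must have the same rank, and the later `ValueError: __len__() should return >= 0` is what Python raises when `len()` is taken of an object whose `__len__` returns a negative number, which would fit `tuple(a.shape)` being called on a tensor whose output shape could not be computed. Everything named below (`scatter_ranks_match`, `pad_rank`, `InvalidDims`, `shape_or_none`) is hypothetical and is not part of torch_tensorrt; the example shapes are only picked from the tensor metadata in the log, and which axis a real fix should expand is not something the log settles.

```python
from typing import Optional, Sequence, Tuple


def scatter_ranks_match(*shapes: Sequence[int]) -> bool:
    """Return True when every shape has the same rank (number of dims)."""
    return len({len(shape) for shape in shapes}) <= 1


def pad_rank(shape: Sequence[int], rank: int) -> Tuple[int, ...]:
    """Prepend size-1 dims until `shape` has `rank` dims.

    Assumption: leading dims are the ones to add. This is illustrative only.
    """
    if len(shape) > rank:
        raise ValueError(f"shape {tuple(shape)} already has more than {rank} dims")
    return (1,) * (rank - len(shape)) + tuple(shape)


class InvalidDims:
    """Hypothetical stand-in for a shape object that reports a length of -1."""

    def __len__(self) -> int:
        return -1

    def __getitem__(self, i: int) -> int:
        raise IndexError(i)


def shape_or_none(shape) -> Optional[Tuple[int, ...]]:
    """Convert a shape to a tuple, or return None if its length is invalid."""
    try:
        len(shape)  # raises ValueError when __len__ returns a negative number
        return tuple(shape)
    except ValueError:
        return None


if __name__ == "__main__":
    data, index, src = (1, 1024), (1, 1), (1,)  # ranks 2, 2, 1 as in the log
    print(scatter_ranks_match(data, index, src))
    print(scatter_ranks_match(data, index, pad_rank(src, len(data))))
    print(shape_or_none(InvalidDims()))
```

If this reading is right, the `eq` converter is only where the problem surfaces; the rank mismatch in the scatter inputs is the thing to fix upstream.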