iree-run-module crash due to addition of MFMA op (https://github.com/iree-org/iree/pull/17921) #17970

Closed
pashu123 opened this issue Jul 19, 2024 · 1 comment
Labels
bug 🐞 Something isn't working

Comments

@pashu123
Contributor

What happened?

iree-run-module --module=test_rocm.vmfb --device=hip --input=2x130x130x4xf16=1.0 --input=3x3x4x320xf16=1.0 --input=320xf32=1.0

iree/runtime/src/iree/hal/drivers/hip/native_executable.c:186: INTERNAL; HIP driver error 'hipErrorNoBinaryForGpu' (209): no kernel image is available for execution on the device; mismatched target chip? missing/wrong bitcode directory?; while invoking native function hal.executable.create; while calling import;
[ 1]   native hal.executable.create:0 -
[ 0] bytecode module.__init:306 test.mlir:13:11
      at test.mlir:1:1; creating VM context; creating run context
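
The error text itself asks whether the target chip is mismatched, so one quick thing to rule out is that the chip passed to iree-compile differs from the one on the machine. The issue does not say that this was the cause here (the fix landed in the PR), so treat the following as a minimal diagnostic sketch only. The helper names are hypothetical, the device architecture string is assumed to be supplied by hand (for example copied from rocminfo output), and the handling of ":feature" suffixes such as "gfx942:sramecc+:xnack-" is an assumption about how that string may look.

```python
import shlex


def target_chip_from_command(command):
    """Return the value of --iree-rocm-target-chip in an iree-compile
    command line, or None if the flag is absent."""
    for token in shlex.split(command):
        if token.startswith("--iree-rocm-target-chip="):
            return token.split("=", 1)[1]
    return None


def chip_matches(compiled_for, device_arch):
    """Compare the compile-time chip with a device architecture string.

    Assumption: the device string may carry feature suffixes after a
    colon (e.g. "gfx942:sramecc+:xnack-"); only the part before the
    first colon is compared.
    """
    if compiled_for is None:
        return False
    return device_arch.split(":", 1)[0] == compiled_for


# Shortened form of the compile command from this report.
compile_cmd = (
    "iree-compile test.mlir --iree-hal-target-backends=rocm "
    "--iree-rocm-target-chip=gfx942 -o test_rocm.vmfb"
)
chip = target_chip_from_command(compile_cmd)
print(chip, chip_matches(chip, "gfx942:sramecc+:xnack-"))
```

If the two match, as they would on a gfx942 machine, the mismatch hint in the error message can be set aside and the problem is more likely in the generated kernel itself.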

Steps to reproduce your issue

Cherry-pick the commit from here: https://github.com/iree-org/iree/pull/17921/commits

Repro MLIR:

func.func @test(%5: tensor<2x130x130x4xf16>, %6: tensor<3x3x4x320xf16>, %7: tensor<320xf32>) -> ( tensor<2x128x128x320xf16>, tensor<2x320x128x128xf16> )  {
  %cst = arith.constant 0.000000e+00 : f32
  %c286016 = arith.constant 286016 : index
  %c22976 = arith.constant 22976 : index
  %c21696 = arith.constant 21696 : index
  %c556416 = arith.constant 556416 : index
  %c21527936 = arith.constant 21527936 : index
  %8 = tensor.empty() : tensor<2x320x128x128xf16>
  %9 = tensor.empty() : tensor<2x128x128x320xf16>
  %10 = tensor.empty() : tensor<2x128x128x320xf32>
  %11 = linalg.fill  ins(%cst : f32) outs(%10 : tensor<2x128x128x320xf32>) -> tensor<2x128x128x320xf32>
  %12 = linalg.conv_2d_nhwc_hwcf {dilations = dense<1> : vector<2xi64>, strides = dense<1> : vector<2xi64>} ins(%5, %6 : tensor<2x130x130x4xf16>, tensor<3x3x4x320xf16>) outs(%11 : tensor<2x128x128x320xf32>) -> tensor<2x128x128x320xf32>
  %13:2 = linalg.generic {indexing_maps = [affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>, affine_map<(d0, d1, d2, d3) -> (d3)>, affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>, affine_map<(d0, d1, d2, d3) -> (d0, d3, d1, d2)>], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} ins(%12, %7 : tensor<2x128x128x320xf32>, tensor<320xf32>) outs(%9, %8 : tensor<2x128x128x320xf16>, tensor<2x320x128x128xf16>)  {
  ^bb0(%in: f32, %in_0: f32, %out: f16, %out_1: f16):
    %14 = arith.addf %in, %in_0 : f32
    %15 = arith.truncf %14 : f32 to f16
    linalg.yield %15, %15 : f16, f16
  } -> (tensor<2x128x128x320xf16>, tensor<2x320x128x128xf16>)
  return %13#0, %13#1 : tensor<2x128x128x320xf16>, tensor<2x320x128x128xf16>
}
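
For checking a fixed build, it helps to know what the repro should print once it runs. With every input set to 1.0, as in the run command, each convolution output is a sum of 3*3*4 ones, the generic op adds the bias of 1.0, and the result is truncated to f16 and written in both NHWC and NCHW layouts. The sketch below works out the expected shapes and the constant value; the function names are hypothetical, and it assumes an unpadded convolution with stride 1 and dilation 1, as in the MLIR above.

```python
def conv_out_dim(in_dim, kernel_dim, stride=1, dilation=1):
    """Output extent of an unpadded convolution along one spatial axis."""
    effective_kernel = dilation * (kernel_dim - 1) + 1
    return (in_dim - effective_kernel) // stride + 1


def expected_for_constant_inputs(image_shape, filter_shape,
                                 image_val=1.0, filter_val=1.0, bias_val=1.0):
    """Shapes and constant value of both results when every input
    tensor is filled with a single value.

    image_shape is NHWC, filter_shape is HWCF (conv_2d_nhwc_hwcf).
    """
    n, h, w, c = image_shape
    kh, kw, fc, f = filter_shape
    if c != fc:
        raise ValueError("image and filter channel counts differ")
    oh = conv_out_dim(h, kh)
    ow = conv_out_dim(w, kw)
    # Each output element accumulates kh * kw * c products, then the bias.
    value = kh * kw * c * image_val * filter_val + bias_val
    nhwc_shape = (n, oh, ow, f)   # first result, same layout as the conv
    nchw_shape = (n, f, oh, ow)   # second result, (d0, d3, d1, d2)
    return nhwc_shape, nchw_shape, value


nhwc, nchw, value = expected_for_constant_inputs((2, 130, 130, 4),
                                                 (3, 3, 4, 320))
print(nhwc, nchw, value)  # (2, 128, 128, 320) (2, 320, 128, 128) 37.0
```

The value 37 is exactly representable in f16, so a correct run should show 37 in every element of both outputs.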

Compile command:

iree-compile test.mlir --iree-hal-target-backends=rocm --iree-rocm-target-chip=gfx942 --iree-opt-const-eval=false --iree-global-opt-propagate-transposes=true --iree-global-opt-enable-fuse-horizontal-contractions=true --iree-flow-enable-aggressive-fusion=true --iree-opt-aggressively-propagate-transposes=true --iree-opt-outer-dim-concat=true --iree-vm-target-truncate-unsupported-floats --iree-llvmgpu-enable-prefetch=true --iree-opt-data-tiling=false --iree-codegen-gpu-native-math-precision=true --iree-codegen-llvmgpu-use-vector-distribution --iree-rocm-waves-per-eu=2 --iree-execution-model=async-external "--iree-preprocessing-pass-pipeline=builtin.module(iree-preprocessing-transpose-convolution-pipeline, util.func(iree-preprocessing-pad-to-intrinsics))" --iree-scheduling-dump-statistics-format=json --iree-scheduling-dump-statistics-file=compilation_info.json -o test_rocm.vmfb

Run command:

iree-run-module --module=test_rocm.vmfb --device=hip --input=2x130x130x4xf16=1.0 --input=3x3x4x320xf16=1.0 --input=320xf32=1.0

What component(s) does this issue relate to?

Runtime

Version information

No response

Additional context

No response

@pashu123 pashu123 added the bug 🐞 Something isn't working label Jul 19, 2024
@pashu123
Contributor Author

Added the fix here: #17921
