-
Notifications
You must be signed in to change notification settings - Fork 631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing executable linking when other targets are present. #19035
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
benvanik
added
bug 🐞
Something isn't working
compiler/dialects
Relating to the IREE compiler dialects (flow, hal, vm)
labels
Nov 5, 2024
benvanik
changed the title
Fixing VMVX linking when other targets are present.
[WIP] Fixing VMVX linking when other targets are present.
Nov 5, 2024
benvanik
force-pushed
the
users/benvanik/vmvx-linking-fix
branch
from
November 5, 2024 23:12
f84c197
to
5843c96
Compare
benvanik
changed the title
[WIP] Fixing VMVX linking when other targets are present.
[WIP] Fixing executable linking when other targets are present.
Nov 5, 2024
benvanik
changed the title
[WIP] Fixing executable linking when other targets are present.
Fixing executable linking when other targets are present.
Nov 5, 2024
benvanik
force-pushed
the
users/benvanik/vmvx-linking-fix
branch
from
November 6, 2024 00:24
5843c96
to
1b742fd
Compare
There is a typo in the example. The comma is in a wrong place.
It should be
|
benvanik
requested review from
antiagainst,
MaheshRavishankar,
kuhar,
qedawkins and
Groverkss
as code owners
November 6, 2024 00:44
benvanik
removed request for
antiagainst,
MaheshRavishankar,
kuhar,
qedawkins and
Groverkss
November 6, 2024 00:44
hanhanW
approved these changes
Nov 6, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yay!
kuhar
reviewed
Nov 6, 2024
compiler/src/iree/compiler/Codegen/SPIRV/SPIRVLinkExecutables.cpp
Outdated
Show resolved
Hide resolved
Groverkss
pushed a commit
to Groverkss/iree
that referenced
this pull request
Dec 1, 2024
…9035) The existing linking code was all kinds of wrong when multiple executables with disjoint entry points were present. Linking needs to be reworked in general but this incremental change ensures that targets only link executables that contain variants that use them. There are definitely still corner cases that don't work. This is a small step towards towards heterogeneous devices. The following example now compiles and runs: ```mlir #executable_target_embedded_elf_x86_64_ = #hal.executable.target<"llvm-cpu", "embedded-elf-x86_64", {target_triple = "x86_64-none-elf", native_vector_size = 4 : index}> #executable_target_embedded_elf_x86_64_whatever = #hal.executable.target<"llvm-cpu", "embedded-elf-x86_64", {target_triple = "x86_64-none-elf", native_vector_size = 16 : index}> #executable_target_vmvx_bytecode_fb = #hal.executable.target<"vmvx", "vmvx-bytecode-fb"> util.global private @device_a = #hal.device.target<"local", {ordinal = 0 : index}, [ #executable_target_embedded_elf_x86_64_, #executable_target_embedded_elf_x86_64_whatever ]> : !hal.device util.global private @device_b = #hal.device.target<"local", {ordinal = 1 : index}, [ #executable_target_vmvx_bytecode_fb ]> : !hal.device func.func public @mutli_device_mul_add( // Input argument is resident on device_a (tooling default to first device). %input_a: tensor<4xf32> {iree.abi.affinity = #hal.device.affinity<@device_a>} ) -> ( // Output result is expected to be on device_a (though not required). tensor<4xf32> {iree.abi.affinity = #hal.device.affinity<@device_a>} ) { // Compute on device_a (input is there). %constant_a = arith.constant dense<[0.0, 1.0, 2.0, 3.0]> : tensor<4xf32> %transient_a = arith.mulf %input_a, %constant_a : tensor<4xf32> // Transfer the result from device_a -> device_b. %transient_b = flow.tensor.transfer %transient_a : tensor<4xf32> to #hal.device.affinity<@device_b> // Compute on device_b. %constant_b = arith.constant dense<[4.0, 5.0, 6.0, 7.0]> : tensor<4xf32> %result_b = arith.mulf %transient_b, %constant_b : tensor<4xf32> // Transfer the result from device_b -> device_a. %result_a = flow.tensor.transfer %result_b : tensor<4xf32> to #hal.device.affinity<@device_a> // More compute on device_a - should produce into the result buffer. %result_a2 = arith.addf %result_a, %constant_a : tensor<4xf32> // Return the result on device_a (as required by ABI attr). func.return %result_a2 : tensor<4xf32> } ``` ```sh $ iree-compile --iree-execution-model=async-external iree-run-module-multi.mlir -o module.vmfb $ iree-run-module \ --module=module.vmfb --function=mutli_device_mul_add --input=4xf32=10,11,12,13 \ --device=local-task --device=local-task --task_topology_group_count=1 ``` (testing this in-tree is hard right now due to ergonomics issues - this is all experimental anyway)
giacs-epic
pushed a commit
to giacs-epic/iree
that referenced
this pull request
Dec 4, 2024
…9035) The existing linking code was all kinds of wrong when multiple executables with disjoint entry points were present. Linking needs to be reworked in general but this incremental change ensures that targets only link executables that contain variants that use them. There are definitely still corner cases that don't work. This is a small step towards towards heterogeneous devices. The following example now compiles and runs: ```mlir #executable_target_embedded_elf_x86_64_ = #hal.executable.target<"llvm-cpu", "embedded-elf-x86_64", {target_triple = "x86_64-none-elf", native_vector_size = 4 : index}> #executable_target_embedded_elf_x86_64_whatever = #hal.executable.target<"llvm-cpu", "embedded-elf-x86_64", {target_triple = "x86_64-none-elf", native_vector_size = 16 : index}> #executable_target_vmvx_bytecode_fb = #hal.executable.target<"vmvx", "vmvx-bytecode-fb"> util.global private @device_a = #hal.device.target<"local", {ordinal = 0 : index}, [ #executable_target_embedded_elf_x86_64_, #executable_target_embedded_elf_x86_64_whatever ]> : !hal.device util.global private @device_b = #hal.device.target<"local", {ordinal = 1 : index}, [ #executable_target_vmvx_bytecode_fb ]> : !hal.device func.func public @mutli_device_mul_add( // Input argument is resident on device_a (tooling default to first device). %input_a: tensor<4xf32> {iree.abi.affinity = #hal.device.affinity<@device_a>} ) -> ( // Output result is expected to be on device_a (though not required). tensor<4xf32> {iree.abi.affinity = #hal.device.affinity<@device_a>} ) { // Compute on device_a (input is there). %constant_a = arith.constant dense<[0.0, 1.0, 2.0, 3.0]> : tensor<4xf32> %transient_a = arith.mulf %input_a, %constant_a : tensor<4xf32> // Transfer the result from device_a -> device_b. %transient_b = flow.tensor.transfer %transient_a : tensor<4xf32> to #hal.device.affinity<@device_b> // Compute on device_b. %constant_b = arith.constant dense<[4.0, 5.0, 6.0, 7.0]> : tensor<4xf32> %result_b = arith.mulf %transient_b, %constant_b : tensor<4xf32> // Transfer the result from device_b -> device_a. %result_a = flow.tensor.transfer %result_b : tensor<4xf32> to #hal.device.affinity<@device_a> // More compute on device_a - should produce into the result buffer. %result_a2 = arith.addf %result_a, %constant_a : tensor<4xf32> // Return the result on device_a (as required by ABI attr). func.return %result_a2 : tensor<4xf32> } ``` ```sh $ iree-compile --iree-execution-model=async-external iree-run-module-multi.mlir -o module.vmfb $ iree-run-module \ --module=module.vmfb --function=mutli_device_mul_add --input=4xf32=10,11,12,13 \ --device=local-task --device=local-task --task_topology_group_count=1 ``` (testing this in-tree is hard right now due to ergonomics issues - this is all experimental anyway) Signed-off-by: Giacomo Serafini <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
bug 🐞
Something isn't working
compiler/dialects
Relating to the IREE compiler dialects (flow, hal, vm)
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The existing linking code was all kinds of wrong when multiple executables with disjoint entry points were present. Linking needs to be reworked in general but this incremental change ensures that targets only link executables that contain variants that use them. There are definitely still corner cases that don't work.
This is a small step towards towards heterogeneous devices. The following example now compiles and runs:
(testing this in-tree is hard right now due to ergonomics issues - this is all experimental anyway)