-
Notifications
You must be signed in to change notification settings - Fork 631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GPU][DT] Switch to query MMA intrinsics from IREE::GPU::TargetAttr. #18241
[GPU][DT] Switch to query MMA intrinsics from IREE::GPU::TargetAttr. #18241
Conversation
The revision also moves the lit test to `Common/GPU/test/`. Signed-off-by: hanhanW <[email protected]>
Signed-off-by: hanhanW <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why it is marked as a new file. I ran git mv ...
when I created the change. There are no much changes in the file. The main difference is that it defines a target attribute and add it to func op's attribute.
It also pads checks with spaces, so the code can start on the same column.
// TODO(hanchung): Remove the wrapper after allowing the type converter to carry | ||
// the targetAttr. For now, follow what CPU is doing. | ||
static MaterializeEncodingFn | ||
getMaterializeEncodingFn(IREE::HAL::ExecutableTargetAttr targetAttr) { | ||
return | ||
[targetAttr]( | ||
RankedTensorType tensorType) -> FailureOr<MaterializeEncodingInfo> { | ||
return materializeEncodingForTarget(tensorType, targetAttr); | ||
}; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This TODO will be killed after we land #18242
Signed-off-by: hanhanW <[email protected]>
#encoding = #iree_encoding.encoding<operand_index = 1, op_type = matmul, element_types = [f32, f32, f32], original_type = tensor<255x513xf32>, | ||
user_indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>, affine_map<(d0, d1, d2) -> (d2, d1)>, affine_map<(d0, d1, d2) -> (d0, d1)>], | ||
round_dims_to = array<i64: 16, 16, 16>> | ||
#executable_target_rocm_hsaco_fb = #hal.executable.target<"rocm", "rocm-hsaco-fb", { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to embed this whole structure in test if you use --iree-gpu-test-target=cdna3
or something to have it "autopouplate". You'd need to use getGPUTargetAttr
utility function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little tricky in the data-tiling setup because the type converter only takes tensor type and executable target attribute. We want an unified API setup for all the backends. The IREE::GPU::TargetAttr is defined separately which makes it hard. Perhaps I can expose the getCLGPUTarget
method, so I can have better setup for the test. It is also needed in my other prototype. I'll fix it in a follow-up.
2b1310b
into
iree-org:shared/gpu-data-tiling-materialize-encoding
The revision also moves the lit test to
Common/GPU/test/
.