-
Notifications
You must be signed in to change notification settings - Fork 631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reapply "[Codegen][GPU] Add range information to GPU dispatch IDs" (#19361) #19372
Conversation
a6ba1ba
to
780b2ab
Compare
def HAL_InterfaceWorkgroupIDOp : HAL_PureOp<"interface.workgroup.id", [ | ||
DeclareOpInterfaceMethods<OpAsmOpInterface, ["getAsmResultNames"]>, | ||
]> { | ||
class HAL_InterfaceWorkgroupOp<string mnemonic, list<Trait> traits = []> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice cleanup!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for HAL changes! may want @MaheshRavishankar or someone to take a peek at the codegen side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC this got reverted for a regression. If the regression is fixed LGTM!
780b2ab
to
dbba7fa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could explain what fixed the regression that caused the initial revert? It would be good to include this in the PR description.
…ree-org#19361) This reverts commit cb5be1d. Compaled to the previous revision, this one works around a correctness bug in dataflow analysis that's being fixed by removing the analysis after SCF->CF. --- First, this patch implements InferIntRangeInterface for hal.interface.workgroup.{size,id,count} using a local upper_bound attribute. Then, it adds a -iree-codegen-gpu-propagate-dispatch-size-bounds pass that adds these upper_bounds identifiers to the interface.workgroup operations and to gpu.thread_id based on static information available late in the codegen pipeline. Then, it uses -optimize-int-arithmetic to optimize indexing after -lower-affine, getting rid of a bunch of "if the input's negative" logic that isn't actually needed in many of our kernels. It also ensures that these upper_bound values propagate to LLVM.
dbba7fa
to
7edb543
Compare
This reverts commit cb5be1d.
Compaled to the previous revision, this one works around a correctness
bug in dataflow analysis that's being fixed by removing the analysis
after SCF->CF.
First, this patch implements InferIntRangeInterface for hal.interface.workgroup.{size,id,count} using a local upper_bound attribute.
Then, it adds a -iree-codegen-gpu-propagate-dispatch-size-bounds pass that adds these upper_bounds identifiers to the interface.workgroup operations and to gpu.thread_id based on static information available late in the codegen pipeline.
Then, it uses -optimize-int-arithmetic to optimize indexing after -lower-affine, getting rid of a bunch of "if the input's negative" logic that isn't actually needed in many of our kernels.
It also ensures that these upper_bound values propagate to LLVM.