Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce TritonStructured dialect and triton-to-structured pass #82

Merged
merged 10 commits into from
Jan 19, 2024

Conversation

haishanzzzz
Copy link
Contributor

@haishanzzzz haishanzzzz commented Jan 10, 2024

This PR introduces TritonStructured dialect and triton-to-structured pass. Please see #81 for background.

This PR is broken into 5 commits:

  1. Introduce TritonStructured dialect, which includes three ops: tts.make_tptr, tts.load, and tts.store
  2. Update the current MaskAnalysis, which removed unused functions that produces different flavors of subviews
  3. Minor updates to OpFoldResultsUtils
  4. Introduce triton-to-structured pass
  5. Adds LIT tests

A few notes:

  • triton-to-structured does not use DialectConversion but rather manually walks the IR to perform rewriting. The reason is this pass does not try to legalize Triton operations (in fact it allows them to live after the pass) but rather opportunistically rewrite certain ops if the analysis succeeds. Legality marking in DialectConversion does not provide a way to set certain instances of an op as legal, which makes it a poor fit for our purpose.
  • Compared to triton-to-linalg, triton-to-structured is more robust when encountering IRs that it cannot analyze. Specifically, it emitRemark when analysis fails, emitWarning when it may (although very unlikely) produce wrong results, and only fail when a logic error is detected. This graceful failure can be demonstrated by manually rerunning wraparound_unsupported_add_offset.mlir.
  • LIT tests are directly borrowed from the current tests for triton-to-linalg. Unrelated tests are removed and output is manually examined to verify correctness (except for those ones generated for Triton tutorials).
  • tt.make_tptr will only be used to handle tensor of pointers.
  • When pointer analysis fails, or when the original code contains scalar pointer accesses, tt.addptr will be left in the output of the pass. They should be converted to memref operations in structured-to-memref.
  • Analysis code is unfortunately duplicated for the time being. Please bear with the awkward names.

A main follow up to this PR is to get consensus on how we'd like to handle block pointers in this pass.

There are a few other misc. items on the wishlist for the new code:

  • Cache pointer and mask analysis results to avoid introducing duplicated ops
  • More testing on wrap around behavior
  • Pretty printer, verifier, canonicalizer for newly introduced ops
  • Investigate why tt.reduce make --remove-dead-values pass fail

@nhat-nguyen nhat-nguyen self-requested a review January 10, 2024 14:08
@nhat-nguyen
Copy link
Collaborator

Thank you @haishanzzzz, I'll start taking a look :)

@haishanzzzz
Copy link
Contributor Author

@haishanzzzz please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.

@microsoft-github-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"

Contributor License Agreement

@microsoft-github-policy-service agree company="Meta"

@manbearian
Copy link
Collaborator

@haishanzzzz please , can you tell me more about this:

Investigate why tt.reduce make --remove-dead-values pass fail

@haishanzzzz
Copy link
Contributor Author

@haishanzzzz please , can you tell me more about this:

Investigate why tt.reduce make --remove-dead-values pass fail

This is technically not related to this PR, but if I just do triton-opt --remove-dead-values with tt.reduce things actually fail.

triton-opt --remove-dead-values reducemax_32_256_bf16.mlir                                                                                                                                                                    16:57:35
reducemax_32_256_bf16.mlir:35:7: error: null operand found
      tt.reduce.return %22 : bf16
      ^
reducemax_32_256_bf16.mlir:35:7: note: see current operation: "tt.reduce.return"(<<NULL VALUE>>) : (<<NULL TYPE>>) -> ()

Is this what you see too?

@nhat-nguyen
Copy link
Collaborator

@haishanzzzz I reviewed the changes and left some comments, overall, I think the approach is good and makes the code much simpler. Love it.

@nhat-nguyen
Copy link
Collaborator

@haishanzzzz Would you mind updating the description with the decision around keeping tt.addptr for scalars after this pass for structured-to-memref to process later?

@haishanzzzz
Copy link
Contributor Author

@haishanzzzz Would you mind updating the description with the decision around keeping tt.addptr for scalars after this pass for structured-to-memref to process later?

Updated the description to include more info on relying on tt.addptr for scalar pointers, and that we should expect to see tt.addptr at the output of the pass if analysis fails or for scalar pointers.

@haishanzzzz
Copy link
Contributor Author

@nhat-nguyen @manbearian Please let me know if there are other things I can do before closing this PR.

Copy link
Collaborator

@nhat-nguyen nhat-nguyen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@haishanzzzz This looks good to me. Thank you for addressing all the comments. You just need to update this with the main branch, once done we will be able to land this.

@nhat-nguyen nhat-nguyen merged commit ff953a8 into microsoft:main Jan 19, 2024
2 checks passed
nhat-nguyen pushed a commit that referenced this pull request Feb 1, 2024
* Introduce TritonStructured dialect

* Updated mask analysis

* Update OpFoldResultUtils

* triton-to-structured pass

* LIT tests

* Address review comments

* Revert header name change

* Remove copied version of mask analysis
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants