Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EarlyCSE] Compare GEP instructions based on offset #65875

Merged
merged 9 commits into from
Sep 19, 2023

Conversation

DianQK
Copy link
Member

@DianQK DianQK commented Sep 10, 2023

Closes #65763.
This will provide more opportunities for constant propagation for subsequent optimizations.

@nikic
Copy link
Contributor

nikic commented Sep 10, 2023

Please check the compile-time impact of this change. The impact was high when I tried something like this before, but maybe I made a mistake.

@DianQK
Copy link
Member Author

DianQK commented Sep 11, 2023

Please check the compile-time impact of this change. The impact was high when I tried something like this before, but maybe I made a mistake.

Could you add https://github.com/DianQK/llvm-project to https://llvm-compile-time-tracker.com/? Thanks.

@nikic
Copy link
Contributor

nikic commented Sep 11, 2023

Please check the compile-time impact of this change. The impact was high when I tried something like this before, but maybe I made a mistake.

Could you add https://github.com/DianQK/llvm-project to https://llvm-compile-time-tracker.com/? Thanks.

I'll do this on Thursday. I'm currently on vacation and don't have the SSH key with me.

@nikic
Copy link
Contributor

nikic commented Sep 14, 2023

Please check the compile-time impact of this change. The impact was high when I tried something like this before, but maybe I made a mistake.

Could you add https://github.com/DianQK/llvm-project to https://llvm-compile-time-tracker.com/? Thanks.

I'll do this on Thursday. I'm currently on vacation and don't have the SSH key with me.

Done, results are here: http://llvm-compile-time-tracker.com/compare.php?from=ac1daad9bb4eb083df6b215c029816d3149e00d8&to=e079f3f9107f55e1595a47aafec5fe4e03277665&stat=instructions:u

llvm/lib/Transforms/Scalar/EarlyCSE.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Scalar/EarlyCSE.cpp Outdated Show resolved Hide resolved
@DianQK
Copy link
Member Author

DianQK commented Sep 14, 2023

Please check the compile-time impact of this change. The impact was high when I tried something like this before, but maybe I made a mistake.

Could you add https://github.com/DianQK/llvm-project to https://llvm-compile-time-tracker.com/? Thanks.

I'll do this on Thursday. I'm currently on vacation and don't have the SSH key with me.

Done, results are here: http://llvm-compile-time-tracker.com/compare.php?from=ac1daad9bb4eb083df6b215c029816d3149e00d8&to=e079f3f9107f55e1595a47aafec5fe4e03277665&stat=instructions:u

Sad result. My initial guess was that in GEP 1 1 .... 1 1 a did redundant calculations.
But DianQK@f3fdea4 got worse results.
My other guess is that iteration itself is expensive. I'll try again. If that still doesn't work, I'd look into valgrind related tools.

This will provide more opportunities for
constant propagation for subsequent optimizations.
@DianQK DianQK force-pushed the early-cse-offset-based-for-geps branch from 78ed2a0 to 31e2ec9 Compare September 16, 2023 09:31
@DianQK
Copy link
Member Author

DianQK commented Sep 16, 2023

It's not ready for review yet. It may contain many nits.

Progress update:
Since lookups require frequent computation of constant offsets. GEP should not be considered as a SimpleValue.
I got better results than the last commit. I'll analyze it next with specific source code.

https://llvm-compile-time-tracker.com/compare.php?from=ac1daad9bb4eb083df6b215c029816d3149e00d8&to=31e2ec9d89aeded9ea5da822262449b0c4e8ab16&stat=instructions:u

@DianQK DianQK marked this pull request as draft September 16, 2023 09:37
@DianQK
Copy link
Member Author

DianQK commented Sep 17, 2023

Using references works well!
https://llvm-compile-time-tracker.com/compare.php?from=ac1daad9bb4eb083df6b215c029816d3149e00d8&to=089349f9eb47c57cfa2049ce0af5c7dd079a8581&stat=instructions%3Au

I've also tried some approaches.
It should be useful to avoid offset calculations. I tried counting only with a GEP i8, but there is some extra cost to the determination of this one.
Simplifying the offset calculation is of some use, but I'm not sure it makes sense. Results are here https://llvm-compile-time-tracker.com/compare.php?from=ac1daad9bb4eb083df6b215c029816d3149e00d8&to=8ba478eb77c6dc09b87e9b511614e45e167be2b5&stat=instructions%3Au

@DianQK DianQK marked this pull request as ready for review September 17, 2023 08:19
@nikic
Copy link
Contributor

nikic commented Sep 17, 2023

Okay, those results look reasonable. It's somewhat unfortunate that this requires duplicating a bunch of code for the separate GEP handling. I'm wondering, would we be able to achieve similar results if we add a DenseMap<GetElementPtrInst *, APInt> cache for the offset but otherwise keep the previous code structure? Or would the extra hash lookup make this slow again?

@DianQK
Copy link
Member Author

DianQK commented Sep 17, 2023

I'm wondering, would we be able to achieve similar results if we add a DenseMap<GetElementPtrInst *, APInt> cache for the offset but otherwise keep the previous code structure? Or would the extra hash lookup make this slow again?

I feel like I misunderstood. Do you mean putting the GEP back into SimpleValue? And then create a global (or EarlyCSE only) cache value?

@nikic
Copy link
Contributor

nikic commented Sep 18, 2023

I'm wondering, would we be able to achieve similar results if we add a DenseMap<GetElementPtrInst *, APInt> cache for the offset but otherwise keep the previous code structure? Or would the extra hash lookup make this slow again?

I feel like I misunderstood. Do you mean putting the GEP back into SimpleValue? And then create a global (or EarlyCSE only) cache value?

Yes (EarlyCSE only cache). This is just a suggestion for something to check, but I suspect the performance of that approach will not be good enough and we have to go with this version.

llvm/lib/Transforms/Scalar/EarlyCSE.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Scalar/EarlyCSE.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Scalar/EarlyCSE.cpp Outdated Show resolved Hide resolved
@DianQK
Copy link
Member Author

DianQK commented Sep 19, 2023

I'm wondering, would we be able to achieve similar results if we add a DenseMap<GetElementPtrInst *, APInt> cache for the offset but otherwise keep the previous code structure? Or would the extra hash lookup make this slow again?

I feel like I misunderstood. Do you mean putting the GEP back into SimpleValue? And then create a global (or EarlyCSE only) cache value?

Yes (EarlyCSE only cache). This is just a suggestion for something to check, but I suspect the performance of that approach will not be good enough and we have to go with this version.

I did not try this approach. My understanding is that this only adds extra lookups if applied to a single EarlyCSE. Since each result needs to be calculated once. The current patch is the no-hash approach as I understand it.
This result may be invalid if shared in EarlyCSE.

I made a new attempt to save uint64_t instead of APInt.
This result looks pretty good. https://llvm-compile-time-tracker.com/compare.php?from=ac1daad9bb4eb083df6b215c029816d3149e00d8&to=82f1eca7c180b0dfe8af36e933670f938a49bab1&stat=instructions%3Au

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

llvm/lib/Transforms/Scalar/EarlyCSE.cpp Outdated Show resolved Hide resolved
@DianQK DianQK merged commit 2d1e8a0 into llvm:main Sep 19, 2023
@DianQK DianQK deleted the early-cse-offset-based-for-geps branch September 19, 2023 22:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LoopFullUnrollPass failed due to constant inference in opaque pointer mode.
3 participants