-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix #6599 #6605
Conversation
for a in ea | ||
if first # first "arg" is the function name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't strictly necessary, since the first arg is usually a TopNode. but this should be more correct
this inlining threshold seems to translates to approximately 8 expressions of low complexity, and about the same number of llvm instructions
return true | ||
symlim = div(8,occurences) | ||
if length(body.args) < symlim | ||
symlim *= 12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @stevengj's analysis (e805fd6#commitcomment-6089161)
and some of my own, I think we were significantly overestimating the conversion factor. i think we are roughly 1-to-1 for the symbols counted by occurs_more and the number of arguments to the llvm intrinsics
the estimate here is that 12 symbols in a line is relatively short. the limit on the number of lines is intended mostly as an optimization to skip counting when it is unlikely to be worth inlining
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I'd like this part of the change to be a separate patch, to separate it from the new
bugfix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changing this back will potentially limit inlining more than before the patch, since now this test also controls inlining of arguments into functions
I can confirm that my FFT performance seems fine with this patch. |
With this patch, #6437 is 32x faster 😄 for two dimensions. But it still does not get inlined in 3 dimensions, and hence is 100x slower than a hand-written loop. It would be great to get it inlined at least up to 4 dimensions (although eventually it just needs to be inlined, period). |
if first | ||
first = false | ||
isa(a,SymbolNode) || return false | ||
typ = (a::SymbolNode).typ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this should use exprtype
and then look for Type{T}
, where T
is an immutable type.
The |
they are in separate commits, just in the same pull request (updated again) |
Jeff, can you review this. I don't know how to test it effectively.