Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix #6599 #6605

Merged
merged 3 commits into from
Apr 26, 2014
Merged

fix #6599 #6605

merged 3 commits into from
Apr 26, 2014

Conversation

vtjnash
Copy link
Member

@vtjnash vtjnash commented Apr 23, 2014

Jeff, can you review this. I don't know how to test it effectively.

for a in ea
if first # first "arg" is the function name
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't strictly necessary, since the first arg is usually a TopNode. but this should be more correct

this inlining threshold seems to translates to approximately 8
expressions of low complexity, and about the same number of llvm instructions
return true
symlim = div(8,occurences)
if length(body.args) < symlim
symlim *= 12
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from @stevengj's analysis (e805fd6#commitcomment-6089161)

and some of my own, I think we were significantly overestimating the conversion factor. i think we are roughly 1-to-1 for the symbols counted by occurs_more and the number of arguments to the llvm intrinsics

the estimate here is that 12 symbols in a line is relatively short. the limit on the number of lines is intended mostly as an optimization to skip counting when it is unlikely to be worth inlining

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I'd like this part of the change to be a separate patch, to separate it from the new bugfix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changing this back will potentially limit inlining more than before the patch, since now this test also controls inlining of arguments into functions

@stevengj
Copy link
Member

I can confirm that my FFT performance seems fine with this patch.

@timholy
Copy link
Member

timholy commented Apr 23, 2014

With this patch, #6437 is 32x faster 😄 for two dimensions. But it still does not get inlined in 3 dimensions, and hence is 100x slower than a hand-written loop. It would be great to get it inlined at least up to 4 dimensions (although eventually it just needs to be inlined, period).

if first
first = false
isa(a,SymbolNode) || return false
typ = (a::SymbolNode).typ
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should use exprtype and then look for Type{T}, where T is an immutable type.

@JeffBezanson
Copy link
Member

The new thing is strictly a bug fix, so it should really go in a separate commit. The inline_worthy change is fine too, it just needs to be separated.

@vtjnash
Copy link
Member Author

vtjnash commented Apr 25, 2014

they are in separate commits, just in the same pull request (updated again)

@vtjnash vtjnash merged commit 7033cc5 into master Apr 26, 2014
@vtjnash vtjnash deleted the jn/6599 branch August 11, 2014 22:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants