replace checked_fptosi intrinsics with Julia implementation #14763
Let's check =)
@nanosoldier
This has been updated to do an explicit range check, so it should no longer utilise any undefined behaviour.
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels
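For readers following along, here is a minimal sketch of what an explicit range check before truncation can look like. It is not the code from this PR: the name `checked_trunc_sketch` is hypothetical, it only covers `Float64` to `Int32`, and it assumes a Julia 1.x `InexactError` constructor rather than the zero-argument `InexactError()` shown in the diff.

```julia
# Hypothetical helper, not the function added in this PR.
# Truncates a Float64 toward zero to Int32, throwing if the result would not fit.
function checked_trunc_sketch(::Type{Int32}, x::Float64)
    # Truncation toward zero fits in Int32 exactly when -2^31 - 1 < x < 2^31.
    # Both bounds are exactly representable as Float64.
    lo = -2147483649.0
    hi = 2147483648.0
    # NaN fails every comparison, so it is rejected here as well.
    if lo < x < hi
        # Safe: the range check guarantees the result is representable.
        return unsafe_trunc(Int32, x)
    end
    # Three-argument constructor from Julia 1.x (an assumption about the reader's version).
    throw(InexactError(:checked_trunc_sketch, Int32, x))
end
```

Because the comparison happens before the conversion, the unchecked truncation is only reached for inputs whose result is representable, which is the point of avoiding the undefined-behaviour case.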
Hmm, the slowdown seems to be due to the fact that the length of |
c92f76e to fea8f26
throw(InexactError())
end
end
else #
Was there going to be a comment here, and then you changed your mind?
I can't remember, this was quite a long time ago (if only I had left a comment on what that comment was going to be...)
@nanosoldier
@simonbyrne looks like you may have submitted the job with the wrong syntax, and then edited to the correct syntax? Nanosoldier ignores comment edits to prevent accidental job resubmission. Triggering it with a new comment should work: @nanosoldier
thanks!
Would this also work for Float16?
It should do, the logic is pretty much the same.
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels
8aa8741 to c9b115b
Actually, it won't work for |
c9b115b to 88c66b4
Updated. It seems like the change in performance is a bit of a wash. We may want to look at speeding up Any objections to merging? cc @ViralBShah @vchuravy |
No objections to merging. I would still be interested in an implementation for Float16, mostly because I am looking at improving our support for Float16 on platforms that support it natively.
It should be possible via an extra branch: just check if |
88c66b4 to bc7a951
That's a pretty bad slowdown in sparse indexing, worth looking into and profiling the difference.
I couldn't figure it out: there don't appear to be any conversion calls, and I can't recreate it locally. One more time: @nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels
The sparse issue seems to have mostly gone.
Given the inconsistency, it seems that the regressions are mostly noise. We do get a nice speed boost on some benchmarks though. Shall we merge this?
Out of curiosity, how reliable is nanosoldier if ±15% can be mainly attributed to noise?
bc7a951 to f0355cb
@nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels
Okay, we'll have to figure out how to improve |
We will probably want to backport this to get tests to pass on ARM/Power in the next release.
Fixes #14549.
As @yuyichao points out, this will affect inlining rules.