-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable SIMD with Base.literal_pow #332
Conversation
Once again, great stuff! Could use a rebase now that #331 is merged. We can add tests for this to the other SIMD tests in |
Perhaps go up to 4. Not that uncommon with power of 4 in physical formulas. |
@jrevels Thanks for reviewing & merging #331! I rebased this PR and (I think) fixed the code for new @KristofferC Yeah, it would be interesting to see if adding higher order implementation helps. I stopped at 3 because Julia Base itself only has it only up to 3 (and my benchmark shows it is better than no bound): I wonder why they chose 3 and what benchmark they used. |
Oh, I'll have a look at |
2e087e5
to
8550e49
Compare
So making the SIMD test work in Travis was a bit tricky since it requires StaticArrays and Pkg3 does not add the dependencies including StaticArrays to the global environment: https://travis-ci.org/JuliaDiff/ForwardDiff.jl/builds/406057434 I could manually add packages but I thought adding Project.toml and let Pkg3 handle everything would be the easiest solution. You are going to add Project.toml at some point anyway, right? |
Project.toml
Outdated
@@ -0,0 +1,12 @@ | |||
name = "ForwardDiff" | |||
uuid = "f6369f11-7733-5829-9624-2563aa707210" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fredrikekre Re: #334 (comment)
Thanks, I fixed uuid
in this PR.
For now, until there is a good automatic way of translating REQUIRE files, perhaps just This is missing test dependencies for example (and those are being tweaked at the moment). |
If it would pass tests (since we have JuliaLang/julia#26594 in v0.7), we could also just get rid of the separate If that doesn't work, we can then try |
About that... JuliaLang/julia#27049, JuliaLang/julia#27659. If you want it, I think you need to lobby for it. |
Aww I didn't realize JuliaLang/julia#27659 got closed :( |
The SIMD test problem is already fixed by using @jrevels Do you have some reasons to avoid start using Pkg3 now? If so, I'll revert the relevant commits and close #333. If using Pkg3 is fine, I think this PR is ready to go. But note that I tried to keep changes related to FYI some (my understanding of) details for why adding |
The reason not to include a IIUC, the plan is for the Pkg3 team to automate the process of adding a |
I see. That sounds very reasonable. I updated the PR to use |
Awesome stuff, thanks again for all this work @tkf! |
This branch eliminates run-time branching (introduced in #331) and enables SIMD (previously impossible in master) for power functions with low degree (<= 3). For example,
executed with
julia -O3
prints twofmul <4 x double>
instructions (presumably one for.value
and one for.partials[1]
):In #331 branch and master, it was not possible to generate IR with SIMD instructions:
My first implementation was using
@generated
and it generatedliteral_pow
code for any degree. I thought it was OK sinceBase.literal_pow
has a fall-back implementation for higher degree power functions but it turned out this implantation was slower when the benchmark involves high degree power functions. Limiting degree to three yields better performance for all degrees I've tired. I get similar benchmark results in two machines:Machine 1 (a laptop)
@generated
-based vs Special case x^0 #331: https://gist.github.com/b44e7cfc85e682fa293182eae086ac5ffor
-@eval
-based vs Special case x^0 #331: https://gist.github.com/2283e2397df7dd604147cf9077190084Machine 2 (a desktop)
@generated
-based vs Special case x^0 #331: https://gist.github.com/tkf/0350bc6f30405b489c41b697d3692b88for
-@eval
-based vs Special case x^0 #331: https://gist.github.com/3834bdee95980c7a9268cc7e57b9f495