Document `@simd` and remove invalid uses (#27495)
Fixes #27482. Copies the information from the performance tips into a docstring; makes it a bit scarier.
base/simdloop.jl (Outdated)
> This feature is experimental and could change or disappear in future versions of Julia.
> Incorrect use of the `@simd` macro may cause unexpected results.
>
> By using `@simd`, you are are asserting several properties of the loop:
are are
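As a point of reference, here is a minimal sketch (my illustration, not code from the PR) of a loop for which the assertions described in the quoted docstring hold, assuming the two arrays do not alias:

```julia
# Every iteration is independent of every other, so the iterations may be
# safely reordered or vectorized.
function double!(y::Vector{Float64}, x::Vector{Float64})
    @inbounds @simd for i in eachindex(y, x)
        y[i] = 2 * x[i]
    end
    return y
end

double!(zeros(4), [1.0, 2.0, 3.0, 4.0])  # returns [2.0, 4.0, 6.0, 8.0]
```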
I started with the HTML docs here, assuming they were in sync with the source. They weren't, so I'll have to go back and incorporate those changes. Perhaps I should also just pare down the …
Ok, updated to use PerformanceTips.md as more of a pointer to … Any further comments?
doc/src/manual/performance-tips.md (Outdated)
> like allowing floating-point re-associativity and ignoring dependent memory accesses. Again, be very
> careful when asserting `@simd` as erroneously annotating a loop with dependent iterations may result
> in unexpected results. In particular, note that `setindex!` on some `AbstractArray` subtypes is
> enherently dependent upon iteration order. **This feature is experimental**
typo: enherently -> inherently
Will this slow down a lot of operations, e.g. …?
Would be good to check Nanosoldier just to see the eventual damage.
@nanosoldier
Things should be marginally better now: @nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan
I've added a few workarounds for common cases. I know we'll still have some broadcast regressions but it's unavoidable — we simply cannot tell if it's safe to perform …
@nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan
Oh interesting — looks like Nanosoldier has two different race conditions in what it compares. This is an obvious one both ways: this PR got the effects of #27331 but the "master" comparison didn't. At the same time it was triggered on a horribly broken commit here, but by the time the queue cleared it picked up the latest commit. In any case: we can still see the effects. Looks like I should add some …
I knew I wasn't crazy!
This is ready to go by my account. AppVeyor failed a download with a 404; I don't think the Travis failures are related, but I've not seen them before. Mac failed with a …
I am not sure that the notion of a `simdable` function is viable. I agree that we are currently a bit too aggressive, but I do not think the problem is impure functions. From my understanding the problem is arrays that alias each other, since that could lead to results that differ between a SIMD loop and a non-SIMD loop.
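A hedged sketch of that aliasing concern (names are illustrative, not from the PR): the two views below overlap, so each iteration reads a location the previous iteration wrote, and annotating the loop with `@simd` may change the result.

```julia
function shifted_fill!(dest, src)
    @inbounds @simd for i in eachindex(dest, src)
        dest[i] = src[i] + 1   # invalid under @simd if dest and src overlap
    end
    return dest
end

x = collect(1.0:10.0)
# dest aliases x[2:10] and src aliases x[1:9]; iteration i reads what
# iteration i-1 wrote, so SIMD and non-SIMD execution can disagree.
shifted_fill!(view(x, 2:10), view(x, 1:9))
```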
```julia
# The @simd transformation is only valid in limited situations
struct SIMDableFunction{f}; end
(::SIMDableFunction{f})(args...) where {f} = f(args...)
simdable(f::Union{map(typeof, (+, *, &, |, add_sum, mul_prod, -, /, ^, identity))...}) = SIMDableFunction{f}()
```
I am not sure that this is meaningful. The question is: when is a function not valid to be executed as SIMD? When it has side-effects? `@simd` is advice to LLVM that gives it permission to perform some limited transformations and encourages it to be more lenient, but LLVM still has to prove that the SIMD transformation is valid and beneficial.
Yes, this is a poor name. I don't really want to be building a trait-system here — I wanted to just use a hard-coded `Union{}` — but due to bootstrapping issues not all the functions I wanted to whitelist were available. Of course we cannot guarantee that these functions won't have an invalid method on them.

Really, the problem is that `@simd` is non-local. It affects any function that gets inlined — even passed user functions. We have no idea if the particular method is going to have observable side-effects that break the SIMD guarantees. This is an attempt at a whitelist to recoup some of the performance in common cases — but perhaps we shouldn't even try to do that.

> LLVM still has to prove that the SIMD transformation is valid and beneficial

My understanding is that `@simd` is promising that the transformation is valid — allowing LLVM to do things that it cannot prove.
> allowing LLVM to do things that it cannot prove.

Essentially true, but the list above is largely redundant with things that LLVM should be able to prove (since it's approximately just a subset of known pure functions), and, beyond that, is incorrect in general (there's no guarantee – or expectation – that my implementation of any of those functions is actually pure).
> My understanding is that `@simd` is promising that the transformation is valid

It defines that certain transforms are valid, but we implement that list. Currently it contains:
- primitive ops that contribute towards the computation of a floating-point reduction may be reordered (reassociated / commuted)
- all memory writes are independent of all other memory writes and reads performed inside the loop (e.g. all memory that is read from is read-only, and no location is written to twice)
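A sketch of the first point (my example, not from the PR): the additions into `s` form a floating-point reduction, so `@simd` allows them to be reassociated, and the result may differ in the last bits from strict left-to-right summation.

```julia
function simd_sum(x::Vector{Float64})
    s = 0.0
    @inbounds @simd for i in eachindex(x)
        s += x[i]   # reduction: @simd permits reordering these additions
    end
    return s
end
```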
Ah, great, that is extremely helpful. So if we disable the memory independence assertion (b68efd3), is it correct that one could safely annotate `@simd` over unknown generic functions? My read is yes: this simply adds extra handling for reduction variables — which, by definition, must be lexically visible to the `@simd` author.

Memory dependencies, on the other hand, might not be lexically visible to the `@simd` author. Doing something like #19658 could perhaps allow us to recoup some of the lost perf in the future in a much safer and more precise way.
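A hypothetical sketch of such a hidden dependence (all names here are my own illustration, not from the PR): the read-after-write dependence lives inside `bump!`, so it is not lexically visible at the `@simd` annotation site.

```julia
const buf = zeros(Float64, 16)
bump!(i) = (buf[i] = buf[max(i - 1, 1)] + 1.0)   # reads the previous iteration's write

function run_bumps!(n)
    @simd for i in 1:n   # the dependence through `buf` is invisible here
        bump!(i)
    end
    return buf
end

run_bumps!(16)
```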
That's my understanding too. Although there's some non-locality in the definition of what constitutes "computation of the reduction", since we just examine the use/def graph after optimization. It is possible to make a closure which obscures the reduction variable lexically. For example:

```julia
# The reduction variable is lexically visible:
result = Ref(0.0)
for i = 1:10
    result[] += i
end

# The reduction is obscured behind a closure:
result = Ref(0.0)
update(i) = (result[] += i)
for i = 1:10
    update(i)
end
```

This pattern appears, for example, in the `Stateful` iterator.
I'd be okay dropping the last two commits here if we want to err more conservatively. It's really sad, but I think we should also seriously consider deleting …
The more I think about it the more I agree with this... I suspect that we need something that does a scope-local annotation allowing operations to be reordered, plus an annotation on memory loads marking them as safe to execute in parallel and non-aliasing. But the current form is not composable.
Let's merge this for now and we can begin a broader discussion about the future of SIMD in a new issue.
Maybe we should redo the Nanosoldier run, so we can get a real picture of the results. We could also instead merge b68efd3#commitcomment-29355867, which just completely removes the offending optimization – and appears to affect fewer tests.
@nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan
Even if we do merge b68efd3#commitcomment-29355867, I think this patch is still needed due to the non-local effects of …
Failures all look unrelated: Sockets and SharedArrays failures with Travis (…)
> currently needed for all array accesses. The compiler can sometimes turn
> short `&&`, `||`, and `?:` expressions into straight-line code if it is safe
> to evaluate all operands unconditionally. Consider using the [`ifelse`](@ref)
> function instead of `?:` in the loop if it is safe to do so.
In general, this would be bad advice, since it's harder to generate fast code for `ifelse` than for `?:` (it's generally easy for the compiler to optimize `?:` into `ifelse` if it would be profitable, but the reverse is not true). The better advice (often) is to avoid calling functions in the branches and instead just select between two values. (This is also essentially equivalent to defining `ifelse(x, y, z) = (x ? y : z)`, although we wouldn't currently infer that definition any better than the existing `ifelse`, because we don't do IPO conditional-value propagation.)
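A hedged sketch of that value-selection pattern (function and names are my own, not from the PR): both arms of the ternary are plain values, so the compiler can lower the selection to a branch-free select inside the loop.

```julia
function positive_sum(x::Vector{Float64})
    s = 0.0
    @inbounds @simd for i in eachindex(x)
        v = x[i]
        s += v > 0 ? v : 0.0   # selects between two values; no calls in either branch
    end
    return s
end
```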
Superseded by #27670