Make Matrix implementation more SIMD friendly
This backports some changes we made in OSL (which carried customized
versions of the Imath headers that we will thankfully no longer need
once these fixes are pushed back to the Imath project).

It may be tempting to use memset or memcpy rather than doing certain
element-by-element copies. Indeed, in a scalar context they generate
the same code (with the optimizer on, at least). But when those
operations are done inside a loop that you hope will be
auto-vectorized, the casting and function calls involved in
memset/memcpy can confuse the vectorizer, and you end up with
inferior code; sometimes the whole loop fails to vectorize.
This was all pointed out by people from Intel working on OSL, so
I'm deferring to their judgment that this is the best solution.
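
To make the distinction concrete, here is a minimal sketch of the two
copy styles. Mat4, copyWithMemcpy, copyByElement, and scaleAll are
hypothetical names used only for illustration; this is not the actual
Imath code, and whether a given compiler vectorizes the loop depends on
the compiler and its flags.

```cpp
#include <cstring>

// Hypothetical 4x4 float matrix for illustration; not the real Imath class.
struct Mat4
{
    float x[4][4];
};

// memcpy-style copy: equivalent in scalar code, but the cast to void* and
// the library call may hide the data flow from a loop vectorizer.
inline void copyWithMemcpy (Mat4& dst, const Mat4& src)
{
    std::memcpy (&dst, &src, sizeof (Mat4));
}

// Element-by-element copy: nothing but plain scalar assignments.
inline void copyByElement (Mat4& dst, const Mat4& src)
{
    dst.x[0][0] = src.x[0][0]; dst.x[0][1] = src.x[0][1];
    dst.x[0][2] = src.x[0][2]; dst.x[0][3] = src.x[0][3];
    dst.x[1][0] = src.x[1][0]; dst.x[1][1] = src.x[1][1];
    dst.x[1][2] = src.x[1][2]; dst.x[1][3] = src.x[1][3];
    dst.x[2][0] = src.x[2][0]; dst.x[2][1] = src.x[2][1];
    dst.x[2][2] = src.x[2][2]; dst.x[2][3] = src.x[2][3];
    dst.x[3][0] = src.x[3][0]; dst.x[3][1] = src.x[3][1];
    dst.x[3][2] = src.x[3][2]; dst.x[3][3] = src.x[3][3];
}

// The kind of loop one hopes will be auto-vectorized: each iteration copies
// a matrix into a temporary and then scales it by s.
void scaleAll (const Mat4* in, Mat4* out, int n, float s)
{
    for (int i = 0; i < n; ++i)
    {
        Mat4 tmp;
        copyByElement (tmp, in[i]); // swap in copyWithMemcpy to compare
        for (int r = 0; r < 4; ++r)
            for (int c = 0; c < 4; ++c)
                out[i].x[r][c] = tmp.x[r][c] * s;
    }
}
```

Both copy functions produce identical results; the difference, if any,
shows up only in the code generated for the enclosing loop, which can be
checked with the compiler's vectorization report.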

Signed-off-by: Larry Gritz <[email protected]>
lgritz committed Nov 13, 2020
1 parent 1fa1e00 commit 31accd2
Showing 1 changed file with 247 additions and 180 deletions.
