Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warnings during compilation with MAX_SIMD 1 #94

Closed
xnox opened this issue Jun 25, 2020 · 7 comments
Closed

Warnings during compilation with MAX_SIMD 1 #94

xnox opened this issue Jun 25, 2020 · 7 comments

Comments

@xnox
Copy link

xnox commented Jun 25, 2020

Compiling for 32bit arm v6 target without ability for NEON support, emits warnings:

# using https://github.com/BLAKE3-team/BLAKE3/pull/91
$ make test_asm CC=arm-linux-gnueabi-gcc uname_m=aarch64
arm-linux-gnueabi-gcc -O3 -Wall -Wextra -std=c11 -pedantic -DBLAKE3_TESTING -fsanitize=address,undefined  blake3.c blake3_dispatch.c blake3_portable.c main.c blake3_neon.o -o blake3
In file included from /usr/arm-linux-gnueabi/include/string.h:495,
                 from blake3.c:3:
In function ‘memcpy’,
    inlined from ‘compress_parents_parallel’ at blake3.c:236:5,
    inlined from ‘compress_subtree_to_parent_node’ at blake3.c:348:9,
    inlined from ‘blake3_hasher_update.part.0’ at blake3.c:527:7:
/usr/arm-linux-gnueabi/include/bits/string_fortified.h:34:10: warning: ‘__builtin_memcpy’ forming offset [33, 64] is out of the bounds [0, 32] of object ‘out_array’ with type ‘uint8_t[32]’ {aka ‘unsigned char[32]’} [-Warray-bounds]
   34 |   return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
blake3.c: In function ‘blake3_hasher_update.part.0’:
blake3.c:345:11: note: ‘out_array’ declared here
  345 |   uint8_t out_array[MAX_SIMD_DEGREE_OR_2 * BLAKE3_OUT_LEN / 2];
      |           ^~~~~~~~~
In file included from /usr/arm-linux-gnueabi/include/string.h:495,
                 from blake3.c:3:
In function ‘memcpy’,
    inlined from ‘compress_subtree_to_parent_node’ at blake3.c:349:5,
    inlined from ‘blake3_hasher_update.part.0’ at blake3.c:527:7:
/usr/arm-linux-gnueabi/include/bits/string_fortified.h:34:10: warning: ‘__builtin___memcpy_chk’ forming offset [33, 64] is out of the bounds [0, 32] of object ‘out_array’ with type ‘uint8_t[32]’ {aka ‘unsigned char[32]’} [-Warray-bounds]
   34 |   return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
blake3.c: In function ‘blake3_hasher_update.part.0’:
blake3.c:345:11: note: ‘out_array’ declared here
  345 |   uint8_t out_array[MAX_SIMD_DEGREE_OR_2 * BLAKE3_OUT_LEN / 2];
      |           ^~~~~~~~~

Is the out_array missized in such circumstances?

@sneves
Copy link
Collaborator

sneves commented Jun 26, 2020

According to the error messages out_array is uint8_t[32]. But your PR #91 makes MAX_SIMD_DEGREE unconditionally 4 on ARM, so shouldn't MAX_SIMD_DEGREE_OR_2 be 4 instead of 2, and that array be uint8_t[64]?

Either way, this warning happens when MAX_SIMD_DEGREE is 1 or 2, and I believe it's a false positive. When MAX_SIMD_DEGREE is that low, the function is never called:

while (num_cvs > 2) {
    num_cvs =
        compress_parents_parallel(cv_array, num_cvs, key, flags, out_array);
    memcpy(cv_array, out_array, num_cvs * BLAKE3_OUT_LEN);
  }

Adding an assert assert(num_cvs <= MAX_SIMD_DEGREE_OR_2); before that loop seems to suffice to silence the warning.

@xnox
Copy link
Author

xnox commented Jun 26, 2020

IS_ARM is defined as ARM with hardware floating point v7, or 64-bit ARM v8. Becuase for it we may have optimized NEON code path.

Thus ARM with software floating point v6 and lower do not get IS_ARM definition, and only access portable implementation.

I'm not sure if I should rename IS_ARM to something better, like IS_SIMD_ARM.

@oconnor663
Copy link
Member

This is related to #55 I think.

If adding that assert silences the compiler warning, it might be nice to add it, even if technically adds some runtime cost?

@xnox
Copy link
Author

xnox commented Jun 28, 2020

It should be possible to be made into compile time assert, no? without any runtime impact? Or rewrite code such that compiler doesn't get triggered about it?

@sneves
Copy link
Collaborator

sneves commented Jun 29, 2020

Something like

#if defined(__GNUC__)
#define BLAKE3_ASSUME(cond) do { if(!(cond)) __builtin_unreachable(); } while(0)
#else
#define BLAKE3_ASSUME(cond) 
#endif

...

size_t num_cvs = blake3_compress_subtree_wide(...);
BLAKE3_ASSUME(num_cvs <= MAX_SIMD_DEGREE_OR_2);

will have no runtime overhead.

I've played around a bit with this issue, and what seems to confuse GCC is the recursion in blake3_compress_subtree_wide. Turning the recursion into a loop could conceivably unconfuse GCC, but I don't particularly feel like that's a worthy change.

@divinity76
Copy link
Contributor

divinity76 commented Oct 29, 2020

for people worried about assert() overhead, they should be able to compile blake3 with -NDEBUG anyway..

funfact:

in SQLite, the asserts are so numerous and are in such performance critical places, that the database engine runs about three times slower when asserts are enabled. Hence, the default (production) build of SQLite disables asserts.

  • SQLite performance is reduced by roughly 66% if compiled with assert() enabled

@oconnor663
Copy link
Member

I believe this is fixed by b8e2dda, but please reopen if I've missed something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants