Do not pass -m flags when compiling shuffle.c #622
Thanks @mgorny for this. However, your pull request will not use AVX2 on machines that have it (you can easily check that by putting a Unfortunately AVX2 can greatly accelerate shuffle/unshuffle as well as bitshuffle/bitunshuffle, and it would be bad if users could no longer leverage this capability. Also, @t20100 may have something to say here.
Oh, I'm sorry about that. I see the problem now, the Lines 345 to 347 in dc5d651
FWICS top-level |
Do not pass `-msse2`, `-mavx2`, etc. flags to the compiler when compiling `shuffle.c`. From what I can see, the file itself does not use any of these intrinsics; they are only used by functions defined in `bitshuffle-*.c` and `shuffle-*.c` (where the respective flags are still passed). This prevents the compiler from incidentally using these instruction sets when optimizing code that is called independently of the runtime CPU check, which would cause `SIGILL` on CPUs that lack them. I have verified that this fixes the issue with `-march=znver2`, and that it does not cause any issues with `-march=x86-64` or `-march=i686`. Fixes Blosc#621
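To illustrate the failure mode described above, here is a minimal sketch of runtime dispatch. All names (`shuffle_fn`, `shuffle_generic_sketch`, `shuffle_avx2_sketch`, `cpu_has_avx2_sketch`, `pick_shuffle_sketch`) are hypothetical and are not the c-blosc2 API. It assumes GCC or Clang on x86 for `__builtin_cpu_supports`. The AVX2 kernel is only a stand-in that forwards to the generic code, so the example stays in one file.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names here are hypothetical; this is not the c-blosc2 API. */
typedef void (*shuffle_fn)(size_t typesize, size_t nbytes,
                           const uint8_t *src, uint8_t *dst);

/* Portable byte shuffle: gathers byte j of every element into stream j.
 * Built with baseline flags, so it is safe on every CPU. */
static void shuffle_generic_sketch(size_t typesize, size_t nbytes,
                                   const uint8_t *src, uint8_t *dst) {
  if (typesize == 0) {
    return;
  }
  size_t nelem = nbytes / typesize;
  for (size_t j = 0; j < typesize; j++) {
    for (size_t i = 0; i < nelem; i++) {
      dst[j * nelem + i] = src[i * typesize + j];
    }
  }
  /* Trailing bytes that do not form a whole element are copied verbatim. */
  size_t done = nelem * typesize;
  memcpy(dst + done, src + done, nbytes - done);
}

/* Stand-in for a kernel that would live in its own file and be the only
 * code compiled with -mavx2. Here it just forwards to the generic version
 * so that the sketch stays self-contained. */
static void shuffle_avx2_sketch(size_t typesize, size_t nbytes,
                                const uint8_t *src, uint8_t *dst) {
  shuffle_generic_sketch(typesize, nbytes, src, dst);
}

/* Runtime CPU check (assumption: GCC/Clang builtin on x86; otherwise 0). */
static int cpu_has_avx2_sketch(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("avx2") != 0;
#else
  return 0;
#endif
}

/* Dispatcher. This function must itself be built WITHOUT -mavx2: if it
 * were, the compiler could emit AVX2 instructions here, before the check
 * has had any effect, and the process would die with SIGILL on older CPUs. */
shuffle_fn pick_shuffle_sketch(void) {
  return cpu_has_avx2_sketch() ? shuffle_avx2_sketch : shuffle_generic_sketch;
}
```

The point of the layout is that only the kernel files receive the `-m` flags, while the file holding the dispatcher is built for the baseline target.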
I've updated the pull request to fix the immediate issue. I can work on it further once I know what issue prompted dd57c03.
This commit (dd57c03) was to add support of macos On option proposed in Blosc/c-blosc#347 (Same as #431 but for c-blosc1):
With this |
Ok, the latest version of this PR is using AVX2 on my box, so that's great. @t20100 I am not 100% certain if you agree with this PR or if you foresee issues with |
As it is, I would expect c-blosc2/blosc/bitshuffle-avx512.c, line 24 (at dc5d651), to lead to issues.
Hope this helps to clarify: t20100@b243896. It's only done for AVX2 and would need to be done for all SIMD implementations. It may not be necessary to do it for both shuffle and bitshuffle, since the macro tests are the same. It basically adds a runtime test of the SIMD implementation's availability, plus dummy functions for when it is not available.
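A minimal sketch of that guard-plus-stub pattern follows. This is not the code from t20100@b243896; the names `has_copy_avx2_sketch`, `copy_avx2_sketch`, `runtime_has_avx2_sketch` and `copy_dispatch_sketch` are hypothetical, and a trivial copy kernel stands in for the real shuffle. It assumes GCC or Clang on x86 for `__builtin_cpu_supports`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* All names here are hypothetical; this is not the c-blosc2 API. */

#if defined(__AVX2__)
#include <immintrin.h>

/* Built with AVX2 enabled: export the real kernel and say so. */
const bool has_copy_avx2_sketch = true;

void copy_avx2_sketch(const uint8_t *src, uint8_t *dst, size_t nbytes) {
  size_t i = 0;
  for (; i + 32 <= nbytes; i += 32) {
    __m256i v = _mm256_loadu_si256((const __m256i *)(src + i));
    _mm256_storeu_si256((__m256i *)(dst + i), v);
  }
  memcpy(dst + i, src + i, nbytes - i); /* tail shorter than 32 bytes */
}
#else
/* Built without AVX2: keep the symbol so callers still link, but make
 * any accidental call fail loudly. */
const bool has_copy_avx2_sketch = false;

void copy_avx2_sketch(const uint8_t *src, uint8_t *dst, size_t nbytes) {
  (void)src;
  (void)dst;
  (void)nbytes;
  abort();
}
#endif

/* Runtime CPU check (assumption: GCC/Clang builtin on x86; otherwise false). */
static bool runtime_has_avx2_sketch(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("avx2") != 0;
#else
  return false;
#endif
}

/* The kernel is used only if it was compiled in AND the CPU supports it.
 * In a real layout this dispatcher would sit in a separate file built
 * without -mavx2; it shares a file with the kernel here only for brevity. */
void copy_dispatch_sketch(const uint8_t *src, uint8_t *dst, size_t nbytes) {
  if (has_copy_avx2_sketch && runtime_has_avx2_sketch()) {
    copy_avx2_sketch(src, dst, nbytes);
  } else {
    memcpy(dst, src, nbytes);
  }
}
```

With both checks in place, the stub is never reached through the dispatcher; it exists only so that the symbol is defined in every build configuration.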
Should I add these to this PR or are you going to take it from here?
As you prefer. If it's OK, you can either take my changes, or I can open a PR against your branch to add them to this PR.
@t20100 In general I like where you are headed (I just suggested to use
Ok, I'll update the PR in a few minutes.
FWICS all the existing functions have |
(same goes for |
Oh, I forgot about this. Yeah, as they are internal functions, I think renaming is a good idea.
Let's do that here too.
One more quick question: I see that some header files (but not all) declare additional functions such as
Hmm, I'd say it is probably safe to remove those functions from the header, but I am not completely sure. Perhaps @kif can shed some light here?
On second thought, let's focus on the problem at hand and leave the cleanup for another PR.
Well, unfortunately this brings the second question: since these functions are technically present in the header, should I also stub them with My rough idea for this PR would be to, in order:
|
I see. I'm +1 on your plan then.
Actually, I was wrong. Some of these functions are used cross-file, e.g. AVX512 uses AVX2 and SSE2 functions.
@t20100, actually, could you do the part with guards? I'm entirely unfamiliar with this code, and I'm getting too easily distracted for this.
OK, I'll take care of the guards part.
I made a PR against this branch that adds stubs for when SIMD implementations are not available, plus runtime availability checks: mgorny#1. What do you think?
Add runtime checks for use of SIMD implementations, and stub functions when they are not available.
Thanks a lot! I've merged it here.
Thank you!