Add sve targets #2886

Closed
wants to merge 5 commits into from

@vorj
Contributor

vorj commented May 31, 2023

related: #2884

This PR contains the following changes:

  • Add a new optlevel, sve
    • ARM SVE is an extension of ARMv8, so IMO it should be treated similarly to AVX2
  • Add targets for ARM SVE: faiss_sve and swigfaiss_sve
    • These targets are built when you pass -DFAISS_OPT_LEVEL=sve at build time
    • Design decision: don't fix the SVE register length.
      • The faiss Python package is a "fat binary" (for example, the package for avx2 contains both _swigfaiss_avx2.so and _swigfaiss.so)
      • SVE is a scalable instruction set (i.e. it doesn't fix the vector length), but the vector length can still be specified at compile time
        • with the -msve-vector-length= option
        • When this option is specified, the binary doesn't work correctly on a CPU whose vector length differs from the one specified at compile time
      • With fixed vector lengths, an SVE-enabled faiss Python package would contain 7 shared libraries: _swigfaiss.so, _swigfaiss_sve.so, _swigfaiss_sve128.so, _swigfaiss_sve256.so, _swigfaiss_sve512.so, _swigfaiss_sve1024.so, and _swigfaiss_sve2048.so. The package size would explode.
      • For these reasons, I don't specify the vector length at compile time, and faiss_sve detects the vector length at run time.
  • Add a mechanism for detecting ARM SVE in the runtime environment and importing swigfaiss_sve dynamically
    • Currently it only supports Linux, but as far as I know there is no SVE environment on a non-Linux OS right now
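
As a rough illustration of the runtime detection described in the last bullet, here is a minimal Python sketch. It is not the code from this PR: it assumes one possible approach, reading the `Features` line that aarch64 Linux exposes in `/proc/cpuinfo`. The helper names `cpuinfo_has_sve`, `is_sve_supported`, and `pick_module_name` are hypothetical; only the module names `swigfaiss` and `swigfaiss_sve` come from the description above.

```python
import platform


def cpuinfo_has_sve(cpuinfo_text: str) -> bool:
    """Return True if a 'Features' line in /proc/cpuinfo text lists 'sve'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Compare whole tokens so that e.g. 'svei8mm' alone does not match.
        if sep and key.strip() == "Features" and "sve" in value.split():
            return True
    return False


def is_sve_supported(cpuinfo_path: str = "/proc/cpuinfo") -> bool:
    """Hypothetical helper: only look for SVE on aarch64 Linux."""
    if platform.system() != "Linux" or platform.machine() != "aarch64":
        return False
    try:
        with open(cpuinfo_path) as f:
            return cpuinfo_has_sve(f.read())
    except OSError:
        # If the file cannot be read, stay on the generic build.
        return False


def pick_module_name() -> str:
    # Module names follow the PR description; the selection logic is illustrative.
    return "swigfaiss_sve" if is_sve_supported() else "swigfaiss"
```

On a machine without SVE, or on any non-Linux OS, `pick_module_name()` returns `"swigfaiss"`, which matches the Linux-only limitation stated above.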

NOTE: I plan to open one more PR that adds some SVE implementations after this PR is merged. This PR only adds the sve target.

@mdouze
Contributor

mdouze commented May 31, 2023

Please don't add a faiss/python/swigfaiss_sve.swig file.

vorj force-pushed the support-arm_sve branch from 4def441 to a6d28e4 (May 31, 2023 19:47)
@vorj
Contributor Author

vorj commented May 31, 2023

Oh, sorry, I missed that. The file is already copied at this line, so I removed it and added its path to .gitignore.

vorj force-pushed the support-arm_sve branch 3 times, most recently from 091b0f7 to 5a3d8ea (June 2, 2023 23:46)
vorj force-pushed the support-arm_sve branch 3 times, most recently from 96f35db to d7c27ba (June 13, 2023 04:11)
vorj force-pushed the support-arm_sve branch from 0ecf934 to 59acf2b (June 20, 2023 06:11)
@vorj
Contributor Author

vorj commented Jun 20, 2023

environment: line 9: /opt/conda/lib/jvm/languages/python/bin/conda: No such file or directory

🤨

@vorj
Contributor Author

vorj commented Jun 20, 2023

Ah, #2917, OK.

@vorj
Contributor Author

vorj commented Jun 20, 2023

@mdouze What is the current status of this PR?

@mdouze
Contributor

mdouze commented Jun 21, 2023

So the diff only changes the compilation flags; it does not add SVE-specific SIMD implementations, right?
Do you have hardware to try it on and maybe measure performance improvements?

vorj force-pushed the support-arm_sve branch from 59acf2b to b0c2296 (June 21, 2023 10:48)
@vorj
Contributor Author

vorj commented Jun 21, 2023

> So the diff only changes the compilation flags; it does not add SVE-specific SIMD implementations, right?
> Do you have hardware to try it on and maybe measure performance improvements?

In this PR, faiss uses SVE only in auto-vectorized functions like fvec_L2sqr.
So this PR by itself brings only a small performance improvement; its aim is to add the faiss_sve target first.

@vorj
Contributor Author

vorj commented Jun 22, 2023

As I wrote before,

> I plan to open one more PR that adds some SVE implementations after this PR is merged.

It will include SVE implementations of code_distance, exhaustive_L2sqr_blas_cmax, and so on.

vorj force-pushed the support-arm_sve branch from b0c2296 to 49578a1 (June 26, 2023 14:39)
@vorj
Contributor Author

vorj commented Jun 28, 2023

@mdouze IMO the PRs should be kept separate, but I'm willing to include the performance-improvement commits in this PR if you want. Which would you prefer?

@mdouze
Contributor

mdouze commented Jun 29, 2023

Sorry for being a bit slow to react.
I think that it's fine to land this packaging PR first; let us check the implications in terms of library size.

@vorj
Contributor Author

vorj commented Jun 30, 2023

@mdouze OK. If you need anything from me, for example if you:

  • need me to make a decision,
  • need some code changed, or
  • want to know my opinion,

please feel free to leave a comment. In the meantime, I will wait for your check. Thanks.

@naveentatikonda
Contributor

naveentatikonda commented Sep 21, 2023

@mdouze and @vorj is there any update on adding SVE support, and do you still plan to add it? I saw some discussion on the other PR, but there has been no activity for a while. Basically, we are looking for optimizations to Scalar Quantization (specifically SQfp16) on ARM, like the AVX2 ones on x86.

Also, please let us know if you need any help to run tests for SVE support. We have bandwidth and resources to run tests. Thanks!

@vorj
Contributor Author

vorj commented Sep 22, 2023

@naveentatikonda I am just a contributor, not a Meta employee, so I don't actually know the plans for this (official faiss) repository. However, as I said above, I have further patches that improve performance more, and I will open a PR once this one is merged.

@naveentatikonda
Contributor

> @mdouze and @vorj is there any update on adding SVE support, and do you still plan to add it? I saw some discussion on the other PR, but there has been no activity for a while. Basically, we are looking for optimizations to Scalar Quantization (specifically SQfp16) on ARM, like the AVX2 ones on x86.
>
> Also, please let us know if you need any help to run tests for SVE support. We have bandwidth and resources to run tests. Thanks!

@mdouze Did you get a chance to look into my question?

@mdouze
Contributor

mdouze commented Sep 26, 2023

OK so I think a way to move forward is to accept this PR but not cover it with CI.
Then optimized code for SVE can be contributed. At some point we will probably either:

  • add SVE to the CI or
  • remove SVE support if it turns out it is not used too much.

Is there a doc somewhere that shows which current and future ARM implementations support SVE?

Thanks

@mdouze
Contributor

mdouze commented Sep 26, 2023

Would you mind rebasing on the latest Faiss so that I can import it to the internal Faiss version?
Thanks

@alexanderguzhva
Contributor

I can assist and review the code, if needed

@ramilbakhshyiev
Contributor

@vorj Thanks! We will be trying this shortly. Meanwhile, I restarted the failed build; there was a transient error, and it should be fixed now.

vorj force-pushed the support-arm_sve branch from 01992ce to f283e83 (June 24, 2024 07:55)
@vorj
Contributor Author

vorj commented Jul 10, 2024

@mengdilin

> The failure is not reproducible on the main branch when building on an aarch64 platform using AWS's r6g.large instance.

That's not surprising, because what you are describing is like saying that faiss built with -DFAISS_OPT_LEVEL=avx512 doesn't work on CPUs without AVX512 support, such as AMD Zen3.
As I said in the issue,

> SVE is an abbreviation of Scalable Vector Extension.

Arm SVE is an extension of the ARMv8 ISA, and AWS Graviton2 doesn't support it.
If you can, please try AWS Graviton3; I chose a c7g.large instance (this is the simplest way).
If you cannot (I mean, if you must run the CI on Graviton2 rather than Graviton3 for technical, economic, company, or other reasons), you need to use QEMU or something similar for gtest.
The Python interface can detect the features of the running CPU dynamically, so faiss built with -DFAISS_OPT_LEVEL=sve should work on Graviton2 (it will just load _swigfaiss.so instead of _swigfaiss_sve.so).
However, the gtest binary is built with the optimization flag, so it doesn't run on Graviton2 directly.
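
The fallback behaviour described above can be sketched roughly as follows. This is an illustrative Python outline, not the actual faiss loader: `candidate_modules` and `import_first_available` are hypothetical names, and the `cpu_has_sve` flag is assumed to come from some earlier CPU feature check. The point is that the choice is made before anything SVE-specific is loaded, since running SVE instructions on a CPU without SVE typically ends in an illegal-instruction crash rather than a catchable `ImportError`. A gtest binary compiled with the SVE flag has no such switch.

```python
import importlib
from typing import Iterable, List, Optional


def candidate_modules(cpu_has_sve: bool) -> List[str]:
    """List module names to try, most specialized first (illustrative only)."""
    names = []
    if cpu_has_sve:
        names.append("swigfaiss_sve")
    # The generic build is always the last resort.
    names.append("swigfaiss")
    return names


def import_first_available(candidates: Iterable[str]):
    """Import and return the first module in `candidates` that can be loaded."""
    last_error: Optional[ImportError] = None
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Remember the failure and fall through to the next candidate.
            last_error = exc
    raise ImportError("none of the candidate modules could be imported") from last_error
```

Under these assumptions, a Graviton2 host would call `candidate_modules(False)` and therefore only ever try the generic module.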

P.S. I'm not a Meta employee (unfortunately), so I can't see internal URLs if you link them.

@ramilbakhshyiev
Contributor

@vorj Thanks! I'll let @mengdilin confirm, but I believe this was resolved when it was retried with r8g.large (ARMv9 / Neoverse V2), which does support SVE and, I believe, SVE2 (something that might be of interest, I guess).

@vorj
Contributor Author

vorj commented Jul 11, 2024

Yes, Graviton3 or later (including r7g.large and r8g.large) solves the above issue, so please give it a try. If you run into other problems, please let me know.

BTW, this PR enables only SVE, not SVE2, so using SVE2 will need another PR (which would eventually produce another binary named _swigfaiss_sve2.so, _swigfaiss_armv9.so, or something like that).

@ramilbakhshyiev
Contributor

Yeah, @mengdilin tried with r8g and it worked.

Re: SVE2, is that something you would be interested in contributing? I think it’ll be much easier and quicker to merge that one in next.

@vorj
Contributor Author

vorj commented Jul 11, 2024

> SVE2, is that something you would be interested in contributing?

Yes, but I haven't caught up on it yet, so it may not happen in the near future.

@facebook-github-bot
Contributor

@mengdilin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mengdilin merged this pull request in 4eeaa42.

vorj deleted the support-arm_sve branch (July 30, 2024 06:38)
ketor pushed a commit to dingodb/faiss that referenced this pull request Aug 20, 2024
Summary:
related: facebookresearch#2884

This PR contains the following changes:

- Add a new optlevel, `sve`
    - ARM SVE is an _extension_ of ARMv8, so IMO it should be treated similarly to AVX2
- Add targets for ARM SVE: `faiss_sve` and `swigfaiss_sve`
    - These targets are built when you pass `-DFAISS_OPT_LEVEL=sve` at build time
    - Design decision: don't fix the SVE register length.
        - The faiss Python package is a "fat binary" (for example, the package for avx2 contains both `_swigfaiss_avx2.so` and `_swigfaiss.so`)
        - SVE is a scalable instruction set (i.e. it doesn't fix the vector length), but the vector length can still be specified at compile time
            - [with the `-msve-vector-length=` option](https://developer.arm.com/documentation/101726/4-0/Coding-for-Scalable-Vector-Extension--SVE-/SVE-Vector-Length-Specific--VLS--programming)
            - When this option is specified, the binary doesn't work correctly on a CPU whose vector length differs from the one specified at compile time
        - With fixed vector lengths, an SVE-enabled faiss Python package would contain 7 shared libraries: `_swigfaiss.so`, `_swigfaiss_sve.so`, `_swigfaiss_sve128.so`, `_swigfaiss_sve256.so`, `_swigfaiss_sve512.so`, `_swigfaiss_sve1024.so`, and `_swigfaiss_sve2048.so`. The package size would explode.
        - For these reasons, I don't specify the vector length at compile time, and `faiss_sve` detects the vector length at run time.
- Add a mechanism for detecting ARM SVE in the runtime environment and importing `swigfaiss_sve` dynamically
    - Currently it only supports Linux, but as far as I know there is no SVE environment on a non-Linux OS right now

NOTE: I plan to open one more PR that adds some SVE implementations after this PR is merged. This PR only adds the sve target.

Pull Request resolved: facebookresearch#2886

Reviewed By: ramilbakhshyiev

Differential Revision: D60386983

Pulled By: mengdilin

fbshipit-source-id: 7e66162ee53ce88fbfb6636e7bf705b44e6c3282
facebook-github-bot pushed a commit that referenced this pull request Oct 15, 2024
Summary:
#2943 removed the SVE information (added in #2886) from the installation document. This PR restores it.

This PR changes only the documentation, so it doesn't affect software behavior.

Pull Request resolved: #3915

Reviewed By: asadoughi

Differential Revision: D63967842

Pulled By: ramilbakhshyiev

fbshipit-source-id: ce0a0bfe591cb75b504cdf6362b5e8ed156928d5