AVX512 for PQFastScan #3276
Sorry for the late answer.
@mdouze has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
re-imported
74e4ae9 to 182affc
@mdouze I got some problems with raft, so I rebased on top of master.
Signed-off-by: Alexandr Guzhva <[email protected]>
182affc to 515c0c6
@mdouze has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary:
AVX-512 implementation for PQFastScan for QBS.

For local benchmarks on 4th gen Xeon, the QPS is up to 10% higher, mostly for a single query case. But as far as I remember, production cases would show higher performance improvements.

* Baseline `benchs/bench_ivf_fastscan_single_query.py` (sift1M): https://gist.github.com/alexanderguzhva/c9cde2cb5e9c7675f429623e6faa9fbf
* Candidate `benchs/bench_ivf_fastscan_single_query.py` (sift1M): https://gist.github.com/alexanderguzhva/4e8530073a108f73771d38e55bc45b17
* Baseline `benchs/bench_ivf_fastscan.py` (sift1M): https://gist.github.com/alexanderguzhva/9eb03ed60354d7e76cfa25e676f983ac
* Candidate `benchs/bench_ivf_fastscan.py` (sift1M): https://gist.github.com/alexanderguzhva/3cbfeba1364dd445a2bb52455966979e

mdouze should I modify `pq4_fast_scan_search_1.cpp` as well? It is somewhat cumbersome to dig through various possible sub-implementations

Pull Request resolved: facebookresearch#3276

Reviewed By: junjieqi

Differential Revision: D54943632

Pulled By: mdouze

fbshipit-source-id: 3d70066e9779039559b1734c2be99bf439058246
AVX-512 implementation for PQFastScan for QBS.
For local benchmarks on 4th gen Xeon, the QPS is up to 10% higher, mostly for a single query case. But as far as I remember, production cases would show higher performance improvements.
* Baseline `benchs/bench_ivf_fastscan_single_query.py` (sift1M): https://gist.github.com/alexanderguzhva/c9cde2cb5e9c7675f429623e6faa9fbf
* Candidate `benchs/bench_ivf_fastscan_single_query.py` (sift1M): https://gist.github.com/alexanderguzhva/4e8530073a108f73771d38e55bc45b17
* Baseline `benchs/bench_ivf_fastscan.py` (sift1M): https://gist.github.com/alexanderguzhva/9eb03ed60354d7e76cfa25e676f983ac
* Candidate `benchs/bench_ivf_fastscan.py` (sift1M): https://gist.github.com/alexanderguzhva/3cbfeba1364dd445a2bb52455966979e

@mdouze should I modify `pq4_fast_scan_search_1.cpp` as well? It is somewhat cumbersome to dig through various possible sub-implementations.
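
For readers who have not looked at the fast-scan kernels, here is a rough scalar sketch of the operation that a PQFastScan kernel speeds up: summing small per-sub-quantizer lookup-table entries selected by 4-bit codes. This is not the code from this PR. The names `pack_codes` and `scan_distances`, the nibble order, and the table values are all assumptions made for illustration, and the real kernels use a SIMD-friendly code layout with byte-shuffle instructions rather than a Python loop.

```python
# Scalar reference for 4-bit PQ table lookups (illustrative only, not Faiss code).

def pack_codes(codes):
    """Pack 4-bit codes (values 0..15) two per byte, low nibble first (assumed order)."""
    assert len(codes) % 2 == 0
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))


def scan_distances(packed_vectors, luts):
    """Accumulate quantized distances for each packed database vector.

    luts[m][c] stands for a quantized distance between the query's m-th
    sub-vector and centroid c of sub-quantizer m (hypothetical values here).
    """
    results = []
    for packed in packed_vectors:
        acc = 0
        for j, byte in enumerate(packed):
            acc += luts[2 * j][byte & 0x0F]    # low nibble selects from table 2j
            acc += luts[2 * j + 1][byte >> 4]  # high nibble selects from table 2j+1
        results.append(acc)
    return results


# 4 sub-quantizers with 16 centroids each; the table entries are made up.
luts = [[(m + 1) * c for c in range(16)] for m in range(4)]
db = [
    pack_codes([0, 1, 2, 3]),
    pack_codes([15, 15, 15, 15]),
    pack_codes([3, 0, 0, 1]),
]
distances = scan_distances(db, luts)
print(distances)
```

A vectorized kernel replaces the inner per-nibble lookups with register-wide shuffles, so wider registers can process more codes per instruction. That is the general motivation for an AVX-512 variant, though the actual gain depends on the workload, as the benchmarks above show.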