Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add some SVE implementations (facebookresearch#3933)
Summary: related: facebookresearch#2884 I added some SVE implementations of: - `code_distance` - `distance_single_code` - `distance_four_codes` - `exhaustive_L2sqr_blas_cmax_sve` - `fvec_inner_products_ny` - `fvec_madd` ## Evaluation result I evaluated the search for SIFT1M dataset on AWS EC2 c7g.large and r8g.large instances. `main` is the current (2e6551f) implementation. ### c7g.large (Graviton 3) ![g3_sift1m](https://github.com/user-attachments/assets/9c03cffa-72d1-4c77-9ae8-0ec0a5f5a6a5) ![g3_ivfpq](https://github.com/user-attachments/assets/4a8dfcc8-823c-4c31-ae79-3f4af9be28c8) On Graviton 3, `IndexIVFPQ` has been improved particularly. In the best case (IndexIVFPQ + IndexFlatL2, M: 32), this PR is approx. 2.38-~~2.50~~**2.44**x faster than `main` . - nprobe: 1, 0.069ms/query → 0.029ms/query - nprobe: 4, 0.181ms/query → ~~0.074~~**0.075**ms/query - nprobe: 16, 0.613ms/query → ~~0.245~~**0.251**ms/query ### r8g.large (Graviton 4) ![g4_sift1m](https://github.com/user-attachments/assets/e8510163-49d2-4143-babe-d406e2e40398) ![g4_ivfpq](https://github.com/user-attachments/assets/dc9a3ae0-a6b5-4a07-9898-c6aff372025c) On Graviton 4, especially `IndexIVFPQ` for tiny `nprobe` has been improved. In the best case (IndexIVFPQ + IndexFlatL2, M: 8, nprobe: 1), this PR is approx. 1.33x faster than `main` (0.016ms/query → 0.012ms/query). Pull Request resolved: facebookresearch#3933 Reviewed By: mengdilin Differential Revision: D64249808 Pulled By: asadoughi fbshipit-source-id: 8a625f0ab37732d330192599c851f864350885c4
- Loading branch information