From f6d35ae3611caa61c3e6885fa95da8a829d4492e Mon Sep 17 00:00:00 2001
From: "Artem M. Chirkin" <9253178+achirkin@users.noreply.github.com>
Date: Wed, 30 Aug 2023 14:32:51 +0200
Subject: [PATCH] ANN-benchmarks: switch to use gbench (#1661)

Make the ANN benchmarks use the same google benchmark infrastructure as the prim benchmarks while keeping the functional changes minimal.

### Overview

- The command-line API largely stays the same, but is enhanced with gbench-specific parameters, such as using regex to select algo configs, controlling the minimum run-time, and flexible reporting to console/files.
- There's just one executable, `ANN_BENCH`; all of the algorithms are loaded as shared libraries. The CPU-only components do not require cuda at runtime (ANN_BENCH itself, hnswlib).
- Some dependencies are linked statically, so it's possible to just copy the executable and the libs and run the benchmark on a linux machine with very few packages installed.
- Search benchmarks do not produce any output anymore; they use ground truth files to compute and report the recall in-place.
- Search/build parameters visible in the config files are passed as benchmark counters/labels/context.
- Extra functionality:
  - `--data_prefix` to specify a custom path where the data sets are stored
  - `--index_prefix` to specify a custom path where the index sets are stored
  - `--override_kv=` override one or more parameters of search/build for parameter-sweep benchmarks

__Breaking change__: the behavior of the ANN benchmark executables (library API is not touched). The executable CLI flags have changed, so the newer, adapted wrapper scripts won't work with the executables from the libraft-ann-bench-23.08 conda package.

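The in-place recall reporting mentioned in the overview can be illustrated with a small sketch. This is not the code used by `ANN_BENCH`; it assumes one common definition of recall (the fraction of ground-truth top-k neighbors that appear in the returned top-k, averaged over all queries), and the function name `recall_at_k` is hypothetical.

```python
def recall_at_k(found, ground_truth, k):
    """Fraction of ground-truth top-k neighbors that were found, over all queries.

    `found` and `ground_truth` are lists of neighbor-id lists, one per query.
    This is an illustrative definition, not the ANN_BENCH implementation.
    """
    if not found or len(found) != len(ground_truth):
        raise ValueError("found and ground_truth must be non-empty and cover the same queries")
    hits = 0
    for found_row, truth_row in zip(found, ground_truth):
        # Order within the top-k does not matter, so compare as sets.
        hits += len(set(found_row[:k]) & set(truth_row[:k]))
    return hits / (len(found) * k)


found = [[1, 2, 3], [7, 8, 9]]
truth = [[1, 3, 5], [9, 8, 7]]
print(recall_at_k(found, truth, k=3))
```

In this toy input the first query recovers two of three true neighbors and the second recovers all three, so five of six neighbors are found.
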
### A primer

```bash
# The arguments are collected in a bash array so that each one can carry a comment
# (a comment after a trailing backslash would break the line continuation).
args=(
  --data_prefix=/datastore/my/local/data/path  # override (prefix) path to local data
  --benchmark_min_warmup_time=0.001            # spend some minimal time warming up
  --benchmark_min_time=3s                      # run minimum 3 seconds on each case
  --benchmark_out=ivf_pq.csv                   # duplicate output to this file
  --benchmark_out_format=csv                   # the file output should be in CSV format
  --benchmark_counters_tabular                 # the console output should be tabular
  --benchmark_filter="raft_ivf_pq\..*"         # use regex to filter benchmarks
  --search                                     # 'search' mode
  --override_kv=k:1:10:100:200:500             # Parameter-sweep over the top-k value
  --override_kv=n_queries:1:10:10000           # and the search batch size
  --override_kv=smemLutDtype:"fp8"             # Override a search parameter
  cpp/bench/ann/conf/bigann-100M.json          # specify the path to the config file
)
./cpp/build/ANN_BENCH "${args[@]}"             # benchmark executable
```

### Motivation

#### Eliminate huge bug-prone configs

The current config fixes the batch size and k to one value per config, so the whole config needs to be copied to try multiple values. With this PR, both of these parameters can be overridden in the search parameters and/or via the command line (`ANN_BENCH --override_kv=n_queries:1:100:1000 --override_kv=k:1:10:20:50:100:200:500:1000` would test all combinations in one go). Any of the build/search parameters can be overridden at the same time.

#### Run the benchmarks and aggregate the data in the minimal environment

The new executable generates reports with QPS, Recall, and other metrics using gbench. Hence there's no need to copy dozens of result files back and forth and no need to install a Python environment for running or evaluating. A single CSV or JSON can be produced for all algorithms and run configurations per dataset+hardware pair.

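The "all combinations in one go" behavior of repeated `--override_kv` flags amounts to a Cartesian product over the listed values. The sketch below only illustrates that idea; it is not the parsing code from this PR, the helper names `parse_override` and `expand_overrides` are hypothetical, and values are kept as plain strings for simplicity.

```python
from itertools import product


def parse_override(spec):
    """Split 'key:v1:v2:...' into (key, [v1, v2, ...])."""
    key, *values = spec.split(":")
    if not key or not values:
        raise ValueError(f"expected key:value[:value...], got {spec!r}")
    return key, values


def expand_overrides(base_params, specs):
    """Return one parameter dict per combination of the overridden values."""
    overrides = [parse_override(s) for s in specs]
    keys = [key for key, _ in overrides]
    combos = product(*(values for _, values in overrides))
    # Each combination overrides the matching keys of the base parameters.
    return [{**base_params, **dict(zip(keys, combo))} for combo in combos]


base = {"nprobe": 20, "k": 10, "n_queries": 10000}
cases = expand_overrides(base, ["k:1:10:100", "n_queries:1:10000"])
print(len(cases))  # 3 values of k times 2 values of n_queries
for case in cases:
    print(case)
```

Under this model, each additional `--override_kv` flag multiplies the number of test cases by the number of values it lists, which is why a single config entry can replace many near-identical copies.
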
#### Speed up the benchmarks

The current benchmark framework is extremely slow due to two factors:

- The dataset and the index need to be loaded for every test case; for large datasets, this takes orders of magnitude longer than the search test itself. In my tests, the preparation phase for bigann-1B took ten minutes and the search could take anywhere between a few seconds and a minute.
- The benchmark always goes through the whole query dataset. That is, if the query set is 10K and the batch size is 1, the benchmark repeats 10K times (to produce the result file for evaluating the recall).

In the proposed solution, a user can set the desired time or number of iterations to run; the data is loaded only once and the index is cached between the search test cases. My subjective conservative estimate is an overall speedup of more than 100x for running a typical large-scale benchmark.

#### Better measurement of QPS

By default, the current benchmark reports the average execution time and does not perform warm-up iterations. As a result, the first test case on most of our plots is distorted (e.g. the first iteration of the first case takes about a second or two to run, and that significantly affects the average over the remaining 999 iterations of ~100us each). `gbench` provides the `--benchmark_min_warmup_time` parameter to skip the first one or few iterations, which solves the problem.

#### Extra context in the reports

The new benchmark executable uses gbench context to augment the report with some essential information: the name, dimensionality, and size of the base and query sets, the distance metric, some CPU and GPU info, and the CUDA version. All this is appended directly to the generated CSV/JSON files, which makes the bookkeeping much easier. In addition, a user may pass extra context via the command line with `--benchmark_context=<key>=<value>`; this could be e.g. the hostname, some ENV variables, etc.

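To make the distortion described under "Better measurement of QPS" concrete, here is a small worked calculation using the figures quoted there (a first iteration of about one second followed by 999 iterations of about 100us). The function `mean_time` is a hypothetical helper, and discarding a fixed number of iterations is a simplification: gbench's warm-up is controlled by time, not by an iteration count.

```python
def mean_time(first_iter_s, steady_iter_s, n_iters, n_warmup=0):
    """Average per-iteration time when only the first iteration is slow.

    The first `n_warmup` iterations are discarded before averaging.
    """
    times = [first_iter_s] + [steady_iter_s] * (n_iters - 1)
    measured = times[n_warmup:]
    return sum(measured) / len(measured)


cold = mean_time(1.0, 100e-6, 1000)              # slow first iteration included
warm = mean_time(1.0, 100e-6, 1000, n_warmup=1)  # slow first iteration skipped
print(f"without warm-up: {cold * 1e6:.1f} us per iteration")
print(f"with warm-up:    {warm * 1e6:.1f} us per iteration")
print(f"distortion:      {cold / warm:.1f}x")
```

With these numbers the unwarmed average is (1 s + 999 x 100us) / 1000, roughly 1.1 ms, which is about eleven times the steady-state 100us and would understate QPS by the same factor.
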
#### Easier profiling

Thanks to flexible regex filtering and parameter overriding, it is now possible to specify a subset of cases and an exact number of times they should run. This makes profiling with tools such as `nsys` and `ncu` much easier.

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: https://github.com/rapidsai/raft/pull/1661
---
 bench/ann/conf/bigann-100M.json | 248 ++-
 bench/ann/conf/deep-100M.json | 877 ++--------
 bench/ann/conf/deep-1B.json | 36 +-
 bench/ann/conf/glove-100-inner.json | 512 +++---
 bench/ann/conf/sift-128-euclidean.json | 1488 ++++-------------
 bench/ann/data_export.py | 48 +-
 bench/ann/run.py | 39 +-
 cpp/CMakeLists.txt | 2 +-
 cpp/bench/ann/CMakeLists.txt | 77 +-
 cpp/bench/ann/scripts/eval.pl | 430 -----
 cpp/bench/ann/src/common/ann_types.hpp | 73 +-
 cpp/bench/ann/src/common/benchmark.cpp | 109 ++
 cpp/bench/ann/src/common/benchmark.hpp | 934 +++++------
 cpp/bench/ann/src/common/benchmark_util.hpp | 33 -
 cpp/bench/ann/src/common/conf.cpp | 152 --
 cpp/bench/ann/src/common/conf.h | 76 -
 cpp/bench/ann/src/common/conf.hpp | 156 ++
 cpp/bench/ann/src/common/cuda_stub.hpp | 159 ++
 .../ann/src/common/{dataset.h => dataset.hpp} | 89 +-
 cpp/bench/ann/src/common/util.cpp | 68 -
 cpp/bench/ann/src/common/util.h | 79 -
 cpp/bench/ann/src/common/util.hpp | 347 ++++
 cpp/bench/ann/src/faiss/faiss_benchmark.cu | 12 +-
 cpp/bench/ann/src/faiss/faiss_wrapper.h | 30 +-
 cpp/bench/ann/src/ggnn/ggnn_benchmark.cu | 14 +-
 cpp/bench/ann/src/ggnn/ggnn_wrapper.cuh | 18 +-
 .../ann/src/hnswlib/hnswlib_benchmark.cpp | 21 +-
 cpp/bench/ann/src/hnswlib/hnswlib_wrapper.h | 7 +-
 cpp/bench/ann/src/raft/raft_benchmark.cu | 42 +-
 cpp/bench/ann/src/raft/raft_cagra_wrapper.h | 65 +-
 .../ann/src/raft/raft_ivf_flat_wrapper.h | 44 +-
 cpp/bench/ann/src/raft/raft_ivf_pq_wrapper.h | 51 +-
 cpp/bench/ann/src/raft/raft_wrapper.h | 7 +-
 docs/source/ann_benchmarks_low_level.md | 264 +--
 34 files changed, 2590
insertions(+), 4017 deletions(-) delete mode 100755 cpp/bench/ann/scripts/eval.pl create mode 100644 cpp/bench/ann/src/common/benchmark.cpp delete mode 100644 cpp/bench/ann/src/common/benchmark_util.hpp delete mode 100644 cpp/bench/ann/src/common/conf.cpp delete mode 100644 cpp/bench/ann/src/common/conf.h create mode 100644 cpp/bench/ann/src/common/conf.hpp create mode 100644 cpp/bench/ann/src/common/cuda_stub.hpp rename cpp/bench/ann/src/common/{dataset.h => dataset.hpp} (85%) delete mode 100644 cpp/bench/ann/src/common/util.cpp delete mode 100644 cpp/bench/ann/src/common/util.h create mode 100644 cpp/bench/ann/src/common/util.hpp diff --git a/bench/ann/conf/bigann-100M.json b/bench/ann/conf/bigann-100M.json index 82e9383d15..c691c68299 100644 --- a/bench/ann/conf/bigann-100M.json +++ b/bench/ann/conf/bigann-100M.json @@ -1,80 +1,90 @@ { - "dataset" : { - "name" : "bigann-100M", - "base_file" : "data/bigann-1B/base.1B.u8bin", - "subset_size" : 100000000, - "query_file" : "data/bigann-1B/query.public.10K.u8bin", - "distance" : "euclidean" + "dataset": { + "name": "bigann-100M", + "base_file": "bigann-1B/base.1B.u8bin", + "subset_size": 100000000, + "query_file": "bigann-1B/query.public.10K.u8bin", + "groundtruth_neighbors_file": "bigann-100M/groundtruth.neighbors.ibin", + "distance": "euclidean" }, - "search_basic_param" : { - "batch_size" : 10000, - "k" : 10, - "run_count" : 2 + "search_basic_param": { + "batch_size": 10000, + "k": 10 }, - "index" : [ + "index": [ { - "name": "raft_ivf_pq.dimpq64-cluster5K-float-float", + "name": "raft_ivf_pq.dimpq64-cluster5K", "algo": "raft_ivf_pq", + "build_param": {"niter": 25, "nlist": 5000, "pq_dim": 64, "ratio": 10}, + "file": "bigann-100M/raft_ivf_pq/dimpq64-cluster5K", "dataset_memtype": "host", - "build_param": { - "niter": 25, - "nlist": 5000, - "pq_dim": 64, - "ratio": 10 - }, - "file": "index/bigann-100M/raft_ivf_pq/dimpq64-cluster5K", "search_params": [ - { - "numProbes": 20, - "internalDistanceDtype": "float", - 
"smemLutDtype": "float" - }, - { - "numProbes": 30, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "numProbes": 40, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "numProbes": 50, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "numProbes": 100, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "numProbes": 200, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "numProbes": 500, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "numProbes": 1000, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - } - ], - "search_result_file": "result/bigann-100M/raft_ivf_pq/dimpq64-cluster5K-float-float" + { "nprobe": 20, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 30, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 40, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 50, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 100, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 200, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 500, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 1000, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 20, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 30, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 40, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 50, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 100, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 200, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 500, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 1000, "internalDistanceDtype": "float", 
"smemLutDtype": "fp8" }, + { "nprobe": 20, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 30, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 40, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 50, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 100, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 200, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 500, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 1000, "internalDistanceDtype": "half", "smemLutDtype": "half" } + ] }, { - "name" : "hnswlib.M12", - "algo" : "hnswlib", + "name": "raft_ivf_pq.dimpq64-cluster10K", + "algo": "raft_ivf_pq", + "build_param": {"niter": 25, "nlist": 10000, "pq_dim": 64, "ratio": 10}, + "file": "bigann-100M/raft_ivf_pq/dimpq64-cluster5K", + "search_params": [ + { "nprobe": 20, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 30, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 40, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 50, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 100, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 200, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 500, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 1000, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 20, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 30, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 40, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 50, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 100, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 200, "internalDistanceDtype": "float", 
"smemLutDtype": "fp8" }, + { "nprobe": 500, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 1000, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 20, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 30, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 40, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 50, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 100, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 200, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 500, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 1000, "internalDistanceDtype": "half", "smemLutDtype": "half" } + ] + }, + { + "name": "hnswlib.M12", + "algo": "hnswlib", "build_param": {"M":12, "efConstruction":500, "numThreads":32}, - "file" : "index/bigann-100M/hnswlib/M12", - "search_params" : [ + "file": "bigann-100M/hnswlib/M12", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -85,15 +95,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/bigann-100M/hnswlib/M12" + ] }, { - "name" : "hnswlib.M16", - "algo" : "hnswlib", + "name": "hnswlib.M16", + "algo": "hnswlib", "build_param": {"M":16, "efConstruction":500, "numThreads":32}, - "file" : "index/bigann-100M/hnswlib/M16", - "search_params" : [ + "file": "bigann-100M/hnswlib/M16", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -104,15 +113,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/bigann-100M/hnswlib/M16" + ] }, { - "name" : "hnswlib.M24", - "algo" : "hnswlib", + "name": "hnswlib.M24", + "algo": "hnswlib", "build_param": {"M":24, "efConstruction":500, 
"numThreads":32}, - "file" : "index/bigann-100M/hnswlib/M24", - "search_params" : [ + "file": "bigann-100M/hnswlib/M24", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -123,15 +131,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/bigann-100M/hnswlib/M24" + ] }, { - "name" : "hnswlib.M36", - "algo" : "hnswlib", + "name": "hnswlib.M36", + "algo": "hnswlib", "build_param": {"M":36, "efConstruction":500, "numThreads":32}, - "file" : "index/bigann-100M/hnswlib/M36", - "search_params" : [ + "file": "bigann-100M/hnswlib/M36", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -142,65 +149,48 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/bigann-100M/hnswlib/M36" + ] }, - - { - "name" : "raft_ivf_flat.nlist100K", - "algo" : "raft_ivf_flat", - "dataset_memtype": "host", - "build_param": { - "nlist" : 100000, - "niter" : 25, - "ratio" : 5 - }, - "file" : "index/bigann-100M/raft_ivf_flat/nlist100K", - "search_params" : [ - {"nprobe":20}, - {"nprobe":30}, - {"nprobe":40}, - {"nprobe":50}, - {"nprobe":100}, - {"nprobe":200}, - {"nprobe":500}, - {"nprobe":1000} - ], - "search_result_file" : "result/bigann-100M/raft_ivf_flat/nlist100K" + "name": "raft_ivf_flat.nlist100K", + "algo": "raft_ivf_flat", + "build_param": {"nlist": 100000, "niter": 25, "ratio": 5}, + "dataset_memtype":"host", + "file": "bigann-100M/raft_ivf_flat/nlist100K", + "search_params": [ + {"max_batch":10000, "max_k":10, "nprobe":20}, + {"max_batch":10000, "max_k":10, "nprobe":30}, + {"max_batch":10000, "max_k":10, "nprobe":40}, + {"max_batch":10000, "max_k":10, "nprobe":50}, + {"max_batch":10000, "max_k":10, "nprobe":100}, + {"max_batch":10000, "max_k":10, "nprobe":200}, + {"max_batch":10000, "max_k":10, "nprobe":500}, + {"max_batch":10000, 
"max_k":10, "nprobe":1000} + ] }, - { - "name" : "raft_cagra.dim32", - "algo" : "raft_cagra", + "name": "raft_cagra.dim32", + "algo": "raft_cagra", "dataset_memtype": "host", - "build_param": { - "index_dim" : 32 - }, - "file" : "index/bigann-100M/raft_cagra/dim32", - "search_params" : [ + "build_param": {"index_dim": 32}, + "file": "bigann-100M/raft_cagra/dim32", + "search_params": [ {"itopk": 32}, {"itopk": 64}, {"itopk": 128} - ], - "search_result_file" : "result/bigann-100M/raft_cagra/dim32" + ] }, - - { - "name" : "raft_cagra.dim64", - "algo" : "raft_cagra", - "dataset_memtype": "host", - "build_param": { - "index_dim" : 64 - }, - "file" : "index/bigann-100M/raft_cagra/dim64", - "search_params" : [ + "name": "raft_cagra.dim64", + "algo": "raft_cagra", + "dataset_memtype":"host", + "build_param": {"index_dim": 64}, + "file": "bigann-100M/raft_cagra/dim64", + "search_params": [ {"itopk": 32}, {"itopk": 64}, {"itopk": 128} - ], - "search_result_file" : "result/bigann-100M/raft_cagra/dim64" + ] } ] } diff --git a/bench/ann/conf/deep-100M.json b/bench/ann/conf/deep-100M.json index f3776b566e..6591957961 100644 --- a/bench/ann/conf/deep-100M.json +++ b/bench/ann/conf/deep-100M.json @@ -1,25 +1,25 @@ { - "dataset" : { - "name" : "deep-100M", - "base_file" : "data/deep-1B/base.1B.fbin", - "subset_size" : 100000000, - "query_file" : "data/deep-1B/query.public.10K.fbin", - "distance" : "euclidean" + "dataset": { + "name": "deep-100M", + "base_file": "data/deep-1B/base.1B.fbin", + "subset_size": 100000000, + "query_file": "data/deep-1B/query.public.10K.fbin", + "groundtruth_neighbors_file": "deep-100M/groundtruth.neighbors.ibin", + "distance": "euclidean" }, - "search_basic_param" : { - "batch_size" : 10000, - "k" : 10, - "run_count" : 2 + "search_basic_param": { + "batch_size": 10000, + "k": 10 }, - "index" : [ + "index": [ { - "name" : "hnswlib.M12", - "algo" : "hnswlib", + "name": "hnswlib.M12", + "algo": "hnswlib", "build_param": {"M":12, "efConstruction":500, 
"numThreads":32}, - "file" : "index/deep-100M/hnswlib/M12", - "search_params" : [ + "file": "deep-100M/hnswlib/M12", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -30,15 +30,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/deep-100M/hnswlib/M12" + ] }, { - "name" : "hnswlib.M16", - "algo" : "hnswlib", + "name": "hnswlib.M16", + "algo": "hnswlib", "build_param": {"M":16, "efConstruction":500, "numThreads":32}, - "file" : "index/deep-100M/hnswlib/M16", - "search_params" : [ + "file": "deep-100M/hnswlib/M16", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -49,15 +48,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/deep-100M/hnswlib/M16" + ] }, { - "name" : "hnswlib.M24", - "algo" : "hnswlib", + "name": "hnswlib.M24", + "algo": "hnswlib", "build_param": {"M":24, "efConstruction":500, "numThreads":32}, - "file" : "index/deep-100M/hnswlib/M24", - "search_params" : [ + "file": "deep-100M/hnswlib/M24", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -68,15 +66,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/deep-100M/hnswlib/M24" + ] }, { - "name" : "hnswlib.M36", - "algo" : "hnswlib", + "name": "hnswlib.M36", + "algo": "hnswlib", "build_param": {"M":36, "efConstruction":500, "numThreads":32}, - "file" : "index/deep-100M/hnswlib/M36", - "search_params" : [ + "file": "deep-100M/hnswlib/M36", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -87,15 +84,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/deep-100M/hnswlib/M36" + ] }, { - "name" : 
"faiss_ivf_flat.nlist50K", - "algo" : "faiss_gpu_ivf_flat", + "name": "faiss_ivf_flat.nlist50K", + "algo": "faiss_gpu_ivf_flat", "build_param": {"nlist":50000}, - "file" : "index/deep-100M/faiss_ivf_flat/nlist50K", - "search_params" : [ + "file": "deep-100M/faiss_ivf_flat/nlist50K", + "search_params": [ {"nprobe":20}, {"nprobe":30}, {"nprobe":40}, @@ -104,15 +100,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/deep-100M/faiss_ivf_flat/nlist50K" + ] }, { - "name" : "faiss_ivf_flat.nlist100K", - "algo" : "faiss_gpu_ivf_flat", + "name": "faiss_ivf_flat.nlist100K", + "algo": "faiss_gpu_ivf_flat", "build_param": {"nlist":100000}, - "file" : "index/deep-100M/faiss_ivf_flat/nlist100K", - "search_params" : [ + "file": "deep-100M/faiss_ivf_flat/nlist100K", + "search_params": [ {"nprobe":20}, {"nprobe":30}, {"nprobe":40}, @@ -121,15 +116,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/deep-100M/faiss_ivf_flat/nlist100K" + ] }, { - "name" : "faiss_ivf_flat.nlist200K", - "algo" : "faiss_gpu_ivf_flat", + "name": "faiss_ivf_flat.nlist200K", + "algo": "faiss_gpu_ivf_flat", "build_param": {"nlist":200000}, - "file" : "index/deep-100M/faiss_ivf_flat/nlist200K", - "search_params" : [ + "file": "deep-100M/faiss_ivf_flat/nlist200K", + "search_params": [ {"nprobe":20}, {"nprobe":30}, {"nprobe":40}, @@ -138,17 +132,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/deep-100M/faiss_ivf_flat/nlist200K" + ] }, - - { - "name" : "faiss_ivf_pq.M48-nlist16K", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M48-nlist16K", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":16384, "M":48}, - "file" : "index/deep-100M/faiss_ivf_pq/M48-nlist16K", - "search_params" : [ + "file": "deep-100M/faiss_ivf_pq/M48-nlist16K", + "search_params": [ {"nprobe":10}, {"nprobe":20}, {"nprobe":30}, @@ -157,15 +148,14 @@ {"nprobe":100}, {"nprobe":200}, {"nprobe":500} - ], - 
"search_result_file" : "result/deep-100M/faiss_ivf_pq/M48-nlist16K" + ] }, { - "name" : "faiss_ivf_pq.M48-nlist50K", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M48-nlist50K", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":50000, "M":48}, - "file" : "index/deep-100M/faiss_ivf_pq/M48-nlist50K", - "search_params" : [ + "file": "deep-100M/faiss_ivf_pq/M48-nlist50K", + "search_params": [ {"nprobe":20}, {"nprobe":30}, {"nprobe":40}, @@ -174,15 +164,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/deep-100M/faiss_ivf_pq/M48-nlist50K" + ] }, { - "name" : "faiss_ivf_pq.M48-nlist100K", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M48-nlist100K", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":100000, "M":48}, - "file" : "index/deep-100M/faiss_ivf_pq/M48-nlist100K", - "search_params" : [ + "file": "deep-100M/faiss_ivf_pq/M48-nlist100K", + "search_params": [ {"nprobe":20}, {"nprobe":30}, {"nprobe":40}, @@ -191,669 +180,107 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/deep-100M/faiss_ivf_pq/M48-nlist100K" + ] }, - - { - "name" : "raft_ivf_flat.nlist10K", - "algo" : "raft_ivf_flat", - "dataset_memtype": "host", - "build_param": { - "nlist" : 10000, - "niter" : 25, - "ratio" : 5 - }, - "file" : "index/deep-100M/raft_ivf_flat/nlist10K", - "search_params" : [ - {"nprobe":5}, - {"nprobe":10}, - {"nprobe":20}, - {"nprobe":30}, - {"nprobe":40}, - {"nprobe":50}, - {"nprobe":100}, - {"nprobe":200}, - {"nprobe":500}, - {"nprobe":1000} - ], - "search_result_file" : "result/deep-100M/raft_ivf_flat/nlist10K" + "name": "raft_ivf_flat.nlist100K", + "algo": "raft_ivf_flat", + "dataset_memtype":"host", + "build_param": {"nlist": 100000, "niter": 25, "ratio": 5}, + "file": "deep-100M/raft_ivf_flat/nlist100K", + "search_params": [ + {"max_batch":10000, "max_k":10, "nprobe":20}, + {"max_batch":10000, "max_k":10, "nprobe":30}, + {"max_batch":10000, "max_k":10, "nprobe":40}, + 
{"max_batch":10000, "max_k":10, "nprobe":50}, + {"max_batch":10000, "max_k":10, "nprobe":100}, + {"max_batch":10000, "max_k":10, "nprobe":200}, + {"max_batch":10000, "max_k":10, "nprobe":500}, + {"max_batch":10000, "max_k":10, "nprobe":1000} + ] }, { - "name" : "raft_ivf_flat.nlist100K", - "algo" : "raft_ivf_flat", - "dataset_memtype": "host", - "build_param": { - "nlist" : 100000, - "niter" : 25, - "ratio" : 5 - }, - "file" : "index/deep-100M/raft_ivf_flat/nlist100K", - "search_params" : [ - {"nprobe":5}, - {"nprobe":10}, - {"nprobe":20}, - {"nprobe":30}, - {"nprobe":40}, - {"nprobe":50}, - {"nprobe":100}, - {"nprobe":200}, - {"nprobe":500}, - {"nprobe":1000} - ], - "search_result_file" : "result/deep-100M/raft_ivf_flat/nlist100K" + "name": "raft_cagra.dim32", + "algo": "raft_cagra", + "dataset_memtype":"host", + "build_param": {"index_dim": 32, "intermediate_graph_degree": 48}, + "file": "deep-100M/raft_cagra/dim32", + "search_params": [ + {"itopk": 32, "search_width": 1, "max_iterations": 0, "algo": "single_cta"}, + {"itopk": 32, "search_width": 1, "max_iterations": 32, "algo": "single_cta"}, + {"itopk": 64, "search_width": 4, "max_iterations": 16, "algo": "single_cta"}, + {"itopk": 64, "search_width": 1, "max_iterations": 64, "algo": "single_cta"}, + {"itopk": 96, "search_width": 2, "max_iterations": 48, "algo": "single_cta"}, + {"itopk": 128, "search_width": 8, "max_iterations": 16, "algo": "single_cta"}, + {"itopk": 128, "search_width": 2, "max_iterations": 64, "algo": "single_cta"}, + {"itopk": 192, "search_width": 8, "max_iterations": 24, "algo": "single_cta"}, + {"itopk": 192, "search_width": 2, "max_iterations": 96, "algo": "single_cta"}, + {"itopk": 256, "search_width": 8, "max_iterations": 32, "algo": "single_cta"}, + {"itopk": 384, "search_width": 8, "max_iterations": 48, "algo": "single_cta"}, + {"itopk": 512, "search_width": 8, "max_iterations": 64, "algo": "single_cta"} + ] }, - { - "name" : "raft_ivf_pq.nlist10K", - "algo" : "raft_ivf_pq", - 
"dataset_memtype": "host", - "build_param": { - "nlist" : 10000, - "niter" : 25, - "ratio" : 5 - }, - "file" : "index/deep-100M/raft_ivf_pq/nlist10K", - "search_params" : [ - {"nprobe":3}, - {"nprobe":10}, - {"nprobe":20}, - {"nprobe":30}, - {"nprobe":40}, - {"nprobe":50}, - {"nprobe":100}, - {"nprobe":200}, - {"nprobe":500}, - {"nprobe":1000} - ], - "search_result_file" : "result/deep-100M/raft_ivf_pq/nlist10K" - }, + "name": "raft_cagra.dim32.multi_cta", + "algo": "raft_cagra", + "dataset_memtype":"host", + "build_param": {"index_dim": 32, "intermediate_graph_degree": 48}, + "file": "deep-100M/raft_cagra/dim32", + "search_params": [ + {"itopk": 32, "search_width": 1, "max_iterations": 0, "algo": "multi_cta"}, + {"itopk": 32, "search_width": 1, "max_iterations": 32, "algo": "multi_cta"}, + {"itopk": 64, "search_width": 4, "max_iterations": 16, "algo": "multi_cta"}, + {"itopk": 64, "search_width": 1, "max_iterations": 64, "algo": "multi_cta"}, + {"itopk": 96, "search_width": 2, "max_iterations": 48, "algo": "multi_cta"}, + {"itopk": 128, "search_width": 8, "max_iterations": 16, "algo": "multi_cta"}, + {"itopk": 128, "search_width": 2, "max_iterations": 64, "algo": "multi_cta"}, + {"itopk": 192, "search_width": 8, "max_iterations": 24, "algo": "multi_cta"}, + {"itopk": 192, "search_width": 2, "max_iterations": 96, "algo": "multi_cta"}, + {"itopk": 256, "search_width": 8, "max_iterations": 32, "algo": "multi_cta"}, + {"itopk": 384, "search_width": 8, "max_iterations": 48, "algo": "multi_cta"}, + {"itopk": 512, "search_width": 8, "max_iterations": 64, "algo": "multi_cta"} + ] { - "name" : "raft_ivf_pq.nlist10Kdim64", - "algo" : "raft_ivf_pq", - "dataset_memtype": "host", - "build_param": { - "nlist" : 10000, - "niter" : 25, - "ratio" : 5, - "pq_dim": 64 - }, - "file" : "index/deep-100M/raft_ivf_pq/nlist10Kdim64", - "search_params" : [ - {"nprobe":5}, - {"nprobe":10}, - {"nprobe":20}, - {"nprobe":30}, - {"nprobe":40}, - {"nprobe":50}, - {"nprobe":100}, - 
{"nprobe":200}, - {"nprobe":500}, - {"nprobe":1000} - ], - "search_result_file" : "result/deep-100M/raft_ivf_pq/nlist10Kdim64" + "name": "raft_cagra.dim32.multi_kernel", + "algo": "raft_cagra", + "dataset_memtype":"host", + "build_param": {"index_dim": 32, "intermediate_graph_degree": 48}, + "file": "deep-100M/raft_cagra/dim32", + "search_params": [ + {"itopk": 32, "search_width": 1, "max_iterations": 0, "algo": "multi_kernel"}, + {"itopk": 32, "search_width": 1, "max_iterations": 32, "algo": "multi_kernel"}, + {"itopk": 64, "search_width": 4, "max_iterations": 16, "algo": "multi_kernel"}, + {"itopk": 64, "search_width": 1, "max_iterations": 64, "algo": "multi_kernel"}, + {"itopk": 96, "search_width": 2, "max_iterations": 48, "algo": "multi_kernel"}, + {"itopk": 128, "search_width": 8, "max_iterations": 16, "algo": "multi_kernel"}, + {"itopk": 128, "search_width": 2, "max_iterations": 64, "algo": "multi_kernel"}, + {"itopk": 192, "search_width": 8, "max_iterations": 24, "algo": "multi_kernel"}, + {"itopk": 192, "search_width": 2, "max_iterations": 96, "algo": "multi_kernel"}, + {"itopk": 256, "search_width": 8, "max_iterations": 32, "algo": "multi_kernel"}, + {"itopk": 384, "search_width": 8, "max_iterations": 48, "algo": "multi_kernel"}, + {"itopk": 512, "search_width": 8, "max_iterations": 64, "algo": "multi_kernel"} + ] }, { - "name" : "raft_ivf_pq.nlist10Kdim32", - "algo" : "raft_ivf_pq", - "dataset_memtype": "host", - "build_param": { - "nlist" : 10000, - "niter" : 25, - "ratio" : 5, - "pq_dim": 32 - }, - "file" : "index/deep-100M/raft_ivf_pq/nlist10Kdim32", - "search_params" : [ - {"nprobe":5}, - {"nprobe":10}, - {"nprobe":20}, - {"nprobe":30}, - {"nprobe":40}, - {"nprobe":50}, - {"nprobe":100}, - {"nprobe":200}, - {"nprobe":500}, - {"nprobe":1000} - ], - "search_result_file" : "result/deep-100M/raft_ivf_pq/nlist10Kdim32" - }, - { - "name" : "raft_ivf_pq.nlist100K", - "algo" : "raft_ivf_pq", - "dataset_memtype": "host", - "build_param": { - "nlist" : 100000, 
- "niter" : 25, - "ratio" : 5 - }, - "file" : "index/deep-100M/raft_ivf_pq/nlist100K", - "search_params" : [ - {"nprobe":5}, - {"nprobe":10}, - {"nprobe":20}, - {"nprobe":30}, - {"nprobe":40}, - {"nprobe":50}, - {"nprobe":100}, - {"nprobe":200}, - {"nprobe":500}, - {"nprobe":1000} - ], - "search_result_file" : "result/deep-100M/raft_ivf_pq/nlist100K" - }, - { - "name" : "raft_ivf_pq.nlist100Kdim64", - "algo" : "raft_ivf_pq", - "dataset_memtype": "host", - "build_param": { - "nlist" : 100000, - "niter" : 25, - "ratio" : 5, - "pq_dim": 64 - }, - "file" : "index/deep-100M/raft_ivf_pq/nlist100Kdim64", - "search_params" : [ - {"nprobe":5}, - {"nprobe":10}, - {"nprobe":20}, - {"nprobe":30}, - {"nprobe":40}, - {"nprobe":50}, - {"nprobe":100}, - {"nprobe":200}, - {"nprobe":500}, - {"nprobe":1000} - ], - "search_result_file" : "result/deep-100M/raft_ivf_pq/nlist100Kdim64" - }, - { - "name" : "raft_ivf_pq.nlist100Kdim32", - "algo" : "raft_ivf_pq", - "dataset_memtype": "host", - "build_param": { - "nlist" : 100000, - "niter" : 25, - "ratio" : 5, - "pq_dim": 32 - }, - "file" : "index/deep-100M/raft_ivf_pq/nlist100Kdim32", - "search_params" : [ - {"nprobe":5}, - {"nprobe":10}, - {"nprobe":20}, - {"nprobe":30}, - {"nprobe":40}, - {"nprobe":50}, - {"nprobe":100}, - {"nprobe":200}, - {"nprobe":500}, - {"nprobe":1000} - ], - "search_result_file" : "result/deep-100M/raft_ivf_pq/nlist100Kdim32" - }, - - { - "name" : "raft_cagra.dim32", - "algo" : "raft_cagra", - "dataset_memtype": "host", - "build_param": { - "index_dim": 32, - "intermediate_graph_degree": 48 - }, - "file": "index/deep-100M/raft_cagra/dim32", - "search_params" : [ - { - "itopk": 32, - "search_width": 1, - "max_iterations": 0, - "algo": "single_cta" - }, - { - "itopk": 32, - "search_width": 1, - "max_iterations": 32, - "algo": "single_cta" - }, - { - "itopk": 64, - "search_width": 4, - "max_iterations": 16, - "algo": "single_cta" - }, - { - "itopk": 64, - "search_width": 1, - "max_iterations": 64, - "algo": 
"single_cta" - }, - { - "itopk": 96, - "search_width": 2, - "max_iterations": 48, - "algo": "single_cta" - }, - { - "itopk": 128, - "search_width": 8, - "max_iterations": 16, - "algo": "single_cta" - }, - { - "itopk": 128, - "search_width": 2, - "max_iterations": 64, - "algo": "single_cta" - }, - { - "itopk": 192, - "search_width": 8, - "max_iterations": 24, - "algo": "single_cta" - }, - { - "itopk": 192, - "search_width": 2, - "max_iterations": 96, - "algo": "single_cta" - }, - { - "itopk": 256, - "search_width": 8, - "max_iterations": 32, - "algo": "single_cta" - }, - { - "itopk": 384, - "search_width": 8, - "max_iterations": 48, - "algo": "single_cta" - }, - { - "itopk": 512, - "search_width": 8, - "max_iterations": 64, - "algo": "single_cta" - }, - - { - "itopk": 32, - "search_width": 1, - "max_iterations": 0, - "algo": "multi_cta" - }, - { - "itopk": 32, - "search_width": 1, - "max_iterations": 32, - "algo": "multi_cta" - }, - { - "itopk": 64, - "search_width": 4, - "max_iterations": 16, - "algo": "multi_cta" - }, - { - "itopk": 64, - "search_width": 1, - "max_iterations": 64, - "algo": "multi_cta" - }, - { - "itopk": 96, - "search_width": 2, - "max_iterations": 48, - "algo": "multi_cta" - }, - { - "itopk": 128, - "search_width": 8, - "max_iterations": 16, - "algo": "multi_cta" - }, - { - "itopk": 128, - "search_width": 2, - "max_iterations": 64, - "algo": "multi_cta" - }, - { - "itopk": 192, - "search_width": 8, - "max_iterations": 24, - "algo": "multi_cta" - }, - { - "itopk": 192, - "search_width": 2, - "max_iterations": 96, - "algo": "multi_cta" - }, - { - "itopk": 256, - "search_width": 8, - "max_iterations": 32, - "algo": "multi_cta" - }, - { - "itopk": 384, - "search_width": 8, - "max_iterations": 48, - "algo": "multi_cta" - }, - { - "itopk": 512, - "search_width": 8, - "max_iterations": 64, - "algo": "multi_cta" - }, - - { - "itopk": 32, - "search_width": 1, - "max_iterations": 0, - "algo": "multi_kernel" - }, - { - "itopk": 32, - "search_width": 1, - 
"max_iterations": 32, - "algo": "multi_kernel" - }, - { - "itopk": 64, - "search_width": 4, - "max_iterations": 16, - "algo": "multi_kernel" - }, - { - "itopk": 64, - "search_width": 1, - "max_iterations": 64, - "algo": "multi_kernel" - }, - { - "itopk": 96, - "search_width": 2, - "max_iterations": 48, - "algo": "multi_kernel" - }, - { - "itopk": 128, - "search_width": 8, - "max_iterations": 16, - "algo": "multi_kernel" - }, - { - "itopk": 128, - "search_width": 2, - "max_iterations": 64, - "algo": "multi_kernel" - }, - { - "itopk": 192, - "search_width": 8, - "max_iterations": 24, - "algo": "multi_kernel" - }, - { - "itopk": 192, - "search_width": 2, - "max_iterations": 96, - "algo": "multi_kernel" - }, - { - "itopk": 256, - "search_width": 8, - "max_iterations": 32, - "algo": "multi_kernel" - }, - { - "itopk": 384, - "search_width": 8, - "max_iterations": 48, - "algo": "multi_kernel" - }, - { - "itopk": 512, - "search_width": 8, - "max_iterations": 64, - "algo": "multi_kernel" - } - ], - "search_result_file": "result/deep-100M/raft_cagra/dim32" - }, - { - "name" : "raft_cagra.dim64", - "algo" : "raft_cagra", - "dataset_memtype": "host", - "build_param": { - "index_dim": 64 - }, - "file": "index/deep-100M/raft_cagra/dim64", - "search_params" : [ - { - "itopk": 32, - "search_width": 1, - "max_iterations": 0, - "algo": "single_cta" - }, - { - "itopk": 32, - "search_width": 1, - "max_iterations": 32, - "algo": "single_cta" - }, - { - "itopk": 64, - "search_width": 4, - "max_iterations": 16, - "algo": "single_cta" - }, - { - "itopk": 64, - "search_width": 1, - "max_iterations": 64, - "algo": "single_cta" - }, - { - "itopk": 96, - "search_width": 2, - "max_iterations": 48, - "algo": "single_cta" - }, - { - "itopk": 128, - "search_width": 8, - "max_iterations": 16, - "algo": "single_cta" - }, - { - "itopk": 128, - "search_width": 2, - "max_iterations": 64, - "algo": "single_cta" - }, - { - "itopk": 192, - "search_width": 8, - "max_iterations": 24, - "algo": "single_cta" 
- }, - { - "itopk": 192, - "search_width": 2, - "max_iterations": 96, - "algo": "single_cta" - }, - { - "itopk": 256, - "search_width": 8, - "max_iterations": 32, - "algo": "single_cta" - }, - { - "itopk": 384, - "search_width": 8, - "max_iterations": 48, - "algo": "single_cta" - }, - { - "itopk": 512, - "search_width": 8, - "max_iterations": 64, - "algo": "single_cta" - }, - - { - "itopk": 32, - "search_width": 1, - "max_iterations": 0, - "algo": "multi_cta" - }, - { - "itopk": 32, - "search_width": 1, - "max_iterations": 32, - "algo": "multi_cta" - }, - { - "itopk": 64, - "search_width": 4, - "max_iterations": 16, - "algo": "multi_cta" - }, - { - "itopk": 64, - "search_width": 1, - "max_iterations": 64, - "algo": "multi_cta" - }, - { - "itopk": 96, - "search_width": 2, - "max_iterations": 48, - "algo": "multi_cta" - }, - { - "itopk": 128, - "search_width": 8, - "max_iterations": 16, - "algo": "multi_cta" - }, - { - "itopk": 128, - "search_width": 2, - "max_iterations": 64, - "algo": "multi_cta" - }, - { - "itopk": 192, - "search_width": 8, - "max_iterations": 24, - "algo": "multi_cta" - }, - { - "itopk": 192, - "search_width": 2, - "max_iterations": 96, - "algo": "multi_cta" - }, - { - "itopk": 256, - "search_width": 8, - "max_iterations": 32, - "algo": "multi_cta" - }, - { - "itopk": 384, - "search_width": 8, - "max_iterations": 48, - "algo": "multi_cta" - }, - { - "itopk": 512, - "search_width": 8, - "max_iterations": 64, - "algo": "multi_cta" - }, - - { - "itopk": 32, - "search_width": 1, - "max_iterations": 0, - "algo": "multi_kernel" - }, - { - "itopk": 32, - "search_width": 1, - "max_iterations": 32, - "algo": "multi_kernel" - }, - { - "itopk": 64, - "search_width": 4, - "max_iterations": 16, - "algo": "multi_kernel" - }, - { - "itopk": 64, - "search_width": 1, - "max_iterations": 64, - "algo": "multi_kernel" - }, - { - "itopk": 96, - "search_width": 2, - "max_iterations": 48, - "algo": "multi_kernel" - }, - { - "itopk": 128, - "search_width": 8, - 
"max_iterations": 16, - "algo": "multi_kernel" - }, - { - "itopk": 128, - "search_width": 2, - "max_iterations": 64, - "algo": "multi_kernel" - }, - { - "itopk": 192, - "search_width": 8, - "max_iterations": 24, - "algo": "multi_kernel" - }, - { - "itopk": 192, - "search_width": 2, - "max_iterations": 96, - "algo": "multi_kernel" - }, - { - "itopk": 256, - "search_width": 8, - "max_iterations": 32, - "algo": "multi_kernel" - }, - { - "itopk": 384, - "search_width": 8, - "max_iterations": 48, - "algo": "multi_kernel" - }, - { - "itopk": 512, - "search_width": 8, - "max_iterations": 64, - "algo": "multi_kernel" - } - ], - "search_result_file": "result/deep-100M/raft_cagra/dim64" + "name": "raft_cagra.dim64", + "algo": "raft_cagra", + "dataset_memtype":"host", + "build_param": {"index_dim": 64}, + "file": "deep-100M/raft_cagra/dim64", + "search_params": [ + {"itopk": 32, "search_width": 1, "max_iterations": 0}, + {"itopk": 32, "search_width": 1, "max_iterations": 32}, + {"itopk": 64, "search_width": 4, "max_iterations": 16}, + {"itopk": 64, "search_width": 1, "max_iterations": 64}, + {"itopk": 96, "search_width": 2, "max_iterations": 48}, + {"itopk": 128, "search_width": 8, "max_iterations": 16}, + {"itopk": 128, "search_width": 2, "max_iterations": 64}, + {"itopk": 192, "search_width": 8, "max_iterations": 24}, + {"itopk": 192, "search_width": 2, "max_iterations": 96}, + {"itopk": 256, "search_width": 8, "max_iterations": 32}, + {"itopk": 384, "search_width": 8, "max_iterations": 48}, + {"itopk": 512, "search_width": 8, "max_iterations": 64} + ] } ] } diff --git a/bench/ann/conf/deep-1B.json b/bench/ann/conf/deep-1B.json index 50d1b87602..632d2f7308 100644 --- a/bench/ann/conf/deep-1B.json +++ b/bench/ann/conf/deep-1B.json @@ -1,25 +1,24 @@ { - "dataset" : { - "name" : "deep-1B", - "base_file" : "data/deep-1B/base.1B.fbin", - "query_file" : "data/deep-1B/query.public.10K.fbin", - // although distance should be "euclidean", faiss becomes much slower for that - 
"distance" : "inner_product"
+ "dataset": {
+ "name": "deep-1B",
+ "base_file": "deep-1B/base.1B.fbin",
+ "query_file": "deep-1B/query.public.10K.fbin",
+ "groundtruth_neighbors_file": "deep-1B/groundtruth.neighbors.ibin",
+ "distance": "inner_product"
 },
- "search_basic_param" : {
- "batch_size" : 10000,
- "k" : 10,
- "run_count" : 2
+ "search_basic_param": {
+ "batch_size": 10000,
+ "k": 10
 },
- "index" : [
+ "index": [
 {
- "name" : "faiss_ivf_pq.M48-nlist50K",
- "algo" : "faiss_gpu_ivf_pq",
+ "name": "faiss_ivf_pq.M48-nlist50K",
+ "algo": "faiss_gpu_ivf_pq",
 "build_param": {"nlist":50000, "M":48},
- "file" : "index/deep-1B/faiss_ivf_pq/M48-nlist50K",
- "search_params" : [
+ "file": "deep-1B/faiss_ivf_pq/M48-nlist50K",
+ "search_params": [
 {"nprobe":1},
 {"nprobe":5},
 {"nprobe":10},
@@ -29,10 +28,7 @@
 {"nprobe":500},
 {"nprobe":1000},
 {"nprobe":2000}
- ],
- "search_result_file" : "result/deep-1B/faiss_ivf_pq/M48-nlist50K"
- },
-
-
+ ]
+ }
 ]
 }
diff --git a/bench/ann/conf/glove-100-inner.json b/bench/ann/conf/glove-100-inner.json
index 5d0bbf970c..7c95ceb439 100644
--- a/bench/ann/conf/glove-100-inner.json
+++ b/bench/ann/conf/glove-100-inner.json
@@ -1,24 +1,24 @@
 {
- "dataset" : {
- "name" : "glove-100-inner",
- "base_file" : "data/glove-100-inner/base.fbin",
- "query_file" : "data/glove-100-inner/query.fbin",
- "distance" : "inner_product"
+ "dataset": {
+ "name": "glove-100-inner",
+ "base_file": "glove-100-inner/base.fbin",
+ "query_file": "glove-100-inner/query.fbin",
+ "groundtruth_neighbors_file": "glove-100-inner/groundtruth.neighbors.ibin",
+ "distance": "inner_product"
 },
- "search_basic_param" : {
- "batch_size" : 1,
- "k" : 10,
- "run_count" : 3
+ "search_basic_param": {
+ "batch_size": 1,
+ "k": 10
 },
- "index" : [
+ "index": [
 {
- "name" : "hnswlib.M4",
- "algo" : "hnswlib",
+ "name": "hnswlib.M4",
+ "algo": "hnswlib",
 "build_param": {"M":4, "efConstruction":500, "numThreads":4},
- "file" : "index/glove-100-inner/hnswlib/M4",
- "search_params" : [ +
"file": "glove-100-inner/hnswlib/M4", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -28,16 +28,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/glove-100-inner/hnswlib/M4" + ] }, - { - "name" : "hnswlib.M8", - "algo" : "hnswlib", + "name": "hnswlib.M8", + "algo": "hnswlib", "build_param": {"M":8, "efConstruction":500, "numThreads":4}, - "file" : "index/glove-100-inner/hnswlib/M8", - "search_params" : [ + "file": "glove-100-inner/hnswlib/M8", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -47,16 +45,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/glove-100-inner/hnswlib/M8" + ] }, - { - "name" : "hnswlib.M12", - "algo" : "hnswlib", + "name": "hnswlib.M12", + "algo": "hnswlib", "build_param": {"M":12, "efConstruction":500, "numThreads":4}, - "file" : "index/glove-100-inner/hnswlib/M12", - "search_params" : [ + "file": "glove-100-inner/hnswlib/M12", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -66,16 +62,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/glove-100-inner/hnswlib/M12" + ] }, - { - "name" : "hnswlib.M16", - "algo" : "hnswlib", + "name": "hnswlib.M16", + "algo": "hnswlib", "build_param": {"M":16, "efConstruction":500, "numThreads":4}, - "file" : "index/glove-100-inner/hnswlib/M16", - "search_params" : [ + "file": "glove-100-inner/hnswlib/M16", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -85,16 +79,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/glove-100-inner/hnswlib/M16" + ] }, - { - "name" : "hnswlib.M24", - 
"algo" : "hnswlib", + "name": "hnswlib.M24", + "algo": "hnswlib", "build_param": {"M":24, "efConstruction":500, "numThreads":4}, - "file" : "index/glove-100-inner/hnswlib/M24", - "search_params" : [ + "file": "glove-100-inner/hnswlib/M24", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -104,16 +96,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/glove-100-inner/hnswlib/M24" + ] }, - { - "name" : "hnswlib.M36", - "algo" : "hnswlib", + "name": "hnswlib.M36", + "algo": "hnswlib", "build_param": {"M":36, "efConstruction":500, "numThreads":4}, - "file" : "index/glove-100-inner/hnswlib/M36", - "search_params" : [ + "file": "glove-100-inner/hnswlib/M36", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -123,16 +113,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/glove-100-inner/hnswlib/M36" + ] }, - { - "name" : "hnswlib.M48", - "algo" : "hnswlib", + "name": "hnswlib.M48", + "algo": "hnswlib", "build_param": {"M":48, "efConstruction":500, "numThreads":4}, - "file" : "index/glove-100-inner/hnswlib/M48", - "search_params" : [ + "file": "glove-100-inner/hnswlib/M48", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -142,16 +130,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/glove-100-inner/hnswlib/M48" + ] }, - { - "name" : "hnswlib.M64", - "algo" : "hnswlib", + "name": "hnswlib.M64", + "algo": "hnswlib", "build_param": {"M":64, "efConstruction":500, "numThreads":4}, - "file" : "index/glove-100-inner/hnswlib/M64", - "search_params" : [ + "file": "glove-100-inner/hnswlib/M64", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, 
@@ -161,16 +147,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/glove-100-inner/hnswlib/M64" + ] }, - { - "name" : "hnswlib.M96", - "algo" : "hnswlib", + "name": "hnswlib.M96", + "algo": "hnswlib", "build_param": {"M":96, "efConstruction":500, "numThreads":4}, - "file" : "index/glove-100-inner/hnswlib/M96", - "search_params" : [ + "file": "glove-100-inner/hnswlib/M96", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -180,16 +164,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/glove-100-inner/hnswlib/M96" + ] }, - { - "name" : "faiss_ivf_flat.nlist1024", - "algo" : "faiss_gpu_ivf_flat", + "name": "faiss_ivf_flat.nlist1024", + "algo": "faiss_gpu_ivf_flat", "build_param": {"nlist":1024}, - "file" : "index/glove-100-inner/faiss_ivf_flat/nlist1024", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_flat/nlist1024", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -198,16 +180,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_flat/nlist1024" + ] }, - { - "name" : "faiss_ivf_flat.nlist2048", - "algo" : "faiss_gpu_ivf_flat", + "name": "faiss_ivf_flat.nlist2048", + "algo": "faiss_gpu_ivf_flat", "build_param": {"nlist":2048}, - "file" : "index/glove-100-inner/faiss_ivf_flat/nlist2048", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_flat/nlist2048", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -216,16 +196,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_flat/nlist2048" + ] }, - { - "name" : "faiss_ivf_flat.nlist4096", - "algo" : "faiss_gpu_ivf_flat", + "name": "faiss_ivf_flat.nlist4096", + "algo": "faiss_gpu_ivf_flat", "build_param": {"nlist":4096}, - "file" : 
"index/glove-100-inner/faiss_ivf_flat/nlist4096", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_flat/nlist4096", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -234,16 +212,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_flat/nlist4096" + ] }, - { - "name" : "faiss_ivf_flat.nlist8192", - "algo" : "faiss_gpu_ivf_flat", + "name": "faiss_ivf_flat.nlist8192", + "algo": "faiss_gpu_ivf_flat", "build_param": {"nlist":8192}, - "file" : "index/glove-100-inner/faiss_ivf_flat/nlist8192", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_flat/nlist8192", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -252,16 +228,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_flat/nlist8192" + ] }, - { - "name" : "faiss_ivf_flat.nlist16384", - "algo" : "faiss_gpu_ivf_flat", + "name": "faiss_ivf_flat.nlist16384", + "algo": "faiss_gpu_ivf_flat", "build_param": {"nlist":16384}, - "file" : "index/glove-100-inner/faiss_ivf_flat/nlist16384", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_flat/nlist16384", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -271,18 +245,17 @@ {"nprobe":500}, {"nprobe":1000}, {"nprobe":2000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_flat/nlist16384" + ] }, { - "name" : "faiss_ivf_pq.M2-nlist1024", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M2-nlist1024", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":1024, "M":2}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M2-nlist1024", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M2-nlist1024", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -291,16 +264,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M2-nlist1024" + ] }, - { - "name" : 
"faiss_ivf_pq.M2-nlist2048", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M2-nlist2048", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":2048, "M":2}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M2-nlist2048", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M2-nlist2048", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -309,16 +280,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M2-nlist2048" + ] }, - { - "name" : "faiss_ivf_pq.M2-nlist4096", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M2-nlist4096", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":4096, "M":2}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M2-nlist4096", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M2-nlist4096", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -327,16 +296,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M2-nlist4096" + ] }, - { - "name" : "faiss_ivf_pq.M2-nlist8192", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M2-nlist8192", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":8192, "M":2}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M2-nlist8192", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M2-nlist8192", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -345,16 +312,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M2-nlist8192" + ] }, - { - "name" : "faiss_ivf_pq.M2-nlist16384", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M2-nlist16384", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":16384, "M":2}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M2-nlist16384", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M2-nlist16384", + "search_params": [ {"nprobe":1}, {"nprobe":5}, 
{"nprobe":10}, @@ -364,16 +329,14 @@ {"nprobe":500}, {"nprobe":1000}, {"nprobe":2000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M2-nlist16384" + ] }, - { - "name" : "faiss_ivf_pq.M4-nlist1024", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M4-nlist1024", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":1024, "M":4}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M4-nlist1024", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M4-nlist1024", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -382,16 +345,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M4-nlist1024" + ] }, - { - "name" : "faiss_ivf_pq.M4-nlist2048", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M4-nlist2048", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":2048, "M":4}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M4-nlist2048", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M4-nlist2048", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -400,16 +361,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M4-nlist2048" + ] }, - { - "name" : "faiss_ivf_pq.M4-nlist4096", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M4-nlist4096", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":4096, "M":4}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M4-nlist4096", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M4-nlist4096", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -418,16 +377,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M4-nlist4096" + ] }, - { - "name" : "faiss_ivf_pq.M4-nlist8192", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M4-nlist8192", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":8192, "M":4}, - "file" : 
"index/glove-100-inner/faiss_ivf_pq/M4-nlist8192", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M4-nlist8192", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -436,16 +393,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M4-nlist8192" + ] }, - { - "name" : "faiss_ivf_pq.M4-nlist16384", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M4-nlist16384", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":16384, "M":4}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M4-nlist16384", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M4-nlist16384", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -455,16 +410,14 @@ {"nprobe":500}, {"nprobe":1000}, {"nprobe":2000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M4-nlist16384" + ] }, - { - "name" : "faiss_ivf_pq.M20-nlist1024", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M20-nlist1024", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":1024, "M":20}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M20-nlist1024", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M20-nlist1024", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -473,16 +426,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M20-nlist1024" + ] }, - { - "name" : "faiss_ivf_pq.M20-nlist2048", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M20-nlist2048", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":2048, "M":20}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M20-nlist2048", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M20-nlist2048", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -491,16 +442,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M20-nlist2048" + ] }, - 
{ - "name" : "faiss_ivf_pq.M20-nlist4096", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M20-nlist4096", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":4096, "M":20}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M20-nlist4096", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M20-nlist4096", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -509,16 +458,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M20-nlist4096" + ] }, - { - "name" : "faiss_ivf_pq.M20-nlist8192", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M20-nlist8192", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":8192, "M":20}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M20-nlist8192", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M20-nlist8192", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -527,16 +474,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M20-nlist8192" + ] }, - { - "name" : "faiss_ivf_pq.M20-nlist16384", - "algo" : "faiss_gpu_ivf_pq", + "name": "faiss_ivf_pq.M20-nlist16384", + "algo": "faiss_gpu_ivf_pq", "build_param": {"nlist":16384, "M":20}, - "file" : "index/glove-100-inner/faiss_ivf_pq/M20-nlist16384", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_pq/M20-nlist16384", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -546,17 +491,16 @@ {"nprobe":500}, {"nprobe":1000}, {"nprobe":2000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_pq/M20-nlist16384" + ] }, { - "name" : "faiss_ivf_sq.nlist1024-fp16", - "algo" : "faiss_gpu_ivf_sq", + "name": "faiss_ivf_sq.nlist1024-fp16", + "algo": "faiss_gpu_ivf_sq", "build_param": {"nlist":1024, "quantizer_type":"fp16"}, - "file" : "index/glove-100-inner/faiss_ivf_sq/nlist1024-fp16", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_sq/nlist1024-fp16", + 
"search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -565,16 +509,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_sq/nlist1024-fp16" + ] }, - { - "name" : "faiss_ivf_sq.nlist2048-fp16", - "algo" : "faiss_gpu_ivf_sq", + "name": "faiss_ivf_sq.nlist2048-fp16", + "algo": "faiss_gpu_ivf_sq", "build_param": {"nlist":2048, "quantizer_type":"fp16"}, - "file" : "index/glove-100-inner/faiss_ivf_sq/nlist2048-fp16", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_sq/nlist2048-fp16", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -583,16 +525,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_sq/nlist2048-fp16" + ] }, - { - "name" : "faiss_ivf_sq.nlist4096-fp16", - "algo" : "faiss_gpu_ivf_sq", + "name": "faiss_ivf_sq.nlist4096-fp16", + "algo": "faiss_gpu_ivf_sq", "build_param": {"nlist":4096, "quantizer_type":"fp16"}, - "file" : "index/glove-100-inner/faiss_ivf_sq/nlist4096-fp16", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_sq/nlist4096-fp16", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -601,16 +541,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_sq/nlist4096-fp16" + ] }, - { - "name" : "faiss_ivf_sq.nlist8192-fp16", - "algo" : "faiss_gpu_ivf_sq", + "name": "faiss_ivf_sq.nlist8192-fp16", + "algo": "faiss_gpu_ivf_sq", "build_param": {"nlist":8192, "quantizer_type":"fp16"}, - "file" : "index/glove-100-inner/faiss_ivf_sq/nlist8192-fp16", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_sq/nlist8192-fp16", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -619,16 +557,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_sq/nlist8192-fp16" + ] }, - { - "name" : "faiss_ivf_sq.nlist16384-fp16", - "algo" : 
"faiss_gpu_ivf_sq", + "name": "faiss_ivf_sq.nlist16384-fp16", + "algo": "faiss_gpu_ivf_sq", "build_param": {"nlist":16384, "quantizer_type":"fp16"}, - "file" : "index/glove-100-inner/faiss_ivf_sq/nlist16384-fp16", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_sq/nlist16384-fp16", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -638,17 +574,14 @@ {"nprobe":500}, {"nprobe":1000}, {"nprobe":2000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_sq/nlist16384-fp16" + ] }, - - { - "name" : "faiss_ivf_sq.nlist1024-int8", - "algo" : "faiss_gpu_ivf_sq", + "name": "faiss_ivf_sq.nlist1024-int8", + "algo": "faiss_gpu_ivf_sq", "build_param": {"nlist":1024, "quantizer_type":"int8"}, - "file" : "index/glove-100-inner/faiss_ivf_sq/nlist1024-int8", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_sq/nlist1024-int8", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -657,16 +590,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_sq/nlist1024-int8" + ] }, - { - "name" : "faiss_ivf_sq.nlist2048-int8", - "algo" : "faiss_gpu_ivf_sq", + "name": "faiss_ivf_sq.nlist2048-int8", + "algo": "faiss_gpu_ivf_sq", "build_param": {"nlist":2048, "quantizer_type":"int8"}, - "file" : "index/glove-100-inner/faiss_ivf_sq/nlist2048-int8", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_sq/nlist2048-int8", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -675,16 +606,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_sq/nlist2048-int8" + ] }, - { - "name" : "faiss_ivf_sq.nlist4096-int8", - "algo" : "faiss_gpu_ivf_sq", + "name": "faiss_ivf_sq.nlist4096-int8", + "algo": "faiss_gpu_ivf_sq", "build_param": {"nlist":4096, "quantizer_type":"int8"}, - "file" : "index/glove-100-inner/faiss_ivf_sq/nlist4096-int8", - "search_params" : [ + "file": 
"glove-100-inner/faiss_ivf_sq/nlist4096-int8", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -693,16 +622,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_sq/nlist4096-int8" + ] }, - { - "name" : "faiss_ivf_sq.nlist8192-int8", - "algo" : "faiss_gpu_ivf_sq", + "name": "faiss_ivf_sq.nlist8192-int8", + "algo": "faiss_gpu_ivf_sq", "build_param": {"nlist":8192, "quantizer_type":"int8"}, - "file" : "index/glove-100-inner/faiss_ivf_sq/nlist8192-int8", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_sq/nlist8192-int8", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -711,16 +638,14 @@ {"nprobe":200}, {"nprobe":500}, {"nprobe":1000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_sq/nlist8192-int8" + ] }, - { - "name" : "faiss_ivf_sq.nlist16384-int8", - "algo" : "faiss_gpu_ivf_sq", + "name": "faiss_ivf_sq.nlist16384-int8", + "algo": "faiss_gpu_ivf_sq", "build_param": {"nlist":16384, "quantizer_type":"int8"}, - "file" : "index/glove-100-inner/faiss_ivf_sq/nlist16384-int8", - "search_params" : [ + "file": "glove-100-inner/faiss_ivf_sq/nlist16384-int8", + "search_params": [ {"nprobe":1}, {"nprobe":5}, {"nprobe":10}, @@ -730,22 +655,18 @@ {"nprobe":500}, {"nprobe":1000}, {"nprobe":2000} - ], - "search_result_file" : "result/glove-100-inner/faiss_ivf_sq/nlist16384-int8" + ] }, - { - "name" : "faiss_flat", - "algo" : "faiss_gpu_flat", + "name": "faiss_flat", + "algo": "faiss_gpu_flat", "build_param": {}, - "file" : "index/glove-100-inner/faiss_flat/flat", - "search_params" : [{}], - "search_result_file" : "result/glove-100-inner/faiss_flat/flat" + "file": "glove-100-inner/faiss_flat/flat", + "search_params": [{}] }, - { - "name" : "ggnn.kbuild96-segment64-refine2-k10", - "algo" : "ggnn", + "name": "ggnn.kbuild96-segment64-refine2-k10", + "algo": "ggnn", "build_param": { "k_build": 96, "segment_size": 64, @@ -753,8 +674,8 @@ "dataset_size": 
1183514,
 "k": 10
 },
- "file" : "index/glove-100-inner/ggnn/kbuild96-segment64-refine2-k10",
- "search_params" : [
+ "file": "glove-100-inner/ggnn/kbuild96-segment64-refine2-k10",
+ "search_params": [
 {"tau":0.001, "block_dim":64, "sorted_size":32},
 {"tau":0.005, "block_dim":64, "sorted_size":32},
 {"tau":0.01, "block_dim":64, "sorted_size":32},
@@ -786,8 +707,7 @@
 {"tau":0.3, "block_dim":128, "max_iterations":2000, "cache_size":1024, "sorted_size":32},
 {"tau":0.4, "block_dim":128, "max_iterations":2000, "cache_size":1024, "sorted_size":32},
 {"tau":0.5, "block_dim":128, "max_iterations":2000, "cache_size":1024, "sorted_size":32}
-
- ],
- "search_result_file" : "result/glove-100-inner/ggnn/kbuild96-segment64-refine2-k10"
- }]
+ ]
+ }
+ ]
 }
diff --git a/bench/ann/conf/sift-128-euclidean.json b/bench/ann/conf/sift-128-euclidean.json
index 2f9956ed3d..116ea8d557 100644
--- a/bench/ann/conf/sift-128-euclidean.json
+++ b/bench/ann/conf/sift-128-euclidean.json
@@ -1,22 +1,24 @@
 {
 "dataset": {
 "name": "sift-128-euclidean",
- "base_file": "data/sift-128-euclidean/base.fbin",
- "query_file": "data/sift-128-euclidean/query.fbin",
+ "base_file": "sift-128-euclidean/base.fbin",
+ "query_file": "sift-128-euclidean/query.fbin",
+ "groundtruth_neighbors_file": "sift-128-euclidean/groundtruth.neighbors.ibin",
 "distance": "euclidean"
 },
+
 "search_basic_param": {
 "batch_size": 5000,
- "k": 10,
- "run_count": 3
+ "k": 10
 },
+
 "index": [
 {
- "name" : "hnswlib.M12",
- "algo" : "hnswlib",
+ "name": "hnswlib.M12",
+ "algo": "hnswlib",
 "build_param": {"M":12, "efConstruction":500, "numThreads":32},
- "file" : "index/sift-128-euclidean/hnswlib/M12",
- "search_params" : [
+ "file": "sift-128-euclidean/hnswlib/M12",
+ "search_params": [
 {"ef":10, "numThreads":1},
 {"ef":20, "numThreads":1},
 {"ef":40, "numThreads":1},
@@ -27,15 +29,14 @@
 {"ef":400, "numThreads":1},
 {"ef":600, "numThreads":1},
 {"ef":800, "numThreads":1}
- ],
- "search_result_file" : "result/sift-128-euclidean/hnswlib/M12"
+ ]
 },
 {
- "name" : "hnswlib.M16", - "algo" : "hnswlib", + "name": "hnswlib.M16", + "algo": "hnswlib", "build_param": {"M":16, "efConstruction":500, "numThreads":32}, - "file" : "index/sift-128-euclidean/hnswlib/M16", - "search_params" : [ + "file": "sift-128-euclidean/hnswlib/M16", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -46,15 +47,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/sift-128-euclidean/hnswlib/M16" + ] }, { - "name" : "hnswlib.M24", - "algo" : "hnswlib", + "name": "hnswlib.M24", + "algo": "hnswlib", "build_param": {"M":24, "efConstruction":500, "numThreads":32}, - "file" : "index/sift-128-euclidean/hnswlib/M24", - "search_params" : [ + "file": "sift-128-euclidean/hnswlib/M24", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -65,15 +65,14 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/sift-128-euclidean/hnswlib/M24" + ] }, { - "name" : "hnswlib.M36", - "algo" : "hnswlib", + "name": "hnswlib.M36", + "algo": "hnswlib", "build_param": {"M":36, "efConstruction":500, "numThreads":32}, - "file" : "index/sift-128-euclidean/hnswlib/M36", - "search_params" : [ + "file": "sift-128-euclidean/hnswlib/M36", + "search_params": [ {"ef":10, "numThreads":1}, {"ef":20, "numThreads":1}, {"ef":40, "numThreads":1}, @@ -84,235 +83,109 @@ {"ef":400, "numThreads":1}, {"ef":600, "numThreads":1}, {"ef":800, "numThreads":1} - ], - "search_result_file" : "result/sift-128-euclidean/hnswlib/M36" + ] }, - - - - { "name": "raft_bfknn", "algo": "raft_bfknn", - "dataset_memtype": "device", "build_param": {}, - "file": "index/sift-128-euclidean/raft_bfknn/bfknn", - "search_params": [ - { - "probe": 1 - } - ], - "search_result_file": "result/sift-128-euclidean/raft_bfknn/bfknn" + "file": 
"sift-128-euclidean/raft_bfknn/bfknn", + "search_params": [{"probe": 1}] }, { "name": "faiss_ivf_flat.nlist1024", "algo": "faiss_gpu_ivf_flat", - "build_param": { - "nlist": 1024 - }, - "file": "index/sift-128-euclidean/faiss_ivf_flat/nlist1024", + "build_param": {"nlist": 1024}, + "file": "sift-128-euclidean/faiss_ivf_flat/nlist1024", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_flat/nlist1024" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_flat.nlist2048", "algo": "faiss_gpu_ivf_flat", - "build_param": { - "nlist": 2048 - }, - "file": "index/sift-128-euclidean/faiss_ivf_flat/nlist2048", + "build_param": {"nlist": 2048}, + "file": "sift-128-euclidean/faiss_ivf_flat/nlist2048", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_flat/nlist2048" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_flat.nlist4096", "algo": "faiss_gpu_ivf_flat", - "build_param": { - "nlist": 4096 - }, - "file": "index/sift-128-euclidean/faiss_ivf_flat/nlist4096", + "build_param": {"nlist": 4096}, + "file": "sift-128-euclidean/faiss_ivf_flat/nlist4096", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": 
"result/sift-128-euclidean/faiss_ivf_flat/nlist4096" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_flat.nlist8192", "algo": "faiss_gpu_ivf_flat", - "build_param": { - "nlist": 8192 - }, - "file": "index/sift-128-euclidean/faiss_ivf_flat/nlist8192", + "build_param": {"nlist": 8192}, + "file": "sift-128-euclidean/faiss_ivf_flat/nlist8192", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_flat/nlist8192" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_flat.nlist16384", "algo": "faiss_gpu_ivf_flat", - "build_param": { - "nlist": 16384 - }, - "file": "index/sift-128-euclidean/faiss_ivf_flat/nlist16384", + "build_param": {"nlist": 16384}, + "file": "sift-128-euclidean/faiss_ivf_flat/nlist16384", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - }, - { - "nprobe": 2000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_flat/nlist16384" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000}, + {"nprobe": 2000} + ] }, { "name": "faiss_ivf_pq.M64-nlist1024", "algo": "faiss_gpu_ivf_pq", - "build_param": { - "nlist": 1024, - "M": 64, - "useFloat16": true, - "usePrecomputed": true - }, - "file": "index/sift-128-euclidean/faiss_ivf_pq/M64-nlist1024", + "build_param": {"nlist": 1024, "M": 64, "useFloat16": true, "usePrecomputed": true}, + 
"file": "sift-128-euclidean/faiss_ivf_pq/M64-nlist1024", "search_params": [ - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_pq/M64-nlist1024" + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_pq.M64-nlist1024.noprecomp", @@ -323,1044 +196,303 @@ "useFloat16": true, "usePrecomputed": false }, - "file": "index/sift-128-euclidean/faiss_ivf_pq/M64-nlist1024.noprecomp", + "file": "sift-128-euclidean/faiss_ivf_pq/M64-nlist1024.noprecomp", "search_params": [ - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_pq/M64-nlist1024" + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_sq.nlist1024-fp16", "algo": "faiss_gpu_ivf_sq", - "build_param": { - "nlist": 1024, - "quantizer_type": "fp16" - }, - "file": "index/sift-128-euclidean/faiss_ivf_sq/nlist1024-fp16", + "build_param": {"nlist": 1024, "quantizer_type": "fp16"}, + "file": "sift-128-euclidean/faiss_ivf_sq/nlist1024-fp16", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_sq/nlist1024-fp16" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_sq.nlist2048-fp16", "algo": "faiss_gpu_ivf_sq", - "build_param": { - "nlist": 2048, - "quantizer_type": "fp16" - }, - "file": 
"index/sift-128-euclidean/faiss_ivf_sq/nlist2048-fp16", + "build_param": {"nlist": 2048, "quantizer_type": "fp16"}, + "file": "sift-128-euclidean/faiss_ivf_sq/nlist2048-fp16", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_sq/nlist2048-fp16" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_sq.nlist4096-fp16", "algo": "faiss_gpu_ivf_sq", - "build_param": { - "nlist": 4096, - "quantizer_type": "fp16" - }, - "file": "index/sift-128-euclidean/faiss_ivf_sq/nlist4096-fp16", + "build_param": {"nlist": 4096, "quantizer_type": "fp16"}, + "file": "sift-128-euclidean/faiss_ivf_sq/nlist4096-fp16", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_sq/nlist4096-fp16" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_sq.nlist8192-fp16", "algo": "faiss_gpu_ivf_sq", - "build_param": { - "nlist": 8192, - "quantizer_type": "fp16" - }, - "file": "index/sift-128-euclidean/faiss_ivf_sq/nlist8192-fp16", + "build_param": {"nlist": 8192, "quantizer_type": "fp16"}, + "file": "sift-128-euclidean/faiss_ivf_sq/nlist8192-fp16", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": 
"result/sift-128-euclidean/faiss_ivf_sq/nlist8192-fp16" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_sq.nlist16384-fp16", "algo": "faiss_gpu_ivf_sq", - "build_param": { - "nlist": 16384, - "quantizer_type": "fp16" - }, - "file": "index/sift-128-euclidean/faiss_ivf_sq/nlist16384-fp16", + "build_param": {"nlist": 16384, "quantizer_type": "fp16"}, + "file": "sift-128-euclidean/faiss_ivf_sq/nlist16384-fp16", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - }, - { - "nprobe": 2000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_sq/nlist16384-fp16" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000}, + {"nprobe": 2000} + ] }, { "name": "faiss_ivf_sq.nlist1024-int8", "algo": "faiss_gpu_ivf_sq", - "build_param": { - "nlist": 1024, - "quantizer_type": "int8" - }, - "file": "index/sift-128-euclidean/faiss_ivf_sq/nlist1024-int8", + "build_param": {"nlist": 1024, "quantizer_type": "int8"}, + "file": "sift-128-euclidean/faiss_ivf_sq/nlist1024-int8", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_sq/nlist1024-int8" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_sq.nlist2048-int8", "algo": "faiss_gpu_ivf_sq", - "build_param": { - "nlist": 2048, - "quantizer_type": "int8" - }, - "file": 
"index/sift-128-euclidean/faiss_ivf_sq/nlist2048-int8", + "build_param": {"nlist": 2048,"quantizer_type": "int8"}, + "file": "sift-128-euclidean/faiss_ivf_sq/nlist2048-int8", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_sq/nlist2048-int8" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_sq.nlist4096-int8", "algo": "faiss_gpu_ivf_sq", - "build_param": { - "nlist": 4096, - "quantizer_type": "int8" - }, - "file": "index/sift-128-euclidean/faiss_ivf_sq/nlist4096-int8", + "build_param": {"nlist": 4096, "quantizer_type": "int8"}, + "file": "sift-128-euclidean/faiss_ivf_sq/nlist4096-int8", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_sq/nlist4096-int8" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_sq.nlist8192-int8", "algo": "faiss_gpu_ivf_sq", - "build_param": { - "nlist": 8192, - "quantizer_type": "int8" - }, - "file": "index/sift-128-euclidean/faiss_ivf_sq/nlist8192-int8", + "build_param": {"nlist": 8192, "quantizer_type": "int8"}, + "file": "sift-128-euclidean/faiss_ivf_sq/nlist8192-int8", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": 
"result/sift-128-euclidean/faiss_ivf_sq/nlist8192-int8" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "faiss_ivf_sq.nlist16384-int8", "algo": "faiss_gpu_ivf_sq", - "build_param": { - "nlist": 16384, - "quantizer_type": "int8" - }, - "file": "index/sift-128-euclidean/faiss_ivf_sq/nlist16384-int8", + "build_param": {"nlist": 16384, "quantizer_type": "int8"}, + "file": "sift-128-euclidean/faiss_ivf_sq/nlist16384-int8", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - }, - { - "nprobe": 2000 - } - ], - "search_result_file": "result/sift-128-euclidean/faiss_ivf_sq/nlist16384-int8" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000}, + {"nprobe": 2000} + ] }, { "name": "faiss_flat", "algo": "faiss_gpu_flat", "build_param": {}, - "file": "index/sift-128-euclidean/faiss_flat/flat", - "search_params": [ - {} - ], - "search_result_file": "result/sift-128-euclidean/faiss_flat/flat" - }, - - { - "name": "raft_ivf_pq.dimpq128-cluster1024", - "algo": "raft_ivf_pq", - "dataset_memtype": "device", - "build_param": { - "nlist": 1024, - "pq_dim": 128, - "ratio": 1, - "niter": 25 - }, - "file": "index/sift-128-euclidean/raft_ivf_pq/dimpq128-cluster1024", - "search_params": [ - { - "k": 10, - "numProbes": 10, - "internalDistanceDtype": "half", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 50, - "internalDistanceDtype": "half", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 100, - "internalDistanceDtype": "half", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 200, - "internalDistanceDtype": "half", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 500, - 
"internalDistanceDtype": "half", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 1024, - "internalDistanceDtype": "half", - "smemLutDtype": "half" - } - ], - "search_result_file": "result/sift-128-euclidean/raft_ivf_pq/dimpq128-cluster1024" + "file": "sift-128-euclidean/faiss_flat/flat", + "search_params": [{}] }, { - "name": "raft_ivf_pq.dimpq128-cluster1024-float-float", + "name": "raft_ivf_pq.dimpq64-bitpq8-cluster1K", "algo": "raft_ivf_pq", - "dataset_memtype": "device", - "build_param": { - "nlist": 1024, - "pq_dim": 128, - "ratio": 1, - "niter": 25 - }, - "file": "index/sift-128-euclidean/raft_ivf_pq/dimpq128-cluster1024-float-float", + "build_param": {"niter": 25, "nlist": 1000, "pq_dim": 64, "pq_bits": 8, "ratio": 1}, + "file": "sift-128-euclidean/raft_ivf_pq/dimpq64-bitpq8-cluster1K", "search_params": [ - { - "k": 10, - "numProbes": 1, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 1, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 5, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 10, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 50, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 100, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 200, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 500, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 1024, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - } - ], - "search_result_file": "result/sift-128-euclidean/raft_ivf_pq/dimpq128-cluster1024-float-float" - }, - { - "name": "raft_ivf_pq.dimpq128-cluster1024-float-half", + { "nprobe": 20, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { 
"nprobe": 30, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 40, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 50, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 100, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 200, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 500, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 1000, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 20, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 30, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 40, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 50, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 100, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 200, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 500, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 1000, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 20, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 30, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 40, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 50, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 100, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 200, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 500, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 1000, "internalDistanceDtype": "half", "smemLutDtype": "half" } + ] + }, + { + "name": "raft_ivf_pq.dimpq128-bitpq6-cluster1K", "algo": "raft_ivf_pq", - "dataset_memtype": "device", - "build_param": { - "nlist": 1024, - "pq_dim": 128, - "ratio": 1, - "niter": 25 - }, 
- "file": "index/sift-128-euclidean/raft_ivf_pq/dimpq128-cluster1024-float-half", + "build_param": {"niter": 25, "nlist": 1000, "pq_dim": 128, "pq_bits": 6, "ratio": 1}, + "file": "sift-128-euclidean/raft_ivf_pq/dimpq128-bitpq6-cluster1K", "search_params": [ - { - "k": 10, - "numProbes": 10, - "internalDistanceDtype": "float", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 50, - "internalDistanceDtype": "float", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 100, - "internalDistanceDtype": "float", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 200, - "internalDistanceDtype": "float", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 500, - "internalDistanceDtype": "float", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 1024, - "internalDistanceDtype": "float", - "smemLutDtype": "half" - } - ], - "search_result_file": "result/sift-128-euclidean/raft_ivf_pq/dimpq128-cluster1024-float-half" - }, - { - "name": "raft_ivf_pq.dimpq128-cluster1024-float-fp8", - "algo": "raft_ivf_pq", - "dataset_memtype": "device", - "build_param": { - "nlist": 1024, - "pq_dim": 128, - "ratio": 1, - "niter": 25 - }, - "file": "index/sift-128-euclidean/raft_ivf_pq/dimpq128-cluster1024-float-fp8", - "search_params": [ - { - "k": 10, - "numProbes": 10, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 50, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 100, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 200, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 500, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 1024, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - } - ], - "search_result_file": "result/sift-128-euclidean/raft_ivf_pq/dimpq128-cluster1024-float-fp8" - }, - { - "name": 
"raft_ivf_pq.dimpq64-cluster1024-float-fp8", - "algo": "raft_ivf_pq", - "dataset_memtype": "device", - "build_param": { - "nlist": 1024, - "pq_dim": 64, - "ratio": 1, - "niter": 25 - }, - "file": "index/sift-128-euclidean/raft_ivf_pq/dimpq64-cluster1024-float-fp8", - "search_params": [ - { - "k": 10, - "numProbes": 10, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 50, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 100, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 200, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 500, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 1024, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - } - ], - "search_result_file": "result/sift-128-euclidean/raft_ivf_pq/dimpq64-cluster1024-float-fp8" - }, - { - "name": "raft_ivf_pq.dimpq64-cluster1024-float-half", - "algo": "raft_ivf_pq", - "dataset_memtype": "device", - "build_param": { - "nlist": 1024, - "pq_dim": 64, - "ratio": 1, - "niter": 25 - }, - "file": "index/sift-128-euclidean/raft_ivf_pq/dimpq64-cluster1024-float-half", - "search_params": [ - { - "k": 10, - "numProbes": 10, - "internalDistanceDtype": "float", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 50, - "internalDistanceDtype": "float", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 100, - "internalDistanceDtype": "float", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 200, - "internalDistanceDtype": "float", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 500, - "internalDistanceDtype": "float", - "smemLutDtype": "half" - }, - { - "k": 10, - "numProbes": 1024, - "internalDistanceDtype": "float", - "smemLutDtype": "half" - } - ], - "search_result_file": "result/sift-128-euclidean/raft_ivf_pq/dimpq64-cluster1024-float-half" - 
}, - { - "name": "raft_ivf_pq.dimpq32-cluster1024-float-fp8", - "algo": "raft_ivf_pq", - "dataset_memtype": "device", - "build_param": { - "nlist": 1024, - "pq_dim": 32, - "ratio": 1, - "niter": 25 - }, - "file": "index/sift-128-euclidean/raft_ivf_pq/dimpq32-cluster1024-float-fp8", - "search_params": [ - { - "k": 10, - "numProbes": 10, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 50, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 100, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 200, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 500, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 1024, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - } - ], - "search_result_file": "result/sift-128-euclidean/raft_ivf_pq/dimpq32-cluster1024-float-fp8" - }, - { - "name": "raft_ivf_pq.dimpq16-cluster1024-float-fp8", - "algo": "raft_ivf_pq", - "dataset_memtype": "device", - "build_param": { - "nlist": 1024, - "pq_dim": 16, - "ratio": 1, - "niter": 25 - }, - "file": "index/sift-128-euclidean/raft_ivf_pq/dimpq16-cluster1024-float-fp8", - "search_params": [ - { - "k": 10, - "numProbes": 10, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 50, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 100, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 200, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 500, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - }, - { - "k": 10, - "numProbes": 1024, - "internalDistanceDtype": "float", - "smemLutDtype": "fp8" - } - ], - "search_result_file": 
"result/sift-128-euclidean/raft_ivf_pq/dimpq16-cluster1024-float-fp8" - }, - { - "name": "raft_ivf_pq.dimpq128-cluster1024-half-float", - "algo": "raft_ivf_pq", - "dataset_memtype": "device", - "build_param": { - "nlist": 1024, - "pq_dim": 128, - "ratio": 1, - "niter": 25 - }, - "file": "index/sift-128-euclidean/raft_ivf_pq/dimpq128-cluster1024-half-float", - "search_params": [ - { - "k": 10, - "numProbes": 10, - "internalDistanceDtype": "half", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 50, - "internalDistanceDtype": "half", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 100, - "internalDistanceDtype": "half", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 200, - "internalDistanceDtype": "half", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 500, - "internalDistanceDtype": "half", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 1024, - "internalDistanceDtype": "half", - "smemLutDtype": "float" - } - ], - "search_result_file": "result/sift-128-euclidean/raft_ivf_pq/dimpq128-cluster1024-half-float" - }, - { - "name": "raft_ivf_pq.dimpq512-cluster1024-float-float", - "algo": "raft_ivf_pq", - "dataset_memtype": "device", - "build_param": { - "nlist": 1024, - "pq_dim": 512, - "ratio": 1, - "niter": 25 - }, - "file": "index/sift-128-euclidean/raft_ivf_pq/dimpq512-cluster1024-float-float", - "search_params": [ - { - "k": 10, - "numProbes": 10, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 50, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 100, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 200, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 500, - "internalDistanceDtype": "float", - "smemLutDtype": "float" - }, - { - "k": 10, - "numProbes": 1024, - "internalDistanceDtype": "float", - "smemLutDtype": 
"float" - } - ], - "search_result_file": "result/sift-128-euclidean/raft_ivf_pq/dimpq512-cluster1024-float-float" + { "nprobe": 20, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 30, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 40, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 50, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 100, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 200, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 500, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 1000, "internalDistanceDtype": "float", "smemLutDtype": "float" }, + { "nprobe": 20, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 30, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 40, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 50, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 100, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 200, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 500, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 1000, "internalDistanceDtype": "float", "smemLutDtype": "fp8" }, + { "nprobe": 20, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 30, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 40, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 50, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 100, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 200, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 500, "internalDistanceDtype": "half", "smemLutDtype": "half" }, + { "nprobe": 1000, "internalDistanceDtype": "half", "smemLutDtype": "half" } + ] }, 
{ "name": "raft_ivf_flat.nlist1024", "algo": "raft_ivf_flat", - "dataset_memtype": "device", - "build_param": { - "nlist": 1024, - "ratio": 1, - "niter": 25 - }, - "file": "index/sift-128-euclidean/raft_ivf_flat/nlist1024", + "build_param": {"nlist": 1024, "ratio": 1, "niter": 25}, + "file": "sift-128-euclidean/raft_ivf_flat/nlist1024", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - } - ], - "search_result_file": "result/sift-128-euclidean/raft_ivf_flat/nlist1024" + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000} + ] }, { "name": "raft_ivf_flat.nlist16384", "algo": "raft_ivf_flat", - "dataset_memtype": "device", - "build_param": { - "nlist": 16384, - "ratio": 2, - "niter": 20 - }, - "file": "index/sift-128-euclidean/raft_ivf_flat/nlist16384", + "build_param": {"nlist": 16384, "ratio": 2, "niter": 20}, + "file": "sift-128-euclidean/raft_ivf_flat/nlist16384", + "search_params": [ + {"nprobe": 1}, + {"nprobe": 5}, + {"nprobe": 10}, + {"nprobe": 50}, + {"nprobe": 100}, + {"nprobe": 200}, + {"nprobe": 500}, + {"nprobe": 1000}, + {"nprobe": 2000} + ] + }, + { + "name": "raft_cagra.dim32", + "algo": "raft_cagra", + "build_param": {"index_dim": 32}, + "file": "sift-128-euclidean/raft_cagra/dim32", "search_params": [ - { - "nprobe": 1 - }, - { - "nprobe": 5 - }, - { - "nprobe": 10 - }, - { - "nprobe": 50 - }, - { - "nprobe": 100 - }, - { - "nprobe": 200 - }, - { - "nprobe": 500 - }, - { - "nprobe": 1000 - }, - { - "nprobe": 2000 - } - ], - "search_result_file": "result/sift-128-euclidean/raft_ivf_flat/nlist16384" - }, - - { - "name" : "raft_cagra.dim32", - "algo" : "raft_cagra", - "dataset_memtype": "device", - "build_param": { - "index_dim" : 32 - }, - "file" : "index/sift-128-euclidean/raft_cagra/dim32", - 
"search_params" : [ {"itopk": 32}, {"itopk": 64}, {"itopk": 128} - ], - "search_result_file" : "result/sift-128-euclidean/raft_cagra/dim32" + ] }, - { - "name" : "raft_cagra.dim64", - "algo" : "raft_cagra", - "dataset_memtype": "device", - "build_param": { - "index_dim" : 64 - }, - "file" : "index/sift-128-euclidean/raft_cagra/dim64", - "search_params" : [ + "name": "raft_cagra.dim64", + "algo": "raft_cagra", + "build_param": {"index_dim": 64}, + "file": "sift-128-euclidean/raft_cagra/dim64", + "search_params": [ {"itopk": 32}, {"itopk": 64}, {"itopk": 128} - ], - "search_result_file" : "result/sift-128-euclidean/raft_cagra/dim64" + ] } ] } diff --git a/bench/ann/data_export.py b/bench/ann/data_export.py index 9410cfe773..87ca330ed9 100644 --- a/bench/ann/data_export.py +++ b/bench/ann/data_export.py @@ -16,22 +16,35 @@ import argparse import os import subprocess +import json +from pathlib import Path + +def parse_filepaths(fs): + for p in fs: + if p.endswith(".json") and os.path.exists(p): + yield p + else: + for f in Path(p).rglob('*.json'): + yield f.as_posix() def export_results(output_filepath, recompute, groundtruth_filepath, result_filepath): print(f"Writing output file to: {output_filepath}") - ann_bench_scripts_dir = os.path.join(os.getenv("RAFT_HOME"), - "cpp/bench/ann/scripts") - ann_bench_scripts_path = os.path.join(ann_bench_scripts_dir, - "eval.pl") - if recompute: - p = subprocess.Popen([ann_bench_scripts_path, "-f", "-o", output_filepath, - groundtruth_filepath, result_filepath]) - else: - p = subprocess.Popen([ann_bench_scripts_path, "-o", output_filepath, - groundtruth_filepath, result_filepath]) - p.wait() + + parsed_filepaths = parse_filepaths(result_filepaths) + + with open(output_filepath, 'w') as out: + out.write("Algo,Recall,QPS\n") + + for fp in parsed_filepaths: + with open(fp, 'r') as f: + data = json.load(f) + for benchmark_case in data["benchmarks"]: + algo = benchmark_case["name"] + recall = benchmark_case["Recall"] + qps = 
benchmark_case["items_per_second"] + out.write(f"{algo},{recall},{qps}\n") def main(): @@ -47,14 +60,17 @@ def main(): parser.add_argument( "--dataset-path", help="path to dataset folder", - default=os.path.join(os.getenv("RAFT_HOME"), + default=os.path.join(os.getenv("RAFT_HOME"), "bench", "ann", "data") ) - - args = parser.parse_args() - result_filepath = os.path.join(args.dataset_path, args.dataset, "result") - groundtruth_filepath = os.path.join(args.dataset_path, args.dataset, + args, result_filepaths = parser.parse_known_args() + + # if nothing is provided + if len(result_filepaths) == 0: + raise ValueError("No filepaths to results were provided") + + groundtruth_filepath = os.path.join(args.dataset_path, args.dataset, "groundtruth.neighbors.ibin") export_results(args.output, args.recompute, groundtruth_filepath, result_filepath) diff --git a/bench/ann/run.py b/bench/ann/run.py index d8e33f1113..47f4d382d4 100644 --- a/bench/ann/run.py +++ b/bench/ann/run.py @@ -48,29 +48,31 @@ def run_build_and_search(conf_filename, conf_file, executables_to_run, temp_conf = dict() temp_conf["dataset"] = conf_file["dataset"] temp_conf["search_basic_param"] = conf_file["search_basic_param"] - temp_conf["index"] = executables_to_run[(executable, + temp_conf["index"] = executables_to_run[(executable, ann_executable_path)]["index"] json.dump(temp_conf, f) if build: if force: - p = subprocess.Popen([ann_executable_path, "-b", "-f", + p = subprocess.Popen([ann_executable_path, "--build", "--overwrite", temp_conf_filepath]) p.wait() else: - p = subprocess.Popen([ann_executable_path, "-b", + p = subprocess.Popen([ann_executable_path, "--build", temp_conf_filepath]) p.wait() if search: - if force: - p = subprocess.Popen([ann_executable_path, "-s", "-f", - temp_conf_filepath]) - p.wait() - else: - p = subprocess.Popen([ann_executable_path, "-s", - temp_conf_filepath]) - p.wait() + legacy_result_folder = "result/" + temp_conf["dataset"]["name"] + os.makedirs(legacy_result_folder, 
exist_ok=True) + p = subprocess.Popen([ + ann_executable_path, + "--search", + "--benchmark_counters_tabular", + "--benchmark_out_format=json", + f"--benchmark_out={legacy_result_folder}/{executable}.json", + temp_conf_filepath]) + p.wait() os.remove(temp_conf_filepath) @@ -95,7 +97,7 @@ def main(): parser.add_argument( "--dataset-path", help="path to dataset folder", - default=os.path.join(os.getenv("RAFT_HOME"), + default=os.path.join(os.getenv("RAFT_HOME"), "bench", "ann", "data") ) parser.add_argument( @@ -138,7 +140,7 @@ def main(): conf_filename = conf_filepath.split("/")[-1] conf_filedir = "/".join(conf_filepath.split("/")[:-1]) dataset_name = conf_filename.replace(".json", "") - dataset_path = os.path.join(args.dataset_path, dataset_name) + dataset_path = os.path.realpath(os.path.join(args.dataset_path, dataset_name)) if not os.path.exists(conf_filepath): raise FileNotFoundError(conf_filename) @@ -146,14 +148,9 @@ def main(): conf_file = json.load(f) # Replace base, query to dataset-path - replacement_base_filepath = \ - os.path.normpath(conf_file["dataset"]["base_file"]).split(os.path.sep)[-1] - conf_file["dataset"]["base_file"] = \ - os.path.join(dataset_path, replacement_base_filepath) - replacement_query_filepath = \ - os.path.normpath(conf_file["dataset"]["query_file"]).split(os.path.sep)[-1] - conf_file["dataset"]["query_file"] = \ - os.path.join(dataset_path, replacement_query_filepath) + conf_file["dataset"]["base_file"] = os.path.join(dataset_path, "base.fbin") + conf_file["dataset"]["query_file"] = os.path.join(dataset_path, "query.fbin") + conf_file["dataset"]["groundtruth_neighbors_file"] = os.path.join(dataset_path, "groundtruth.neighbors.ibin") # Ensure base and query files exist for dataset if not os.path.exists(conf_file["dataset"]["base_file"]): raise FileNotFoundError(conf_file["dataset"]["base_file"]) diff --git a/cpp/CMakeLists.txt b/cpp/CMakeLists.txt index ae2660509a..eb92d4e7b5 100644 --- a/cpp/CMakeLists.txt +++ b/cpp/CMakeLists.txt 
@@ -166,7 +166,7 @@ if(BUILD_TESTS) include(cmake/thirdparty/get_gtest.cmake) endif() -if(BUILD_PRIMS_BENCH) +if(BUILD_PRIMS_BENCH OR BUILD_ANN_BENCH) include(${rapids-cmake-dir}/cpm/gbench.cmake) rapids_cpm_gbench() endif() diff --git a/cpp/bench/ann/CMakeLists.txt b/cpp/bench/ann/CMakeLists.txt index 6977d77684..119a5c0a73 100644 --- a/cpp/bench/ann/CMakeLists.txt +++ b/cpp/bench/ann/CMakeLists.txt @@ -23,6 +23,9 @@ option(RAFT_ANN_BENCH_USE_RAFT_IVF_PQ "Include raft's ivf pq algorithm in benchm option(RAFT_ANN_BENCH_USE_RAFT_CAGRA "Include raft's CAGRA in benchmark" ON) option(RAFT_ANN_BENCH_USE_HNSWLIB "Include hnsw algorithm in benchmark" ON) option(RAFT_ANN_BENCH_USE_GGNN "Include ggnn algorithm in benchmark" ON) +option(RAFT_ANN_BENCH_SINGLE_EXE + "Make a single executable with benchmark as shared library modules" OFF +) find_package(Threads REQUIRED) @@ -77,10 +80,17 @@ function(ConfigureAnnBench) set(BENCH_NAME ${ConfigureAnnBench_NAME}_ANN_BENCH) - add_executable( - ${BENCH_NAME} ${ConfigureAnnBench_PATH} bench/ann/src/common/conf.cpp - bench/ann/src/common/util.cpp - ) + if(RAFT_ANN_BENCH_SINGLE_EXE) + add_library(${BENCH_NAME} SHARED ${ConfigureAnnBench_PATH}) + string(TOLOWER ${BENCH_NAME} BENCH_LIB_NAME) + set_target_properties(${BENCH_NAME} PROPERTIES OUTPUT_NAME ${BENCH_LIB_NAME}) + add_dependencies(${BENCH_NAME} ANN_BENCH) + else() + add_executable(${BENCH_NAME} ${ConfigureAnnBench_PATH}) + target_compile_definitions(${BENCH_NAME} PRIVATE ANN_BENCH_BUILD_MAIN) + target_link_libraries(${BENCH_NAME} PRIVATE benchmark::benchmark) + endif() + target_link_libraries( ${BENCH_NAME} PRIVATE raft::raft @@ -91,18 +101,21 @@ function(ConfigureAnnBench) ${RAFT_CTK_MATH_DEPENDENCIES} $ $ + -static-libgcc + -static-libstdc++ ) set_target_properties( ${BENCH_NAME} PROPERTIES # set target compile options - INSTALL_RPATH "\$ORIGIN/../../../lib" CXX_STANDARD 17 CXX_STANDARD_REQUIRED ON CUDA_STANDARD 17 CUDA_STANDARD_REQUIRED ON POSITION_INDEPENDENT_CODE ON 
INTERFACE_POSITION_INDEPENDENT_CODE ON + BUILD_RPATH "\$ORIGIN" + INSTALL_RPATH "\$ORIGIN" ) set(${ConfigureAnnBench_CXXFLAGS} ${RAFT_CXX_FLAGS} ${ConfigureAnnBench_CXXFLAGS}) @@ -200,3 +213,57 @@ if(RAFT_ANN_BENCH_USE_GGNN) ${CMAKE_CURRENT_BINARY_DIR}/_deps/ggnn-src/include LINKS glog::glog ) endif() + +# ################################################################################################## +# * Dynamically-loading ANN_BENCH executable ------------------------------------------------------- +if(RAFT_ANN_BENCH_SINGLE_EXE) + add_executable(ANN_BENCH bench/ann/src/common/benchmark.cpp) + + # Build and link static version of the GBench to keep ANN_BENCH self-contained. + get_target_property(TMP_PROP benchmark::benchmark SOURCES) + add_library(benchmark_static STATIC ${TMP_PROP}) + get_target_property(TMP_PROP benchmark::benchmark INCLUDE_DIRECTORIES) + target_include_directories(benchmark_static PUBLIC ${TMP_PROP}) + get_target_property(TMP_PROP benchmark::benchmark LINK_LIBRARIES) + target_link_libraries(benchmark_static PUBLIC ${TMP_PROP}) + + target_include_directories(ANN_BENCH PRIVATE ${CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES}) + + target_link_libraries( + ANN_BENCH PRIVATE nlohmann_json::nlohmann_json benchmark_static dl -static-libgcc + -static-libstdc++ CUDA::nvtx3 + ) + set_target_properties( + ANN_BENCH + PROPERTIES # set target compile options + CXX_STANDARD 17 + CXX_STANDARD_REQUIRED ON + CUDA_STANDARD 17 + CUDA_STANDARD_REQUIRED ON + POSITION_INDEPENDENT_CODE ON + INTERFACE_POSITION_INDEPENDENT_CODE ON + BUILD_RPATH "\$ORIGIN" + INSTALL_RPATH "\$ORIGIN" + ) + + # Disable NVTX when the nvtx3 headers are missing + set(_CMAKE_REQUIRED_INCLUDES_ORIG ${CMAKE_REQUIRED_INCLUDES}) + get_target_property(CMAKE_REQUIRED_INCLUDES ANN_BENCH INCLUDE_DIRECTORIES) + CHECK_INCLUDE_FILE_CXX(nvtx3/nvToolsExt.h NVTX3_HEADERS_FOUND) + set(CMAKE_REQUIRED_INCLUDES ${_CMAKE_REQUIRED_INCLUDES_ORIG}) + target_compile_definitions( + ANN_BENCH + PRIVATE + 
$<$:ANN_BENCH_LINK_CUDART="libcudart.so.${CUDAToolkit_VERSION_MAJOR}"> + $<$:ANN_BENCH_NVTX3_HEADERS_FOUND> + ) + + target_link_options(ANN_BENCH PRIVATE -export-dynamic) + + install( + TARGETS ANN_BENCH + COMPONENT ann_bench + DESTINATION bin/ann + EXCLUDE_FROM_ALL + ) +endif() diff --git a/cpp/bench/ann/scripts/eval.pl b/cpp/bench/ann/scripts/eval.pl deleted file mode 100755 index 81c5563d79..0000000000 --- a/cpp/bench/ann/scripts/eval.pl +++ /dev/null @@ -1,430 +0,0 @@ -#!/usr/bin/perl - -# ============================================================================= -# Copyright (c) 2020-2023, NVIDIA CORPORATION. -# -# Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except -# in compliance with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software distributed under the License -# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express -# or implied. See the License for the specific language governing permissions and limitations under -# the License. - -use warnings; -use strict; -use autodie qw(open close); -use File::Find; -use Getopt::Std; - -my $QPS = 'QPS'; -my $AVG_LATENCY = 'avg_latency(ms)'; -my $P99_LATENCY = 'p99_latency(ms)'; -my $P999_LATENCY = 'p999_latency(ms)'; -my @CONDITIONS = ([$QPS, 2000], ['recall', 0.9], ['recall', 0.95]); - - -my $USAGE = << 'END'; -usage: [-f] [-l avg|p99|p999] [-o output.csv] groundtruth.neighbors.ibin result_paths... - result_paths... are paths to the search result files. - Can specify multiple paths. - For each of them, if it's a directory, all the .txt files found under - it recursively will be regarded as inputs. - - -f: force to recompute recall and update it in result file if needed - -l: output search latency rather than QPS. 
Available options: - "avg" for average latency; - "p99" for 99th percentile latency; - "p999" for 99.9th percentile latency. - -o: also write result to a csv file -END - - -my %opt; -getopts('fl:o:', \%opt) - or die $USAGE; -my $force_calc_recall = exists $opt{f} ? 1 : 0; -my $csv_file; -$csv_file = $opt{o} if exists $opt{o}; -my $metric = $QPS; -if (exists $opt{l}) { - my $option = $opt{l}; - if ($option eq 'avg') { - $metric = $AVG_LATENCY; - } - elsif ($option eq 'p99') { - $metric = $P99_LATENCY; - } - elsif ($option eq 'p999') { - $metric = $P999_LATENCY; - } - else { - die - "[error] illegal value for '-l': '$option'. Must be 'avg', 'p99' or 'p999'\n"; - } -} - -@ARGV >= 2 - or die $USAGE; - - -my $truth_file = shift @ARGV; -my ($k, $dataset, $distance, $results) = get_all_results($metric, @ARGV); -if (!defined $k) { - print STDERR "no result file found\n"; - exit -1; -} -print STDERR "dataset = $dataset, distance = $distance, k = $k\n\n"; -calc_missing_recall($results, $truth_file, $force_calc_recall); - -my @results = sort { - $a->{name} cmp $b->{name} - or $a->{recall} <=> $b->{recall} - or $b->{qps} <=> $a->{qps} -} @$results; -printf("%-60s %6s %16s %s\n", '', 'Recall', $metric, 'search_param'); -for my $result (@results) { - my $fmt = ($metric eq $QPS) ? 
'%16.1f' : '%16.3f'; - my $qps = $result->{qps}; - $qps *= 1000 if $metric ne $QPS; # the unit of latency is ms - printf("%-60s %6.4f ${fmt} %s\n", - $result->{name}, $result->{recall}, $qps, $result->{search_param}); -} -if (defined $csv_file) { - open my $fh, '>', $csv_file; - print {$fh} ",Recall,${metric},search_param\n"; - for my $result (@results) { - my $qps = $result->{qps}; - $qps *= 1000 if $metric ne $QPS; - printf {$fh} ( - "%s,%.4f,%.3f,%s\n", $result->{name}, $result->{recall}, - $qps, $result->{search_param} - ); - } -} -print "\n"; -calc_and_print_estimation($results, $metric, \@CONDITIONS); - - - - -sub read_result { - my ($fname) = @_; - open my $fh, '<', $fname; - my %attr; - while (<$fh>) { - chomp; - next if /^\s*$/; - my $pos = index($_, ':'); - $pos != -1 - or die "[error] no ':' is found: '$_'\n"; - my $key = substr($_, 0, $pos); - my $val = substr($_, $pos + 1); - $key =~ s/^\s+|\s+$//g; - $val =~ s/^\s+|\s+$//g; - - # old version benchmark compatible - if ($key eq 'search_time') { - $key = 'average_search_time'; - $val *= $attr{batch_size}; - } - $attr{$key} = $val; - } - return \%attr; -} - -sub overwrite_recall_to_result { - my ($fname, $recall) = @_; - open my $fh_in, '<', $fname; - $recall = sprintf("%f", $recall); - my $out; - while (<$fh_in>) { - s/^recall: .*/recall: $recall/; - $out .= $_; - } - close $fh_in; - - open my $fh_out, '>', $fname; - print {$fh_out} $out; -} - -sub append_recall_to_result { - my ($fname, $recall) = @_; - open my $fh, '>>', $fname; - printf {$fh} ("recall: %f\n", $recall); -} - -sub get_all_results { - my ($metric) = shift @_; - - my %fname; - my $wanted = sub { - if (-f && /\.txt$/) { - $fname{$File::Find::name} = 1; - } - }; - find($wanted, @_); - - my $k; - my $dataset; - my $distance; - my @results; - for my $f (sort keys %fname) { - print STDERR "reading $f ...\n"; - my $attr = read_result($f); - if (!defined $k) { - $k = $attr->{k}; - $dataset = $attr->{dataset}; - $distance = $attr->{distance}; - } 
- else { - $attr->{k} eq $k - or die "[error] k should be $k, but is $attr->{k} in $f\n"; - $attr->{dataset} eq $dataset - or die - "[error] dataset should be $dataset, but is $attr->{dataset} in $f\n"; - $attr->{distance} eq $distance - or die - "[error] distance should be $distance, but is $attr->{distance} in $f\n"; - } - - my $batch_size = $attr->{batch_size}; - $batch_size =~ s/000000$/M/; - $batch_size =~ s/000$/K/; - my $search_param = $attr->{search_param}; - $search_param =~ s/^{//; - $search_param =~ s/}$//; - $search_param =~ s/,/ /g; - $search_param =~ s/"//g; - - my $qps; - if ($metric eq $QPS) { - $qps = $attr->{batch_size} / $attr->{average_search_time}; - } - elsif ($metric eq $AVG_LATENCY) { - $qps = $attr->{average_search_time}; - } - elsif ($metric eq $P99_LATENCY) { - exists $attr->{p99_search_time} - or die "[error] p99_search_time is not found\n"; - $qps = $attr->{p99_search_time}; - } - elsif ($metric eq $P999_LATENCY) { - exists $attr->{p999_search_time} - or die "[error] p999_search_time is not found\n"; - $qps = $attr->{p999_search_time}; - } - else { - die "[error] unknown latency type: '$metric'\n"; - } - my $result = { - file => $f, - name => "$attr->{name}-batch${batch_size}", - search_param => $search_param, - qps => $qps, - }; - - if (exists $attr->{recall}) { - $result->{recall} = $attr->{recall}; - } - push @results, $result; - } - return $k, $dataset, $distance, \@results; -} - -sub read_ibin { - my ($fname) = @_; - - open my $fh, '<:raw', $fname; - my $raw; - - read($fh, $raw, 8); - my ($nrows, $dim) = unpack('LL', $raw); - - my $expected_size = 8 + $nrows * $dim * 4; - my $size = (stat($fh))[7]; - $size == $expected_size - or die( - "[error] expected size is $expected_size, but actual size is $size\n"); - - read($fh, $raw, $nrows * $dim * 4) == $nrows * $dim * 4 - or die "[error] read $fname failed\n"; - my @data = unpack('l' x ($nrows * $dim), $raw); - return \@data, $nrows, $dim; -} - -sub pick_k_neighbors { - my ($neighbors, 
$nrows, $ncols, $k) = @_; - - my @res; - for my $i (0 .. $nrows - 1) { - my %neighbor_set; - for my $j (0 .. $k - 1) { - $neighbor_set{$neighbors->[$i * $ncols + $j]} = 1; - } - push @res, \%neighbor_set; - } - return \@res; -} - - -sub calc_recall { - my ($truth_k_neighbors, $result_neighbors, $nrows, $k) = @_; - - my $recall = 0; - for my $i (0 .. $nrows - 1) { - my $tp = 0; - for my $j (0 .. $k - 1) { - my $neighbor = $result_neighbors->[$i * $k + $j]; - ++$tp if exists $truth_k_neighbors->[$i]{$neighbor}; - } - $recall += $tp; - } - return $recall / $k / $nrows; -} - -sub calc_missing_recall { - my ($results, $truth_file, $force_calc_recall) = @_; - - my $need_calc_recall = grep { !exists $_->{recall} } @$results; - return unless $need_calc_recall || $force_calc_recall; - - my ($truth_neighbors, $nrows, $truth_k) = read_ibin($truth_file); - $truth_k >= $k - or die "[error] ground truth k ($truth_k) < k($k)\n"; - my $truth_k_neighbors = - pick_k_neighbors($truth_neighbors, $nrows, $truth_k, $k); - - for my $result (@$results) { - next if exists $result->{recall} && !$force_calc_recall; - - my $result_bin_file = $result->{file}; - $result_bin_file =~ s/txt$/ibin/; - print STDERR "calculating recall for $result_bin_file ...\n"; - my ($result_neighbors, $result_nrows, $result_k) = - read_ibin($result_bin_file); - $result_k == $k - or die - "[error] k should be $k, but is $result_k in $result_bin_file\n"; - $result_nrows == $nrows - or die - "[error] #row should be $nrows, but is $result_nrows in $result_bin_file\n"; - - my $recall = - calc_recall($truth_k_neighbors, $result_neighbors, $nrows, $k); - if (exists $result->{recall}) { - my $new_value = sprintf("%f", $recall); - if ($result->{recall} ne $new_value) { - print "update recall: $result->{recall} -> $new_value\n"; - overwrite_recall_to_result($result->{file}, $recall); - } - } - else { - append_recall_to_result($result->{file}, $recall); - } - $result->{recall} = $recall; - } -} - - -sub estimate { - my 
($results, $condition, $value) = @_; - my %point_of; - for my $result (@$results) { - my $point; - if ($condition eq 'recall') { - $point = [$result->{recall}, $result->{qps}]; - } - else { - $point = [$result->{qps}, $result->{recall}]; - } - push @{$point_of{$result->{name}}}, $point; - } - - my @names = sort keys %point_of; - my @result; - for my $name (@names) { - my @points = sort { $a->[0] <=> $b->[0] } @{$point_of{$name}}; - if ($value < $points[0][0] || $value > $points[$#points][0]) { - push @result, -1; - next; - } - elsif ($value == $points[0][0]) { - push @result, $points[0][1]; - next; - } - - for my $i (1 .. $#points) { - if ($points[$i][0] >= $value) { - push @result, - linear_interpolation($value, @{$points[$i - 1]}, - @{$points[$i]}); - last; - } - } - } - return \@names, \@result; -} - -sub linear_interpolation { - my ($x, $x1, $y1, $x2, $y2) = @_; - return $y1 + ($x - $x1) * ($y2 - $y1) / ($x2 - $x1); -} - -sub merge { - my ($all, $new, $scale) = @_; - @$all == @$new - or die "[error] length is not equal\n"; - for my $i (0 .. @$all - 1) { - push @{$all->[$i]}, $new->[$i] * $scale; - } -} - -sub calc_and_print_estimation { - my ($results, $metric, $conditions) = @_; - - my @conditions = grep { - my $target = $_->[0]; - if ($target eq 'recall' || $target eq $metric) { - 1; - } - else { - $target eq $QPS - || $target eq $AVG_LATENCY - || $target eq $P99_LATENCY - || $target eq $P999_LATENCY - or die "[error] unknown condition: '$target'\n"; - 0; - } - } @$conditions; - - my @headers = map { - my $header; - if ($_->[0] eq 'recall') { - $header = $metric . '@recall' . $_->[1]; - } - elsif ($_->[0] eq $metric) { - $header = 'recall@' . $metric . $_->[1]; - } - $header; - } @conditions; - - my $scale = ($metric eq $QPS) ? 
1 : 1000; - my $estimations; - for my $condition (@conditions) { - my ($names, $estimate) = estimate($results, @$condition); - if (!defined $estimations) { - @$estimations = map { [$_] } @$names; - } - merge($estimations, $estimate, $scale); - } - - my $fmt = "%-60s" . (" %16s" x @headers) . "\n"; - printf($fmt, '', @headers); - $fmt =~ s/16s/16.4f/g; - for (@$estimations) { - printf($fmt, @$_); - } -} diff --git a/cpp/bench/ann/src/common/ann_types.hpp b/cpp/bench/ann/src/common/ann_types.hpp index 93892de9df..e0c22d1798 100644 --- a/cpp/bench/ann/src/common/ann_types.hpp +++ b/cpp/bench/ann/src/common/ann_types.hpp @@ -1,3 +1,5 @@ + + /* * Copyright (c) 2023, NVIDIA CORPORATION. * @@ -15,40 +17,77 @@ */ #pragma once + +#include #include #include -#include +#ifndef CPU_ONLY +#include // cudaStream_t +#endif namespace raft::bench::ann { -enum class Metric { - kInnerProduct, - kEuclidean, -}; - enum class MemoryType { Host, HostMmap, Device, }; +enum class Metric { + kInnerProduct, + kEuclidean, +}; + +inline auto parse_metric(const std::string& metric_str) -> Metric +{ + if (metric_str == "inner_product") { + return raft::bench::ann::Metric::kInnerProduct; + } else if (metric_str == "euclidean") { + return raft::bench::ann::Metric::kEuclidean; + } else { + throw std::runtime_error("invalid metric: '" + metric_str + "'"); + } +} + +inline auto parse_memory_type(const std::string& memory_type) -> MemoryType +{ + if (memory_type == "host") { + return MemoryType::Host; + } else if (memory_type == "mmap") { + return MemoryType::HostMmap; + } else if (memory_type == "device") { + return MemoryType::Device; + } else { + throw std::runtime_error("invalid memory type: '" + memory_type + "'"); + } +} + struct AlgoProperty { MemoryType dataset_memory_type; // neighbors/distances should have same memory type as queries MemoryType query_memory_type; - bool need_dataset_when_search; +}; + +class AnnBase { + public: + inline AnnBase(Metric metric, int dim) : metric_(metric), 
dim_(dim) {} + virtual ~AnnBase() = default; + + protected: + Metric metric_; + int dim_; }; template -class ANN { +class ANN : public AnnBase { public: struct AnnSearchParam { virtual ~AnnSearchParam() = default; + [[nodiscard]] virtual auto needs_dataset() const -> bool { return false; }; }; - ANN(Metric metric, int dim) : metric_(metric), dim_(dim) {} - virtual ~ANN() = default; + inline ANN(Metric metric, int dim) : AnnBase(metric, dim) {} virtual void build(const T* dataset, size_t nrow, cudaStream_t stream = 0) = 0; @@ -65,7 +104,7 @@ class ANN { virtual void save(const std::string& file) const = 0; virtual void load(const std::string& file) = 0; - virtual AlgoProperty get_property() const = 0; + virtual AlgoProperty get_preference() const = 0; // Some algorithms don't save the building dataset in their indices. // So they should be given the access to that dataset during searching. @@ -77,10 +116,14 @@ class ANN { // The client code should call set_search_dataset() before searching, // and should not release dataset before searching is finished. virtual void set_search_dataset(const T* /*dataset*/, size_t /*nrow*/){}; - - protected: - Metric metric_; - int dim_; }; } // namespace raft::bench::ann + +#define REGISTER_ALGO_INSTANCE(DataT) \ + template auto raft::bench::ann::create_algo( \ + const std::string&, const std::string&, int, const nlohmann::json&, const std::vector&) \ + ->std::unique_ptr>; \ + template auto raft::bench::ann::create_search_param(const std::string&, \ + const nlohmann::json&) \ + ->std::unique_ptr::AnnSearchParam>; diff --git a/cpp/bench/ann/src/common/benchmark.cpp b/cpp/bench/ann/src/common/benchmark.cpp new file mode 100644 index 0000000000..6424a36471 --- /dev/null +++ b/cpp/bench/ann/src/common/benchmark.cpp @@ -0,0 +1,109 @@ +/* + * Copyright (c) 2023, NVIDIA CORPORATION. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +#include "cuda_stub.hpp" // must go first + +#include "ann_types.hpp" + +#define JSON_DIAGNOSTICS 1 +#include + +#include +#include + +#include +#include + +namespace raft::bench::ann { + +struct lib_handle { + void* handle{nullptr}; + explicit lib_handle(const std::string& name) + { + handle = dlopen(name.c_str(), RTLD_LAZY | RTLD_LOCAL); + if (handle == nullptr) { + auto error_msg = "Failed to load " + name; + auto err = dlerror(); + if (err != nullptr && err[0] != '\0') { error_msg += ": " + std::string(err); } + throw std::runtime_error(error_msg); + } + } + ~lib_handle() noexcept + { + if (handle != nullptr) { dlclose(handle); } + } +}; + +auto load_lib(const std::string& algo) -> void* +{ + static std::unordered_map libs{}; + auto found = libs.find(algo); + + if (found != libs.end()) { return found->second.handle; } + auto lib_name = "lib" + algo + "_ann_bench.so"; + return libs.emplace(algo, lib_name).first->second.handle; +} + +auto get_fun_name(void* addr) -> std::string +{ + Dl_info dl_info; + if (dladdr(addr, &dl_info) != 0) { + if (dl_info.dli_sname != nullptr && dl_info.dli_sname[0] != '\0') { + return std::string{dl_info.dli_sname}; + } + } + throw std::logic_error("Failed to find out name of the looked up function"); +} + +template +auto create_algo(const std::string& algo, + const std::string& distance, + int dim, + const nlohmann::json& conf, + const std::vector& dev_list) -> std::unique_ptr> +{ + static auto fname = get_fun_name(reinterpret_cast(&create_algo)); + auto handle = load_lib(algo); + auto fun_addr = dlsym(handle, 
fname.c_str()); + if (fun_addr == nullptr) { + throw std::runtime_error("Couldn't load the create_algo function (" + algo + ")"); + } + auto fun = reinterpret_cast<decltype(&create_algo<T>)>(fun_addr); + return fun(algo, distance, dim, conf, dev_list); +} + +template <typename T> +std::unique_ptr<typename ANN<T>::AnnSearchParam> create_search_param( + const std::string& algo, const nlohmann::json& conf) +{ + static auto fname = get_fun_name(reinterpret_cast<void*>(&create_search_param<T>)); + auto handle = load_lib(algo); + auto fun_addr = dlsym(handle, fname.c_str()); + if (fun_addr == nullptr) { + throw std::runtime_error("Couldn't load the create_search_param function (" + algo + ")"); + } + auto fun = reinterpret_cast<decltype(&create_search_param<T>)>(fun_addr); + return fun(algo, conf); +} + +}; // namespace raft::bench::ann + +REGISTER_ALGO_INSTANCE(float); +REGISTER_ALGO_INSTANCE(std::int8_t); +REGISTER_ALGO_INSTANCE(std::uint8_t); + +#include "benchmark.hpp" + +int main(int argc, char** argv) { return raft::bench::ann::run_main(argc, argv); } diff --git a/cpp/bench/ann/src/common/benchmark.hpp b/cpp/bench/ann/src/common/benchmark.hpp index 971e6a3fd3..5ce453a116 100644 --- a/cpp/bench/ann/src/common/benchmark.hpp +++ b/cpp/bench/ann/src/common/benchmark.hpp @@ -13,598 +13,526 @@ * See the License for the specific language governing permissions and * limitations under the License. 
*/ -#ifdef NVTX -#include -#endif -#include +#pragma once + +#include "ann_types.hpp" +#include "conf.hpp" +#include "dataset.hpp" +#include "util.hpp" + +#include #include #include #include #include -#include #include #include #include #include -#include +#include #include -#include +#ifdef ANN_BENCH_BUILD_MAIN +#ifdef CPU_ONLY +#define CUDART_FOUND false +#else +#define CUDART_FOUND true +#endif +#else +#define CUDART_FOUND (cudart.found()) +#endif -#include "benchmark_util.hpp" -#include "conf.h" -#include "dataset.h" -#include "util.h" +namespace raft::bench::ann { -using std::cerr; -using std::cout; -using std::endl; -using std::string; -using std::to_string; -using std::unordered_set; -using std::vector; +static inline std::unique_ptr current_algo{nullptr}; -namespace raft::bench::ann { +using kv_series = std::vector>>; -inline bool check_file_exist(const std::vector& files) +inline auto apply_overrides(const std::vector& configs, + const kv_series& overrides, + std::size_t override_idx = 0) -> std::vector { - bool ret = true; - std::unordered_set processed; - for (const auto& file : files) { - if (processed.find(file) == processed.end() && !file_exists(file)) { - log_error("file '%s' doesn't exist or is not a regular file", file.c_str()); - ret = false; + std::vector results{}; + if (override_idx >= overrides.size()) { + auto n = configs.size(); + for (size_t i = 0; i < n; i++) { + auto c = configs[i]; + c["override_suffix"] = n > 1 ? 
"/" + std::to_string(i) : ""; + results.push_back(c); } - processed.insert(file); + return results; } - return ret; -} - -inline bool check_file_not_exist(const std::vector& files, bool force_overwrite) -{ - bool ret = true; - for (const auto& file : files) { - if (file_exists(file)) { - if (force_overwrite) { - log_warn("'%s' already exists, will overwrite it", file.c_str()); - } else { - log_error("'%s' already exists, use '-f' to force overwriting", file.c_str()); - ret = false; + auto rec_configs = apply_overrides(configs, overrides, override_idx + 1); + auto [key, vals] = overrides[override_idx]; + auto n = vals.size(); + for (size_t i = 0; i < n; i++) { + const auto& val = vals[i]; + for (auto rc : rec_configs) { + if (n > 1) { + rc["override_suffix"] = + static_cast(rc["override_suffix"]) + "/" + std::to_string(i); } + rc[key] = val; + results.push_back(rc); } } - return ret; + return results; } -inline bool check_no_duplicate_file(const std::vector& files) +inline auto apply_overrides(const nlohmann::json& config, + const kv_series& overrides, + std::size_t override_idx = 0) { - bool ret = true; - std::unordered_set processed; - for (const auto& file : files) { - if (processed.find(file) != processed.end()) { - log_error("'%s' occurs more than once as output file, would be overwritten", file.c_str()); - ret = false; - } - processed.insert(file); - } - return ret; + return apply_overrides(std::vector{config}, overrides, 0); } -inline bool mkdir(const std::vector& dirs) +inline void dump_parameters(::benchmark::State& state, nlohmann::json params) { - std::unordered_set processed; - for (const auto& dir : dirs) { - if (processed.find(dir) == processed.end() && !dir_exists(dir)) { - if (create_dir(dir)) { - log_info("mkdir '%s'", dir.c_str()); + std::string label = ""; + bool label_empty = true; + for (auto& [key, val] : params.items()) { + if (val.is_number()) { + state.counters.insert({{key, val}}); + } else if (val.is_boolean()) { + 
state.counters.insert({{key, val ? 1.0 : 0.0}}); + } else { + auto kv = key + "=" + val.dump(); + if (label_empty) { + label = kv; } else { - log_error("fail to create output directory '%s'", dir.c_str()); - // won't create any other dir when problem occurs - return false; + label += "#" + kv; } + label_empty = false; } - processed.insert(dir); } - return true; + if (!label_empty) { state.SetLabel(label); } } -inline bool check(const std::vector& indices, - const bool build_mode, - const bool force_overwrite) +inline auto parse_algo_property(AlgoProperty prop, const nlohmann::json& conf) -> AlgoProperty { - std::vector files_should_exist; - std::vector dirs_should_exist; - std::vector output_files; - for (const auto& index : indices) { - if (build_mode) { - output_files.push_back(index.file); - output_files.push_back(index.file + ".txt"); - - const auto pos = index.file.rfind('/'); - if (pos != std::string::npos) { dirs_should_exist.push_back(index.file.substr(0, pos)); } - } else { - files_should_exist.push_back(index.file); - files_should_exist.push_back(index.file + ".txt"); - - output_files.push_back(index.search_result_file + ".0.ibin"); - output_files.push_back(index.search_result_file + ".0.txt"); - - const auto pos = index.search_result_file.rfind('/'); - if (pos != std::string::npos) { - dirs_should_exist.push_back(index.search_result_file.substr(0, pos)); - } - } + if (conf.contains("dataset_memory_type")) { + prop.dataset_memory_type = parse_memory_type(conf.at("dataset_memory_type")); } - - bool ret = true; - if (!check_file_exist(files_should_exist)) { ret = false; } - if (!check_file_not_exist(output_files, force_overwrite)) { ret = false; } - if (!check_no_duplicate_file(output_files)) { ret = false; } - if (ret && !mkdir(dirs_should_exist)) { ret = false; } - return ret; -} - -inline void write_build_info(const std::string& file_prefix, - const std::string& dataset, - const std::string& distance, - const std::string& name, - const std::string& algo, 
- const std::string& build_param, - const float build_time) -{ - std::ofstream ofs(file_prefix + ".txt"); - if (!ofs) { throw std::runtime_error("can't open build info file: " + file_prefix + ".txt"); } - ofs << "dataset: " << dataset << "\n" - << "distance: " << distance << "\n" - << "\n" - << "name: " << name << "\n" - << "algo: " << algo << "\n" - << "build_param: " << build_param << "\n" - << "build_time: " << build_time << endl; - ofs.close(); - if (!ofs) { throw std::runtime_error("can't write to build info file: " + file_prefix + ".txt"); } -} + if (conf.contains("query_memory_type")) { + prop.query_memory_type = parse_memory_type(conf.at("query_memory_type")); + } + return prop; +}; template -void build(const Dataset* dataset, const std::vector& indices) +void bench_build(::benchmark::State& state, + std::shared_ptr> dataset, + Configuration::Index index, + bool force_overwrite) { - cudaStream_t stream; - RAFT_CUDA_TRY(cudaStreamCreate(&stream)); - - log_info( - "base set from dataset '%s', #vector = %zu", dataset->name().c_str(), dataset->base_set_size()); - - for (const auto& index : indices) { - log_info("creating algo '%s', param=%s", index.algo.c_str(), index.build_param.dump().c_str()); - const auto algo = create_algo(index.algo, - dataset->distance(), - dataset->dim(), - index.refine_ratio, - index.build_param, - index.dev_list, - index.index_conf); - const auto algo_property = algo->get_property(); - - const T* base_set_ptr = nullptr; - if (algo_property.dataset_memory_type == MemoryType::Host) { - log_info("%s", "loading base set to memory"); - base_set_ptr = dataset->base_set(); - } else if (algo_property.dataset_memory_type == MemoryType::HostMmap) { - log_info("%s", "mapping base set to memory"); - base_set_ptr = dataset->mapped_base_set(); - } else if (algo_property.dataset_memory_type == MemoryType::Device) { - log_info("%s", "loading base set to GPU"); - base_set_ptr = dataset->base_set_on_gpu(); + dump_parameters(state, index.build_param); + 
if (file_exists(index.file)) { + if (force_overwrite) { + log_info("Overwriting file: %s", index.file.c_str()); + } else { + return state.SkipWithMessage( + "Index file already exists (use --overwrite to overwrite the index)."); } - - log_info("building index '%s'", index.name.c_str()); - RAFT_CUDA_TRY(cudaStreamSynchronize(stream)); -#ifdef NVTX - nvtxRangePush("build"); -#endif - Timer timer; - algo->build(base_set_ptr, dataset->base_set_size(), stream); - RAFT_CUDA_TRY(cudaStreamSynchronize(stream)); - const float elapsed_ms = timer.elapsed_ms(); -#ifdef NVTX - nvtxRangePop(); -#endif - log_info("built index in %.2f seconds", elapsed_ms / 1000.0f); - RAFT_CUDA_TRY(cudaDeviceSynchronize()); - RAFT_CUDA_TRY(cudaPeekAtLastError()); - - algo->save(index.file); - write_build_info(index.file, - dataset->name(), - dataset->distance(), - index.name, - index.algo, - index.build_param.dump(), - elapsed_ms / 1000.0f); - log_info("saved index to %s", index.file.c_str()); } - RAFT_CUDA_TRY(cudaStreamDestroy(stream)); -} - -inline void write_search_result(const std::string& file_prefix, - const std::string& dataset, - const std::string& distance, - const std::string& name, - const std::string& algo, - const std::string& build_param, - const std::string& search_param, - std::size_t batch_size, - unsigned run_count, - unsigned k, - float search_time_average, - float search_time_p99, - float search_time_p999, - float query_per_second, - const int* neighbors, - size_t query_set_size) -{ - log_info("throughput : %e [QPS]", query_per_second); - std::ofstream ofs(file_prefix + ".txt"); - if (!ofs) { throw std::runtime_error("can't open search result file: " + file_prefix + ".txt"); } - ofs << "dataset: " << dataset << "\n" - << "distance: " << distance << "\n" - << "\n" - << "name: " << name << "\n" - << "algo: " << algo << "\n" - << "build_param: " << build_param << "\n" - << "search_param: " << search_param << "\n" - << "\n" - << "batch_size: " << batch_size << "\n" - << 
"run_count: " << run_count << "\n" - << "k: " << k << "\n" - << "query_per_second: " << query_per_second << "\n" - << "average_search_time: " << search_time_average << endl; - - if (search_time_p99 != std::numeric_limits::max()) { - ofs << "p99_search_time: " << search_time_p99 << endl; - } - if (search_time_p999 != std::numeric_limits::max()) { - ofs << "p999_search_time: " << search_time_p999 << endl; + std::unique_ptr> algo; + try { + algo = ann::create_algo( + index.algo, dataset->distance(), dataset->dim(), index.build_param, index.dev_list); + } catch (const std::exception& e) { + return state.SkipWithError("Failed to create an algo: " + std::string(e.what())); } - ofs.close(); - if (!ofs) { - throw std::runtime_error("can't write to search result file: " + file_prefix + ".txt"); + const auto algo_property = parse_algo_property(algo->get_preference(), index.build_param); + + const T* base_set = dataset->base_set(algo_property.dataset_memory_type); + std::size_t index_size = dataset->base_set_size(); + + cuda_timer gpu_timer; + { + nvtx_case nvtx{state.name()}; + for (auto _ : state) { + [[maybe_unused]] auto ntx_lap = nvtx.lap(); + [[maybe_unused]] auto gpu_lap = gpu_timer.lap(); + try { + algo->build(base_set, index_size, gpu_timer.stream()); + } catch (const std::exception& e) { + state.SkipWithError(std::string(e.what())); + } + } } + state.counters.insert( + {{"GPU Time", gpu_timer.total_time() / state.iterations()}, {"index_size", index_size}}); - BinFile neighbors_file(file_prefix + ".ibin", "w"); - neighbors_file.write(neighbors, query_set_size, k); + if (state.skipped()) { return; } + make_sure_parent_dir_exists(index.file); + algo->save(index.file); } template -inline void search(const Dataset* dataset, const std::vector& indices) +void bench_search(::benchmark::State& state, + std::shared_ptr> dataset, + Configuration::Index index, + std::size_t search_param_ix) { - if (indices.empty()) { return; } - cudaStream_t stream; - 
RAFT_CUDA_TRY(cudaStreamCreate(&stream)); - - log_info("loading query set from dataset '%s', #vector = %zu", - dataset->name().c_str(), - dataset->query_set_size()); - const T* const query_set = dataset->query_set(); - // query set is usually much smaller than base set, so load it eagerly - const T* const d_query_set = dataset->query_set_on_gpu(); - const size_t query_set_size = dataset->query_set_size(); - - // currently all indices has same batch_size, k and run_count - const std::size_t batch_size = indices[0].batch_size; - const unsigned k = indices[0].k; - const unsigned run_count = indices[0].run_count; - log_info( - "basic search parameters: batch_size = %d, k = %d, run_count = %d", batch_size, k, run_count); - if (query_set_size % batch_size != 0) { - log_warn("query set size (%zu) % batch size (%d) != 0, the size of last batch is %zu", - query_set_size, - batch_size, - query_set_size % batch_size); + const auto& sp_json = index.search_params[search_param_ix]; + dump_parameters(state, sp_json); + + // NB: `k` and `n_queries` are guaranteed to be populated in conf.cpp + const std::uint32_t k = sp_json["k"]; + // Amount of data processed in one go + const std::size_t n_queries = sp_json["n_queries"]; + // Round down the query data to a multiple of the batch size to loop over full batches of data + const std::size_t query_set_size = (dataset->query_set_size() / n_queries) * n_queries; + + if (!file_exists(index.file)) { + state.SkipWithError("Index file is missing. 
Run the benchmark in the build mode first."); + return; + } + // algo is static to cache it between close search runs to save time on index loading + static std::string index_file = ""; + if (index.file != index_file) { + current_algo.reset(); + index_file = index.file; } - const std::size_t num_batches = (query_set_size - 1) / batch_size + 1; - std::size_t* const neighbors = new std::size_t[query_set_size * k]; - int* const neighbors_buf = new int[query_set_size * k]; - float* const distances = new float[query_set_size * k]; - std::vector search_times; - search_times.reserve(num_batches); - std::size_t* d_neighbors; - float* d_distances; - RAFT_CUDA_TRY(cudaMalloc((void**)&d_neighbors, query_set_size * k * sizeof(*d_neighbors))); - RAFT_CUDA_TRY(cudaMalloc((void**)&d_distances, query_set_size * k * sizeof(*d_distances))); - - for (const auto& index : indices) { - log_info("creating algo '%s', param=%s", index.algo.c_str(), index.build_param.dump().c_str()); - const auto algo = create_algo(index.algo, - dataset->distance(), - dataset->dim(), - index.refine_ratio, - index.build_param, - index.dev_list, - index.index_conf); - const auto algo_property = algo->get_property(); - - log_info("loading index '%s' from file '%s'", index.name.c_str(), index.file.c_str()); - algo->load(index.file); - - const T* this_query_set = query_set; - std::size_t* this_neighbors = neighbors; - float* this_distances = distances; - if (algo_property.query_memory_type == MemoryType::Device) { - this_query_set = d_query_set; - this_neighbors = d_neighbors; - this_distances = d_distances; + ANN* algo; + std::unique_ptr::AnnSearchParam> search_param; + try { + if (!current_algo || (algo = dynamic_cast*>(current_algo.get())) == nullptr) { + auto ualgo = ann::create_algo( + index.algo, dataset->distance(), dataset->dim(), index.build_param, index.dev_list); + algo = ualgo.get(); + algo->load(index_file); + current_algo = std::move(ualgo); } - - if (algo_property.need_dataset_when_search) { - 
log_info("loading base set from dataset '%s', #vector = %zu", - dataset->name().c_str(), - dataset->base_set_size()); - const T* base_set_ptr = nullptr; - if (algo_property.dataset_memory_type == MemoryType::Host) { - log_info("%s", "loading base set to memory"); - base_set_ptr = dataset->base_set(); - } else if (algo_property.dataset_memory_type == MemoryType::HostMmap) { - log_info("%s", "mapping base set to memory"); - base_set_ptr = dataset->mapped_base_set(); - } else if (algo_property.dataset_memory_type == MemoryType::Device) { - log_info("%s", "loading base set to GPU"); - base_set_ptr = dataset->base_set_on_gpu(); - } - algo->set_search_dataset(base_set_ptr, dataset->base_set_size()); + search_param = ann::create_search_param(index.algo, sp_json); + } catch (const std::exception& e) { + return state.SkipWithError("Failed to create an algo: " + std::string(e.what())); + } + algo->set_search_param(*search_param); + + const auto algo_property = parse_algo_property(algo->get_preference(), sp_json); + const T* query_set = dataset->query_set(algo_property.query_memory_type); + buf distances{algo_property.query_memory_type, k * query_set_size}; + buf neighbors{algo_property.query_memory_type, k * query_set_size}; + + if (search_param->needs_dataset()) { + try { + algo->set_search_dataset(dataset->base_set(algo_property.dataset_memory_type), + dataset->base_set_size()); + } catch (const std::exception&) { + state.SkipWithError("The algorithm '" + index.name + + "' requires the base set, but it's not available."); + return; } + } - for (int i = 0, end_i = index.search_params.size(); i != end_i; ++i) { - const auto p_param = create_search_param(index.algo, index.search_params[i]); - algo->set_search_param(*p_param); - log_info("search with param: %s", index.search_params[i].dump().c_str()); - - if (algo_property.query_memory_type == MemoryType::Device) { - RAFT_CUDA_TRY(cudaMemset(d_neighbors, 0, query_set_size * k * sizeof(*d_neighbors))); - 
RAFT_CUDA_TRY(cudaMemset(d_distances, 0, query_set_size * k * sizeof(*d_distances))); - } else { - memset(neighbors, 0, query_set_size * k * sizeof(*neighbors)); - memset(distances, 0, query_set_size * k * sizeof(*distances)); + std::ptrdiff_t batch_offset = 0; + std::size_t queries_processed = 0; + cuda_timer gpu_timer; + { + nvtx_case nvtx{state.name()}; + for (auto _ : state) { + // measure the GPU time using the RAII helper + [[maybe_unused]] auto ntx_lap = nvtx.lap(); + [[maybe_unused]] auto gpu_lap = gpu_timer.lap(); + // run the search + try { + algo->search(query_set + batch_offset * dataset->dim(), + n_queries, + k, + neighbors.data + batch_offset * k, + distances.data + batch_offset * k, + gpu_timer.stream()); + } catch (const std::exception& e) { + state.SkipWithError(std::string(e.what())); } - - float best_search_time_average = std::numeric_limits::max(); - float best_search_time_p99 = std::numeric_limits::max(); - float best_search_time_p999 = std::numeric_limits::max(); - float total_search_time = 0; - for (unsigned run = 0; run < run_count; ++run) { - log_info("run %d / %d", run + 1, run_count); - for (std::size_t batch_id = 0; batch_id < num_batches; ++batch_id) { - const std::size_t row = batch_id * batch_size; - const std::size_t actual_batch_size = - (batch_id == num_batches - 1) ? 
query_set_size - row : batch_size; - RAFT_CUDA_TRY(cudaStreamSynchronize(stream)); -#ifdef NVTX - string nvtx_label = "batch" + to_string(batch_id); - if (run_count != 1) { nvtx_label = "run" + to_string(run) + "-" + nvtx_label; } - if (batch_id == 10) { - run = run_count - 1; + // advance to the next batch + batch_offset = (batch_offset + n_queries) % query_set_size; + queries_processed += n_queries; + } + } + state.SetItemsProcessed(queries_processed); + state.counters.insert({{"k", k}, {"n_queries", n_queries}}); + if (CUDART_FOUND) { + state.counters.insert({{"GPU Time", gpu_timer.total_time() / state.iterations()}, + {"GPU QPS", queries_processed / gpu_timer.total_time()}}); + } + if (state.skipped()) { return; } + + // evaluate recall + if (dataset->max_k() >= k) { + const std::int32_t* gt = dataset->gt_set(); + const std::uint32_t max_k = dataset->max_k(); + buf neighbors_host = neighbors.move(MemoryType::Host); + + std::size_t rows = std::min(queries_processed, query_set_size); + std::size_t match_count = 0; + std::size_t total_count = rows * static_cast(k); + for (std::size_t i = 0; i < rows; i++) { + for (std::uint32_t j = 0; j < k; j++) { + auto act_idx = std::int32_t(neighbors_host.data[i * k + j]); + for (std::uint32_t l = 0; l < k; l++) { + auto exp_idx = gt[i * max_k + l]; + if (act_idx == exp_idx) { + match_count++; break; } -#endif - Timer timer; -#ifdef NVTX - nvtxRangePush(nvtx_label.c_str()); -#endif - algo->search(this_query_set + row * dataset->dim(), - actual_batch_size, - k, - this_neighbors + row * k, - this_distances + row * k, - stream); - RAFT_CUDA_TRY(cudaStreamSynchronize(stream)); - const float elapsed_ms = timer.elapsed_ms(); -#ifdef NVTX - nvtxRangePop(); -#endif - // If the size of the last batch is less than batch_size, don't count it for - // search time. But neighbors of the last batch will still be filled, so it's - // counted for recall calculation. 
- if (actual_batch_size == batch_size) { - search_times.push_back(elapsed_ms / 1000.0f); // in seconds - } - } - - const float total_search_time_run = - std::accumulate(search_times.cbegin(), search_times.cend(), 0.0f); - const float search_time_average = total_search_time_run / search_times.size(); - total_search_time += total_search_time_run; - best_search_time_average = std::min(best_search_time_average, search_time_average); - - if (search_times.size() >= 100) { - std::sort(search_times.begin(), search_times.end()); - - const auto calc_percentile_pos = [](float percentile, size_t N) { - return static_cast(std::ceil(percentile / 100.0 * N)) - 1; - }; - - const float search_time_p99 = search_times[calc_percentile_pos(99, search_times.size())]; - best_search_time_p99 = std::min(best_search_time_p99, search_time_p99); - - if (search_times.size() >= 1000) { - const float search_time_p999 = - search_times[calc_percentile_pos(99.9, search_times.size())]; - best_search_time_p999 = std::min(best_search_time_p999, search_time_p999); - } } - search_times.clear(); } - RAFT_CUDA_TRY(cudaDeviceSynchronize()); - RAFT_CUDA_TRY(cudaPeekAtLastError()); - const auto query_per_second = - (run_count * raft::round_down_safe(query_set_size, batch_size)) / total_search_time; - - if (algo_property.query_memory_type == MemoryType::Device) { - RAFT_CUDA_TRY(cudaMemcpy(neighbors, - d_neighbors, - query_set_size * k * sizeof(*d_neighbors), - cudaMemcpyDeviceToHost)); - RAFT_CUDA_TRY(cudaMemcpy(distances, - d_distances, - query_set_size * k * sizeof(*d_distances), - cudaMemcpyDeviceToHost)); - } - - for (std::size_t j = 0; j < query_set_size * k; ++j) { - neighbors_buf[j] = neighbors[j]; - } - write_search_result(index.search_result_file + "." 
+ to_string(i), - dataset->name(), - dataset->distance(), - index.name, - index.algo, - index.build_param.dump(), - index.search_params[i].dump(), - batch_size, - index.run_count, - k, - best_search_time_average, - best_search_time_p99, - best_search_time_p999, - query_per_second, - neighbors_buf, - query_set_size); } - - log_info("finish searching for index '%s'", index.name.c_str()); + double actual_recall = static_cast(match_count) / static_cast(total_count); + state.counters.insert({{"Recall", actual_recall}}); } +} - delete[] neighbors; - delete[] neighbors_buf; - delete[] distances; - RAFT_CUDA_TRY(cudaFree(d_neighbors)); - RAFT_CUDA_TRY(cudaFree(d_distances)); - RAFT_CUDA_TRY(cudaStreamDestroy(stream)); +inline void printf_usage() +{ + ::benchmark::PrintDefaultHelp(); + fprintf( + stdout, + " [--build|--search] \n" + " [--overwrite]\n" + " [--data_prefix=]\n" + " [--index_prefix=]\n" + " [--override_kv=]\n" + " .json\n" + "\n" + "Note the non-standard benchmark parameters:\n" + " --build: build mode, will build index\n" + " --search: search mode, will search using the built index\n" + " one and only one of --build and --search should be specified\n" + " --overwrite: force overwriting existing index files\n" + " --data_prefix=:" + " prepend to dataset file paths specified in the .json (default = 'data/').\n" + " --index_prefix=:" + " prepend to index file paths specified in the .json (default = 'index/').\n" + " --override_kv=:" + " override a build/search key one or more times multiplying the number of configurations;" + " you can use this parameter multiple times to get the Cartesian product of benchmark" + " configs.\n"); } -inline const std::string usage(const string& argv0) +template +void register_build(std::shared_ptr> dataset, + std::vector indices, + bool force_overwrite) { - return "usage: " + argv0 + " -b|s [-c] [-f] [-i index_names] conf.json\n" + - " -b: build mode, will build index\n" + - " -s: search mode, will search using built index\n" + - " 
one and only one of -b and -s should be specified\n" + - " -c: just check command line options and conf.json are sensible\n" + - " won't build or search\n" + " -f: force overwriting existing output files\n" + - " -i: by default will build/search all the indices found in conf.json\n" + - " '-i' can be used to select a subset of indices\n" + - " 'index_names' is a list of comma-separated index names\n" + - " '*' is allowed as the last character of a name to select all matched indices\n" + - " for example, -i \"hnsw1,hnsw2,faiss\" or -i \"hnsw*,faiss\""; + for (auto index : indices) { + auto suf = static_cast(index.build_param["override_suffix"]); + auto file_suf = suf; + index.build_param.erase("override_suffix"); + std::replace(file_suf.begin(), file_suf.end(), '/', '-'); + index.file += file_suf; + auto* b = ::benchmark::RegisterBenchmark( + index.name + suf, bench_build, dataset, index, force_overwrite); + b->Unit(benchmark::kSecond); + b->UseRealTime(); + } } template -inline int dispatch_benchmark(const Configuration& conf, - const std::string& index_patterns, - bool force_overwrite, - bool only_check, - bool build_mode, - bool search_mode) +void register_search(std::shared_ptr> dataset, + std::vector indices) { - try { - const auto dataset_conf = conf.get_dataset_conf(); - - BinDataset dataset(dataset_conf.name, - dataset_conf.base_file, - dataset_conf.subset_first_row, - dataset_conf.subset_size, - dataset_conf.query_file, - dataset_conf.distance); - - vector indices = conf.get_indices(index_patterns); - if (!check(indices, build_mode, force_overwrite)) { return -1; } - - std::string message = "will "; - message += build_mode ? 
"build:" : "search:"; - for (const auto& index : indices) { - message += "\n " + index.name; + for (auto index : indices) { + for (std::size_t i = 0; i < index.search_params.size(); i++) { + auto suf = static_cast(index.search_params[i]["override_suffix"]); + index.search_params[i].erase("override_suffix"); + auto* b = + ::benchmark::RegisterBenchmark(index.name + suf, bench_search, dataset, index, i); + b->Unit(benchmark::kMillisecond); + b->UseRealTime(); } - log_info("%s", message.c_str()); + } +} - if (only_check) { - log_info("%s", "all check passed, quit due to option -c"); - return 0; +template +void dispatch_benchmark(const Configuration& conf, + bool force_overwrite, + bool build_mode, + bool search_mode, + std::string data_prefix, + std::string index_prefix, + kv_series override_kv) +{ + if (CUDART_FOUND) { + for (auto [key, value] : cuda_info()) { + ::benchmark::AddCustomContext(key, value); } - - if (build_mode) { - build(&dataset, indices); - } else if (search_mode) { - search(&dataset, indices); + } + const auto dataset_conf = conf.get_dataset_conf(); + auto base_file = combine_path(data_prefix, dataset_conf.base_file); + auto query_file = combine_path(data_prefix, dataset_conf.query_file); + auto gt_file = dataset_conf.groundtruth_neighbors_file; + if (gt_file.has_value()) { gt_file.emplace(combine_path(data_prefix, gt_file.value())); } + auto dataset = std::make_shared>(dataset_conf.name, + base_file, + dataset_conf.subset_first_row, + dataset_conf.subset_size, + query_file, + dataset_conf.distance, + gt_file); + ::benchmark::AddCustomContext("dataset", dataset_conf.name); + ::benchmark::AddCustomContext("distance", dataset_conf.distance); + std::vector indices = conf.get_indices(); + if (build_mode) { + if (file_exists(base_file)) { + log_info("Using the dataset file '%s'", base_file.c_str()); + ::benchmark::AddCustomContext("n_records", std::to_string(dataset->base_set_size())); + ::benchmark::AddCustomContext("dim", 
std::to_string(dataset->dim())); + } else { + log_warn("Dataset file '%s' does not exist; benchmarking index building is impossible.", + base_file.c_str()); } - } catch (const std::exception& e) { - log_error("exception occurred: %s", e.what()); - return -1; + std::vector more_indices{}; + for (auto& index : indices) { + for (auto param : apply_overrides(index.build_param, override_kv)) { + auto modified_index = index; + modified_index.build_param = param; + modified_index.file = combine_path(index_prefix, modified_index.file); + more_indices.push_back(modified_index); + } + } + register_build(dataset, more_indices, force_overwrite); + } else if (search_mode) { + if (file_exists(query_file)) { + log_info("Using the query file '%s'", query_file.c_str()); + ::benchmark::AddCustomContext("max_n_queries", std::to_string(dataset->query_set_size())); + ::benchmark::AddCustomContext("dim", std::to_string(dataset->dim())); + if (gt_file.has_value()) { + if (file_exists(*gt_file)) { + log_info("Using the ground truth file '%s'", gt_file->c_str()); + ::benchmark::AddCustomContext("max_k", std::to_string(dataset->max_k())); + } else { + log_warn("Ground truth file '%s' does not exist; the recall won't be reported.", + gt_file->c_str()); + } + } else { + log_warn( + "Ground truth file is not provided; the recall won't be reported. 
NB: use " + "the 'groundtruth_neighbors_file' alongside the 'query_file' key to specify the path to " + "the ground truth in your conf.json."); + } + } else { + log_warn("Query file '%s' does not exist; benchmarking search is impossible.", + query_file.c_str()); + } + for (auto& index : indices) { + index.search_params = apply_overrides(index.search_params, override_kv); + index.file = combine_path(index_prefix, index.file); + } + register_search(dataset, indices); } +} - return 0; +inline auto parse_bool_flag(const char* arg, const char* pat, bool& result) -> bool +{ + if (strcmp(arg, pat) == 0) { + result = true; + return true; + } + return false; } -inline int run_main(int argc, char** argv) +inline auto parse_string_flag(const char* arg, const char* pat, std::string& result) -> bool { - bool force_overwrite = false; - bool build_mode = false; - bool search_mode = false; - bool only_check = false; - std::string index_patterns("*"); - std::string dataset_memory("device"); - - int opt; - while ((opt = getopt(argc, argv, "bscfi:h")) != -1) { - switch (opt) { - case 'b': build_mode = true; break; - case 's': search_mode = true; break; - case 'c': only_check = true; break; - case 'f': force_overwrite = true; break; - case 'i': index_patterns = optarg; break; - case 'm': dataset_memory = optarg; break; - case 'h': cout << usage(argv[0]) << endl; return -1; - default: cerr << "\n" << usage(argv[0]) << endl; return -1; - } + auto n = strlen(pat); + if (strncmp(pat, arg, strlen(pat)) == 0) { + result = arg + n + 1; + return true; } - if (build_mode == search_mode) { - std::cerr << "one and only one of -b and -s should be specified\n\n" << usage(argv[0]) << endl; + return false; +} + +inline auto run_main(int argc, char** argv) -> int +{ + bool force_overwrite = false; + bool build_mode = false; + bool search_mode = false; + std::string data_prefix = "data"; + std::string index_prefix = "index"; + std::string new_override_kv = ""; + kv_series override_kv{}; + + char 
arg0_default[] = "benchmark"; // NOLINT + char* args_default = arg0_default; + if (!argv) { + argc = 1; + argv = &args_default; + } + if (argc == 1) { + printf_usage(); return -1; } - if (argc - optind != 1) { - std::cerr << usage(argv[0]) << endl; + + char* conf_path = argv[--argc]; + std::ifstream conf_stream(conf_path); + + for (int i = 1; i < argc; i++) { + if (parse_bool_flag(argv[i], "--overwrite", force_overwrite) || + parse_bool_flag(argv[i], "--build", build_mode) || + parse_bool_flag(argv[i], "--search", search_mode) || + parse_string_flag(argv[i], "--data_prefix", data_prefix) || + parse_string_flag(argv[i], "--index_prefix", index_prefix) || + parse_string_flag(argv[i], "--override_kv", new_override_kv)) { + if (!new_override_kv.empty()) { + auto kvv = split(new_override_kv, ':'); + auto key = kvv[0]; + std::vector vals{}; + for (std::size_t j = 1; j < kvv.size(); j++) { + vals.push_back(nlohmann::json::parse(kvv[j])); + } + override_kv.emplace_back(key, vals); + new_override_kv = ""; + } + for (int j = i; j < argc - 1; j++) { + argv[j] = argv[j + 1]; + } + argc--; + i--; + } + } + + if (build_mode == search_mode) { + log_error("One and only one of --build and --search should be specified"); + printf_usage(); return -1; } - string conf_file = argv[optind]; - std::ifstream conf_stream(conf_file.c_str()); if (!conf_stream) { - log_error("can't open configuration file: %s", argv[optind]); + log_error("Can't open configuration file: %s", conf_path); return -1; } - try { - Configuration conf(conf_stream); - std::string dtype = conf.get_dataset_conf().dtype; - - if (dtype == "float") { - return dispatch_benchmark( - conf, index_patterns, force_overwrite, only_check, build_mode, search_mode); - } else if (dtype == "uint8") { - return dispatch_benchmark( - conf, index_patterns, force_overwrite, only_check, build_mode, search_mode); - } else if (dtype == "int8") { - return dispatch_benchmark( - conf, index_patterns, force_overwrite, only_check, build_mode, 
search_mode); - } else { - log_error("datatype %s not supported", dtype); - } - - } catch (const std::exception& e) { - log_error("exception occurred: %s", e.what()); + if (!CUDART_FOUND) { log_warn("cudart library is not found, GPU-based indices won't work."); } + + Configuration conf(conf_stream); + std::string dtype = conf.get_dataset_conf().dtype; + + if (dtype == "float") { + dispatch_benchmark( + conf, force_overwrite, build_mode, search_mode, data_prefix, index_prefix, override_kv); + } else if (dtype == "uint8") { + dispatch_benchmark( + conf, force_overwrite, build_mode, search_mode, data_prefix, index_prefix, override_kv); + } else if (dtype == "int8") { + dispatch_benchmark( + conf, force_overwrite, build_mode, search_mode, data_prefix, index_prefix, override_kv); + } else { + log_error("datatype '%s' is not supported", dtype.c_str()); return -1; } - return -1; + ::benchmark::Initialize(&argc, argv, printf_usage); + if (::benchmark::ReportUnrecognizedArguments(argc, argv)) return -1; + ::benchmark::RunSpecifiedBenchmarks(); + ::benchmark::Shutdown(); + // Release a possibly cached ANN object, so that it cannot be alive longer than the handle to a + // shared library it depends on (dynamic benchmark executable). + current_algo.reset(); + return 0; } + }; // namespace raft::bench::ann diff --git a/cpp/bench/ann/src/common/benchmark_util.hpp b/cpp/bench/ann/src/common/benchmark_util.hpp deleted file mode 100644 index 7005883ffc..0000000000 --- a/cpp/bench/ann/src/common/benchmark_util.hpp +++ /dev/null @@ -1,33 +0,0 @@ -/* - * Copyright (c) 2023, NVIDIA CORPORATION. - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. 
- * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -#pragma once - -#include "ann_types.hpp" -#include - -namespace raft::bench::ann { - -inline Metric parse_metric(const std::string& metric_str) -{ - if (metric_str == "inner_product") { - return raft::bench::ann::Metric::kInnerProduct; - } else if (metric_str == "euclidean") { - return raft::bench::ann::Metric::kEuclidean; - } else { - throw std::runtime_error("invalid metric: '" + metric_str + "'"); - } -} -}; // namespace raft::bench::ann \ No newline at end of file diff --git a/cpp/bench/ann/src/common/conf.cpp b/cpp/bench/ann/src/common/conf.cpp deleted file mode 100644 index dbb1b5347c..0000000000 --- a/cpp/bench/ann/src/common/conf.cpp +++ /dev/null @@ -1,152 +0,0 @@ -/* - * Copyright (c) 2023, NVIDIA CORPORATION. - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ -#include "conf.h" - -#include -#include -#include -#include -#include - -#include "util.h" - -namespace raft::bench::ann { -using std::runtime_error; -using std::string; -using std::unordered_set; -using std::vector; - -Configuration::Configuration(std::istream& conf_stream) -{ - // to enable comments in json - auto conf = nlohmann::json::parse(conf_stream, nullptr, true, true); - - parse_dataset_(conf.at("dataset")); - parse_index_(conf.at("index"), conf.at("search_basic_param")); -} - -vector Configuration::get_indices(const string& patterns) const -{ - vector names; - for (const auto& index : indices_) { - names.push_back(index.name); - } - - auto matched = match_(names, patterns); - if (matched.empty()) { throw runtime_error("no available index matches '" + patterns + "'"); } - - vector res; - for (const auto& index : indices_) { - if (matched.find(index.name) != matched.end()) { res.push_back(index); } - } - return res; -} - -void Configuration::parse_dataset_(const nlohmann::json& conf) -{ - dataset_conf_.name = conf.at("name"); - dataset_conf_.base_file = conf.at("base_file"); - dataset_conf_.query_file = conf.at("query_file"); - dataset_conf_.distance = conf.at("distance"); - - if (conf.contains("subset_first_row")) { - dataset_conf_.subset_first_row = conf.at("subset_first_row"); - } - if (conf.contains("subset_size")) { dataset_conf_.subset_size = conf.at("subset_size"); } - - if (conf.contains("dtype")) { - dataset_conf_.dtype = conf.at("dtype"); - } else { - auto filename = dataset_conf_.base_file; - if (!filename.compare(filename.size() - 4, 4, "fbin")) { - dataset_conf_.dtype = "float"; - } else if (!filename.compare(filename.size() - 5, 5, "u8bin")) { - dataset_conf_.dtype = "uint8"; - } else if (!filename.compare(filename.size() - 5, 5, "i8bin")) { - dataset_conf_.dtype = "int8"; - } else { - log_error("Could not determine data type of the dataset %s", filename.c_str()); - } - } -} - -void Configuration::parse_index_(const nlohmann::json& 
index_conf, - const nlohmann::json& search_basic_conf) -{ - const int batch_size = search_basic_conf.at("batch_size"); - const int k = search_basic_conf.at("k"); - const int run_count = search_basic_conf.at("run_count"); - - for (const auto& conf : index_conf) { - Index index; - index.name = conf.at("name"); - index.algo = conf.at("algo"); - index.build_param = conf.at("build_param"); - index.file = conf.at("file"); - index.batch_size = batch_size; - index.k = k; - index.run_count = run_count; - index.index_conf = index_conf; - - if (conf.contains("multigpu")) { - for (auto it : conf.at("multigpu")) { - index.dev_list.push_back(it); - } - if (index.dev_list.empty()) { throw std::runtime_error("dev_list shouln't be empty!"); } - index.dev_list.shrink_to_fit(); - index.build_param["multigpu"] = conf["multigpu"]; - } - - if (conf.contains("refine_ratio")) { - float refine_ratio = conf.at("refine_ratio"); - if (refine_ratio <= 1.0f) { - throw runtime_error("'" + index.name + "': refine_ratio should > 1.0"); - } - index.refine_ratio = refine_ratio; - } - - for (const auto& param : conf.at("search_params")) { - index.search_params.push_back(param); - } - index.search_result_file = conf.at("search_result_file"); - - indices_.push_back(index); - } -} - -unordered_set Configuration::match_(const vector& candidates, - const string& patterns) const -{ - unordered_set matched; - for (const auto& pat : split(patterns, ',')) { - if (pat.empty()) { continue; } - - if (pat.back() == '*') { - auto len = pat.size() - 1; - for (const auto& item : candidates) { - if (item.compare(0, len, pat, 0, len) == 0) { matched.insert(item); } - } - } else { - for (const auto& item : candidates) { - if (item == pat) { matched.insert(item); } - } - } - } - - return matched; -} - -} // namespace raft::bench::ann diff --git a/cpp/bench/ann/src/common/conf.h b/cpp/bench/ann/src/common/conf.h deleted file mode 100644 index bf3a0cba64..0000000000 --- a/cpp/bench/ann/src/common/conf.h +++ /dev/null @@ 
-1,76 +0,0 @@ -/* - * Copyright (c) 2023, NVIDIA CORPORATION. - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -#pragma once -#include -#include -#include -#include - -#define JSON_DIAGNOSTICS 1 -#include - -namespace raft::bench::ann { - -class Configuration { - public: - struct Index { - std::string name; - std::string algo; - nlohmann::json build_param; - nlohmann::json index_conf; - std::string file; - std::vector dev_list; - - int batch_size; - int k; - int run_count; - std::vector search_params; - std::string search_result_file; - float refine_ratio{0.0f}; - }; - - struct DatasetConf { - std::string name; - std::string base_file; - // use only a subset of base_file, - // the range of rows is [subset_first_row, subset_first_row + subset_size) - // however, subset_size = 0 means using all rows after subset_first_row - // that is, the subset is [subset_first_row, #rows in base_file) - size_t subset_first_row{0}; - size_t subset_size{0}; - std::string query_file; - std::string distance; - - // data type of input dataset, possible values ["float", "int8", "uint8"] - std::string dtype; - }; - - Configuration(std::istream& conf_stream); - - DatasetConf get_dataset_conf() const { return dataset_conf_; } - std::vector get_indices(const std::string& patterns) const; - - private: - void parse_dataset_(const nlohmann::json& conf); - void parse_index_(const nlohmann::json& index_conf, const nlohmann::json& search_basic_conf); - std::unordered_set match_(const 
std::vector& candidates, - const std::string& patterns) const; - - DatasetConf dataset_conf_; - std::vector indices_; -}; - -} // namespace raft::bench::ann diff --git a/cpp/bench/ann/src/common/conf.hpp b/cpp/bench/ann/src/common/conf.hpp new file mode 100644 index 0000000000..405b00a74e --- /dev/null +++ b/cpp/bench/ann/src/common/conf.hpp @@ -0,0 +1,156 @@ +/* + * Copyright (c) 2023, NVIDIA CORPORATION. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +#pragma once + +#include "util.hpp" + +#include +#include +#include +#include +#include + +#define JSON_DIAGNOSTICS 1 +#include + +namespace raft::bench::ann { + +class Configuration { + public: + struct Index { + std::string name; + std::string algo; + nlohmann::json build_param; + std::string file; + std::vector dev_list; + + int batch_size; + int k; + std::vector search_params; + }; + + struct DatasetConf { + std::string name; + std::string base_file; + // use only a subset of base_file, + // the range of rows is [subset_first_row, subset_first_row + subset_size) + // however, subset_size = 0 means using all rows after subset_first_row + // that is, the subset is [subset_first_row, #rows in base_file) + size_t subset_first_row{0}; + size_t subset_size{0}; + std::string query_file; + std::string distance; + std::optional groundtruth_neighbors_file{std::nullopt}; + + // data type of input dataset, possible values ["float", "int8", "uint8"] + std::string dtype; + }; + + explicit inline 
Configuration(std::istream& conf_stream) + { + // to enable comments in json + auto conf = nlohmann::json::parse(conf_stream, nullptr, true, true); + + parse_dataset_(conf.at("dataset")); + parse_index_(conf.at("index"), conf.at("search_basic_param")); + } + + [[nodiscard]] inline auto get_dataset_conf() const -> DatasetConf { return dataset_conf_; } + [[nodiscard]] inline auto get_indices() const -> std::vector { return indices_; }; + + private: + inline void parse_dataset_(const nlohmann::json& conf) + { + dataset_conf_.name = conf.at("name"); + dataset_conf_.base_file = conf.at("base_file"); + dataset_conf_.query_file = conf.at("query_file"); + dataset_conf_.distance = conf.at("distance"); + + if (conf.contains("groundtruth_neighbors_file")) { + dataset_conf_.groundtruth_neighbors_file = conf.at("groundtruth_neighbors_file"); + } + if (conf.contains("subset_first_row")) { + dataset_conf_.subset_first_row = conf.at("subset_first_row"); + } + if (conf.contains("subset_size")) { dataset_conf_.subset_size = conf.at("subset_size"); } + + if (conf.contains("dtype")) { + dataset_conf_.dtype = conf.at("dtype"); + } else { + auto filename = dataset_conf_.base_file; + if (!filename.compare(filename.size() - 4, 4, "fbin")) { + dataset_conf_.dtype = "float"; + } else if (!filename.compare(filename.size() - 5, 5, "u8bin")) { + dataset_conf_.dtype = "uint8"; + } else if (!filename.compare(filename.size() - 5, 5, "i8bin")) { + dataset_conf_.dtype = "int8"; + } else { + log_error("Could not determine data type of the dataset %s", filename.c_str()); + } + } + } + inline void parse_index_(const nlohmann::json& index_conf, + const nlohmann::json& search_basic_conf) + { + const int batch_size = search_basic_conf.at("batch_size"); + const int k = search_basic_conf.at("k"); + + for (const auto& conf : index_conf) { + Index index; + index.name = conf.at("name"); + index.algo = conf.at("algo"); + index.build_param = conf.at("build_param"); + index.file = conf.at("file"); + 
index.batch_size = batch_size; + index.k = k; + + if (conf.contains("multigpu")) { + for (auto it : conf.at("multigpu")) { + index.dev_list.push_back(it); + } + if (index.dev_list.empty()) { throw std::runtime_error("dev_list shouldn't be empty!"); } + index.dev_list.shrink_to_fit(); + index.build_param["multigpu"] = conf["multigpu"]; + } + + for (auto param : conf.at("search_params")) { + /* ### Special parameters for backward compatibility ### + + - Local values of `k` and `n_queries` take priority. + - The legacy "batch_size" is renamed to `n_queries`. + - Basic search params are used otherwise. + */ + if (!param.contains("k")) { param["k"] = k; } + if (!param.contains("n_queries")) { + if (param.contains("batch_size")) { + param["n_queries"] = param["batch_size"]; + param.erase("batch_size"); + } else { + param["n_queries"] = batch_size; + } + } + index.search_params.push_back(param); + } + + indices_.push_back(index); + } + } + + DatasetConf dataset_conf_; + std::vector indices_; +}; + +} // namespace raft::bench::ann diff --git a/cpp/bench/ann/src/common/cuda_stub.hpp b/cpp/bench/ann/src/common/cuda_stub.hpp new file mode 100644 index 0000000000..e3f9aa9e84 --- /dev/null +++ b/cpp/bench/ann/src/common/cuda_stub.hpp @@ -0,0 +1,159 @@ +/* + * Copyright (c) 2023, NVIDIA CORPORATION. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +#pragma once + +#ifdef ANN_BENCH_LINK_CUDART +#include +#else +#define CPU_ONLY +typedef void* cudaStream_t; +typedef void* cudaEvent_t; +#endif + +#include + +namespace raft::bench::ann { + +struct cuda_lib_handle { + void* handle{nullptr}; + explicit cuda_lib_handle() + { +#ifdef ANN_BENCH_LINK_CUDART + handle = dlopen(ANN_BENCH_LINK_CUDART, RTLD_NOW | RTLD_GLOBAL | RTLD_DEEPBIND | RTLD_NODELETE); +#endif + } + ~cuda_lib_handle() noexcept + { + if (handle != nullptr) { dlclose(handle); } + } + + [[nodiscard]] inline auto found() const -> bool { return handle != nullptr; } +}; + +static inline cuda_lib_handle cudart{}; + +#ifndef CPU_ONLY +namespace stub { + +[[gnu::weak, gnu::noinline]] cudaError_t cudaMemcpy(void* dst, + const void* src, + size_t count, + enum cudaMemcpyKind kind) +{ + return cudaSuccess; +} + +[[gnu::weak, gnu::noinline]] cudaError_t cudaMalloc(void** ptr, size_t size) +{ + *ptr = nullptr; + return cudaSuccess; +} +[[gnu::weak, gnu::noinline]] cudaError_t cudaMemset(void* devPtr, int value, size_t count) +{ + return cudaSuccess; +} +[[gnu::weak, gnu::noinline]] cudaError_t cudaFree(void* devPtr) { return cudaSuccess; } +[[gnu::weak, gnu::noinline]] cudaError_t cudaStreamCreate(cudaStream_t* pStream) +{ + *pStream = 0; + return cudaSuccess; +} +[[gnu::weak, gnu::noinline]] cudaError_t cudaStreamCreateWithFlags(cudaStream_t* pStream, + unsigned int flags) +{ + *pStream = 0; + return cudaSuccess; +} +[[gnu::weak, gnu::noinline]] cudaError_t cudaStreamDestroy(cudaStream_t pStream) +{ + return cudaSuccess; +} +[[gnu::weak, gnu::noinline]] cudaError_t cudaStreamSynchronize(cudaStream_t pStream) +{ + return cudaSuccess; +} +[[gnu::weak, gnu::noinline]] cudaError_t cudaEventCreate(cudaEvent_t* event) +{ + *event = 0; + return cudaSuccess; +} +[[gnu::weak, gnu::noinline]] cudaError_t cudaEventRecord(cudaEvent_t event, cudaStream_t stream) +{ + return cudaSuccess; +} +[[gnu::weak, gnu::noinline]] cudaError_t cudaEventSynchronize(cudaEvent_t event) 
+{ + return cudaSuccess; +} +[[gnu::weak, gnu::noinline]] cudaError_t cudaEventElapsedTime(float* ms, + cudaEvent_t start, + cudaEvent_t end) +{ + *ms = 0; + return cudaSuccess; +} +[[gnu::weak, gnu::noinline]] cudaError_t cudaEventDestroy(cudaEvent_t event) { return cudaSuccess; } +[[gnu::weak, gnu::noinline]] cudaError_t cudaGetDevice(int* device) +{ + *device = 0; + return cudaSuccess; +}; +[[gnu::weak, gnu::noinline]] cudaError_t cudaDriverGetVersion(int* driver) +{ + *driver = 0; + return cudaSuccess; +}; +[[gnu::weak, gnu::noinline]] cudaError_t cudaRuntimeGetVersion(int* runtime) +{ + *runtime = 0; + return cudaSuccess; +}; +[[gnu::weak, gnu::noinline]] cudaError_t cudaGetDeviceProperties(struct cudaDeviceProp* prop, + int device) +{ + *prop = cudaDeviceProp{}; + return cudaSuccess; +} + +} // namespace stub + +#define RAFT_DECLARE_CUDART(fun) \ + static inline decltype(&stub::fun) fun = \ + cudart.found() ? reinterpret_cast(dlsym(cudart.handle, #fun)) \ + : &stub::fun + +RAFT_DECLARE_CUDART(cudaMemcpy); +RAFT_DECLARE_CUDART(cudaMalloc); +RAFT_DECLARE_CUDART(cudaMemset); +RAFT_DECLARE_CUDART(cudaFree); +RAFT_DECLARE_CUDART(cudaStreamCreate); +RAFT_DECLARE_CUDART(cudaStreamCreateWithFlags); +RAFT_DECLARE_CUDART(cudaStreamDestroy); +RAFT_DECLARE_CUDART(cudaStreamSynchronize); +RAFT_DECLARE_CUDART(cudaEventCreate); +RAFT_DECLARE_CUDART(cudaEventRecord); +RAFT_DECLARE_CUDART(cudaEventSynchronize); +RAFT_DECLARE_CUDART(cudaEventElapsedTime); +RAFT_DECLARE_CUDART(cudaEventDestroy); +RAFT_DECLARE_CUDART(cudaGetDevice); +RAFT_DECLARE_CUDART(cudaDriverGetVersion); +RAFT_DECLARE_CUDART(cudaRuntimeGetVersion); +RAFT_DECLARE_CUDART(cudaGetDeviceProperties); + +#undef RAFT_DECLARE_CUDART +#endif + +}; // namespace raft::bench::ann diff --git a/cpp/bench/ann/src/common/dataset.h b/cpp/bench/ann/src/common/dataset.hpp similarity index 85% rename from cpp/bench/ann/src/common/dataset.h rename to cpp/bench/ann/src/common/dataset.hpp index ae05cd02a1..7fa82a632f 100644 --- 
a/cpp/bench/ann/src/common/dataset.h +++ b/cpp/bench/ann/src/common/dataset.hpp @@ -15,11 +15,10 @@ */ #pragma once -#include +#include "util.hpp" #ifndef CPU_ONLY #include -#include #else typedef uint16_t half; #endif @@ -29,7 +28,9 @@ typedef uint16_t half; #include #include +#include #include +#include #include #include #include @@ -54,7 +55,8 @@ class BinFile { uint32_t subset_size = 0); ~BinFile() { - if (fp_) { fclose(fp_); } + if (mapped_ptr_ != nullptr) { unmap(); } + if (fp_ != nullptr) { fclose(fp_); } } BinFile(const BinFile&) = delete; BinFile& operator=(const BinFile&) = delete; @@ -101,6 +103,7 @@ class BinFile { int fid = fileno(fp_); mapped_ptr_ = mmap(nullptr, file_size_, PROT_READ, MAP_PRIVATE, fid, 0); if (mapped_ptr_ == MAP_FAILED) { + mapped_ptr_ = nullptr; throw std::runtime_error("mmap error: Value of errno " + std::to_string(errno) + ", " + std::string(strerror(errno))); } @@ -124,11 +127,11 @@ class BinFile { uint32_t subset_first_row_; uint32_t subset_size_; - mutable FILE* fp_; + mutable FILE* fp_{nullptr}; mutable uint32_t nrows_; mutable uint32_t ndims_; mutable size_t file_size_; - mutable void* mapped_ptr_; + mutable void* mapped_ptr_{nullptr}; }; template @@ -254,6 +257,7 @@ class Dataset { std::string name() const { return name_; } std::string distance() const { return distance_; } virtual int dim() const = 0; + virtual uint32_t max_k() const = 0; virtual size_t base_set_size() const = 0; virtual size_t query_set_size() const = 0; @@ -271,12 +275,37 @@ class Dataset { return query_set_; } + const int32_t* gt_set() const + { + if (!gt_set_) { load_gt_set_(); } + return gt_set_; + } + const T* base_set_on_gpu() const; const T* query_set_on_gpu() const; const T* mapped_base_set() const; + auto query_set(MemoryType memory_type) const -> const T* + { + switch (memory_type) { + case MemoryType::Device: return query_set_on_gpu(); + default: return query_set(); + } + } + + auto base_set(MemoryType memory_type) const -> const T* + { + switch 
(memory_type) { + case MemoryType::Device: return base_set_on_gpu(); + case MemoryType::Host: return base_set(); + case MemoryType::HostMmap: return mapped_base_set(); + default: return nullptr; + } + } + protected: virtual void load_base_set_() const = 0; + virtual void load_gt_set_() const = 0; virtual void load_query_set_() const = 0; virtual void map_base_set_() const = 0; @@ -288,6 +317,7 @@ class Dataset { mutable T* d_base_set_ = nullptr; mutable T* d_query_set_ = nullptr; mutable T* mapped_base_set_ = nullptr; + mutable int32_t* gt_set_ = nullptr; }; template @@ -295,6 +325,7 @@ Dataset::~Dataset() { delete[] base_set_; delete[] query_set_; + delete[] gt_set_; #ifndef CPU_ONLY if (d_base_set_) { cudaFree(d_base_set_); } if (d_query_set_) { cudaFree(d_query_set_); } @@ -307,9 +338,8 @@ const T* Dataset::base_set_on_gpu() const #ifndef CPU_ONLY if (!d_base_set_) { base_set(); - RAFT_CUDA_TRY(cudaMalloc((void**)&d_base_set_, base_set_size() * dim() * sizeof(T))); - RAFT_CUDA_TRY(cudaMemcpy( - d_base_set_, base_set_, base_set_size() * dim() * sizeof(T), cudaMemcpyHostToDevice)); + cudaMalloc((void**)&d_base_set_, base_set_size() * dim() * sizeof(T)); + cudaMemcpy(d_base_set_, base_set_, base_set_size() * dim() * sizeof(T), cudaMemcpyHostToDevice); } #endif return d_base_set_; @@ -321,9 +351,9 @@ const T* Dataset::query_set_on_gpu() const #ifndef CPU_ONLY if (!d_query_set_) { query_set(); - RAFT_CUDA_TRY(cudaMalloc((void**)&d_query_set_, query_set_size() * dim() * sizeof(T))); - RAFT_CUDA_TRY(cudaMemcpy( - d_query_set_, query_set_, query_set_size() * dim() * sizeof(T), cudaMemcpyHostToDevice)); + cudaMalloc((void**)&d_query_set_, query_set_size() * dim() * sizeof(T)); + cudaMemcpy( + d_query_set_, query_set_, query_set_size() * dim() * sizeof(T), cudaMemcpyHostToDevice); } #endif return d_query_set_; @@ -344,27 +374,28 @@ class BinDataset : public Dataset { size_t subset_first_row, size_t subset_size, const std::string& query_file, - const std::string& 
distance); - ~BinDataset() - { - if (this->mapped_base_set_) { base_file_.unmap(); } - } + const std::string& distance, + const std::optional& groundtruth_neighbors_file); int dim() const override; + uint32_t max_k() const override; size_t base_set_size() const override; size_t query_set_size() const override; private: void load_base_set_() const override; void load_query_set_() const override; + void load_gt_set_() const override; void map_base_set_() const override; mutable int dim_ = 0; + mutable uint32_t max_k_ = 0; mutable size_t base_set_size_ = 0; mutable size_t query_set_size_ = 0; BinFile base_file_; BinFile query_file_; + std::optional> gt_file_{std::nullopt}; }; template @@ -373,11 +404,15 @@ BinDataset::BinDataset(const std::string& name, size_t subset_first_row, size_t subset_size, const std::string& query_file, - const std::string& distance) + const std::string& distance, + const std::optional& groundtruth_neighbors_file) : Dataset(name, distance), base_file_(base_file, "r", subset_first_row, subset_size), query_file_(query_file, "r") { + if (groundtruth_neighbors_file.has_value()) { + gt_file_.emplace(groundtruth_neighbors_file.value(), "r"); + } } template @@ -389,6 +424,13 @@ int BinDataset::dim() const return dim_; } +template +uint32_t BinDataset::max_k() const +{ + if (!this->gt_set_) { load_gt_set_(); } + return max_k_; +} + template size_t BinDataset::query_set_size() const { @@ -437,6 +479,19 @@ void BinDataset::load_query_set_() const query_file_.read(this->query_set_); } +template +void BinDataset::load_gt_set_() const +{ + if (gt_file_.has_value()) { + size_t queries; + int k; + gt_file_->get_shape(&queries, &k); + this->gt_set_ = new std::int32_t[queries * k]; + gt_file_->read(this->gt_set_); + max_k_ = k; + } +} + template void BinDataset::map_base_set_() const { diff --git a/cpp/bench/ann/src/common/util.cpp b/cpp/bench/ann/src/common/util.cpp deleted file mode 100644 index 17636f76d7..0000000000 --- a/cpp/bench/ann/src/common/util.cpp 
+++ /dev/null @@ -1,68 +0,0 @@ -/* - * Copyright (c) 2023, NVIDIA CORPORATION. - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -#include "util.h" - -#include -#include - -#include -#include - -namespace raft::bench::ann { - -std::vector split(const std::string& s, char delimiter) -{ - std::vector tokens; - std::string token; - std::istringstream iss(s); - while (getline(iss, token, delimiter)) { - if (!token.empty()) { tokens.push_back(token); } - } - return tokens; -} - -bool file_exists(const std::string& filename) -{ - struct stat statbuf; - if (stat(filename.c_str(), &statbuf) != 0) { return false; } - return S_ISREG(statbuf.st_mode); -} - -bool dir_exists(const std::string& dir) -{ - struct stat statbuf; - if (stat(dir.c_str(), &statbuf) != 0) { return false; } - return S_ISDIR(statbuf.st_mode); -} - -bool create_dir(const std::string& dir) -{ - const auto path = split(dir, '/'); - - std::string cwd; - if (!dir.empty() && dir[0] == '/') { cwd += '/'; } - - for (const auto& p : path) { - cwd += p + "/"; - if (!dir_exists(cwd)) { - int ret = mkdir(cwd.c_str(), S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH); - if (ret != 0) { return false; } - } - } - return true; -} - -} // namespace raft::bench::ann diff --git a/cpp/bench/ann/src/common/util.h b/cpp/bench/ann/src/common/util.h deleted file mode 100644 index 290bf4cea9..0000000000 --- a/cpp/bench/ann/src/common/util.h +++ /dev/null @@ -1,79 +0,0 @@ -/* - * Copyright (c) 2023, NVIDIA CORPORATION. 
- * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -#pragma once - -#include -#include -#include -#include -#include -#include - -namespace raft::bench::ann { - -class Timer { - public: - Timer() { reset(); } - void reset() { start_time_ = std::chrono::steady_clock::now(); } - float elapsed_ms() - { - auto end_time = std::chrono::steady_clock::now(); - auto dur = - std::chrono::duration_cast>(end_time - start_time_); - return dur.count(); - } - - private: - std::chrono::steady_clock::time_point start_time_; -}; - -std::vector split(const std::string& s, char delimiter); - -bool file_exists(const std::string& filename); -bool dir_exists(const std::string& dir); -bool create_dir(const std::string& dir); - -template -void log_(const char* level, Ts... vs) -{ - char buf[20]; - std::time_t now = std::time(nullptr); - std::strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", std::localtime(&now)); - printf("%s [%s] ", buf, level); - printf(vs...); - printf("\n"); - fflush(stdout); -} - -template -void log_info(Ts... vs) -{ - log_("info", vs...); -} - -template -void log_warn(Ts... vs) -{ - log_("warn", vs...); -} - -template -void log_error(Ts... vs) -{ - log_("error", vs...); -} - -} // namespace raft::bench::ann diff --git a/cpp/bench/ann/src/common/util.hpp b/cpp/bench/ann/src/common/util.hpp new file mode 100644 index 0000000000..faf440071d --- /dev/null +++ b/cpp/bench/ann/src/common/util.hpp @@ -0,0 +1,347 @@ +/* + * Copyright (c) 2023, NVIDIA CORPORATION. 
+ * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +#pragma once + +#include "ann_types.hpp" + +#ifdef ANN_BENCH_NVTX3_HEADERS_FOUND +#include +#endif + +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +namespace raft::bench::ann { + +template +struct buf { + MemoryType memory_type; + std::size_t size; + T* data; + buf(MemoryType memory_type, std::size_t size) + : memory_type(memory_type), size(size), data(nullptr) + { + switch (memory_type) { +#ifndef CPU_ONLY + case MemoryType::Device: { + cudaMalloc(reinterpret_cast(&data), size * sizeof(T)); + cudaMemset(data, 0, size * sizeof(T)); + } break; +#endif + default: { + data = reinterpret_cast(malloc(size * sizeof(T))); + std::memset(data, 0, size * sizeof(T)); + } + } + } + ~buf() noexcept + { + if (data == nullptr) { return; } + switch (memory_type) { +#ifndef CPU_ONLY + case MemoryType::Device: { + cudaFree(data); + } break; +#endif + default: { + free(data); + } + } + } + + [[nodiscard]] auto move(MemoryType target_memory_type) -> buf + { + buf r{target_memory_type, size}; +#ifndef CPU_ONLY + if ((memory_type == MemoryType::Device && target_memory_type != MemoryType::Device) || + (memory_type != MemoryType::Device && target_memory_type == MemoryType::Device)) { + cudaMemcpy(r.data, data, size * sizeof(T), cudaMemcpyDefault); + return r; + } +#endif + std::swap(data, r.data); + return r; + } +}; + +struct cuda_timer { + private: + 
cudaStream_t stream_{nullptr}; + cudaEvent_t start_{nullptr}; + cudaEvent_t stop_{nullptr}; + double total_time_{0}; + + public: + struct cuda_lap { + private: + cudaStream_t stream_; + cudaEvent_t start_; + cudaEvent_t stop_; + double& total_time_; + + public: + cuda_lap(cudaStream_t stream, cudaEvent_t start, cudaEvent_t stop, double& total_time) + : start_(start), stop_(stop), stream_(stream), total_time_(total_time) + { +#ifndef CPU_ONLY + cudaStreamSynchronize(stream_); + cudaEventRecord(start_, stream_); +#endif + } + cuda_lap() = delete; + + ~cuda_lap() noexcept + { +#ifndef CPU_ONLY + cudaEventRecord(stop_, stream_); + cudaEventSynchronize(stop_); + float milliseconds = 0.0f; + cudaEventElapsedTime(&milliseconds, start_, stop_); + total_time_ += milliseconds / 1000.0; +#endif + } + }; + + cuda_timer() + { +#ifndef CPU_ONLY + cudaStreamCreateWithFlags(&stream_, cudaStreamNonBlocking); + cudaEventCreate(&stop_); + cudaEventCreate(&start_); +#endif + } + + ~cuda_timer() noexcept + { +#ifndef CPU_ONLY + cudaEventDestroy(start_); + cudaEventDestroy(stop_); + cudaStreamDestroy(stream_); +#endif + } + + [[nodiscard]] auto stream() const -> cudaStream_t { return stream_; } + + [[nodiscard]] auto total_time() const -> double { return total_time_; } + + [[nodiscard]] auto lap() -> cuda_timer::cuda_lap + { + return cuda_lap{stream_, start_, stop_, total_time_}; + } +}; + +inline auto cuda_info() +{ + std::vector> props; +#ifndef CPU_ONLY + int dev, driver = 0, runtime = 0; + cudaDriverGetVersion(&driver); + cudaRuntimeGetVersion(&runtime); + + cudaDeviceProp device_prop; + cudaGetDevice(&dev); + cudaGetDeviceProperties(&device_prop, dev); + props.emplace_back("gpu_name", std::string(device_prop.name)); + props.emplace_back("gpu_sm_count", std::to_string(device_prop.multiProcessorCount)); + props.emplace_back("gpu_sm_freq", std::to_string(device_prop.clockRate * 1e3)); + props.emplace_back("gpu_mem_freq", std::to_string(device_prop.memoryClockRate * 1e3)); + 
props.emplace_back("gpu_mem_bus_width", std::to_string(device_prop.memoryBusWidth)); + props.emplace_back("gpu_mem_global_size", std::to_string(device_prop.totalGlobalMem)); + props.emplace_back("gpu_mem_shared_size", std::to_string(device_prop.sharedMemPerMultiprocessor)); + props.emplace_back("gpu_driver_version", + std::to_string(driver / 1000) + "." + std::to_string((driver % 100) / 10)); + props.emplace_back("gpu_runtime_version", + std::to_string(runtime / 1000) + "." + std::to_string((runtime % 100) / 10)); +#endif + return props; +} + +struct nvtx_case { +#ifdef ANN_BENCH_NVTX3_HEADERS_FOUND + private: + std::string case_name_; + std::array iter_name_{0}; + nvtxDomainHandle_t domain_; + int64_t iteration_ = 0; + nvtxEventAttributes_t case_attrib_{0}; + nvtxEventAttributes_t iter_attrib_{0}; +#endif + + public: + struct nvtx_lap { +#ifdef ANN_BENCH_NVTX3_HEADERS_FOUND + private: + nvtxDomainHandle_t domain_; + + public: + nvtx_lap(nvtxDomainHandle_t domain, nvtxEventAttributes_t* attr) : domain_(domain) + { + nvtxDomainRangePushEx(domain_, attr); + } + nvtx_lap() = delete; + ~nvtx_lap() noexcept { nvtxDomainRangePop(domain_); } +#endif + }; + +#ifdef ANN_BENCH_NVTX3_HEADERS_FOUND + explicit nvtx_case(std::string case_name) + : case_name_(std::move(case_name)), domain_(nvtxDomainCreateA("ANN benchmark")) + { + case_attrib_.version = NVTX_VERSION; + iter_attrib_.version = NVTX_VERSION; + case_attrib_.size = NVTX_EVENT_ATTRIB_STRUCT_SIZE; + iter_attrib_.size = NVTX_EVENT_ATTRIB_STRUCT_SIZE; + case_attrib_.colorType = NVTX_COLOR_ARGB; + iter_attrib_.colorType = NVTX_COLOR_ARGB; + case_attrib_.messageType = NVTX_MESSAGE_TYPE_ASCII; + iter_attrib_.messageType = NVTX_MESSAGE_TYPE_ASCII; + case_attrib_.message.ascii = case_name_.c_str(); + auto c = std::hash{}(case_name_); + case_attrib_.color = c | 0xA0A0A0; + nvtxDomainRangePushEx(domain_, &case_attrib_); + } + + ~nvtx_case() + { + nvtxDomainRangePop(domain_); + nvtxDomainDestroy(domain_); + } +#else + explicit 
nvtx_case(std::string) {} +#endif + + [[nodiscard]] auto lap() -> nvtx_case::nvtx_lap + { +#ifdef ANN_BENCH_NVTX3_HEADERS_FOUND + auto i = iteration_++; + uint32_t c = (i % 5); + uint32_t r = 150 + c * 20; + uint32_t g = 200 + c * 10; + uint32_t b = 220 + c * 5; + std::snprintf(iter_name_.data(), iter_name_.size(), "Lap %zd", i); + iter_attrib_.message.ascii = iter_name_.data(); + iter_attrib_.color = (r << 16) + (g << 8) + b; + return nvtx_lap{domain_, &iter_attrib_}; +#else + return nvtx_lap{}; +#endif + } +}; + +inline std::vector split(const std::string& s, char delimiter) +{ + std::vector tokens; + std::string token; + std::istringstream iss(s); + while (getline(iss, token, delimiter)) { + if (!token.empty()) { tokens.push_back(token); } + } + return tokens; +} + +inline bool file_exists(const std::string& filename) +{ + struct stat statbuf; + if (stat(filename.c_str(), &statbuf) != 0) { return false; } + return S_ISREG(statbuf.st_mode); +} + +inline bool dir_exists(const std::string& dir) +{ + struct stat statbuf; + if (stat(dir.c_str(), &statbuf) != 0) { return false; } + return S_ISDIR(statbuf.st_mode); +} + +inline bool create_dir(const std::string& dir) +{ + const auto path = split(dir, '/'); + + std::string cwd; + if (!dir.empty() && dir[0] == '/') { cwd += '/'; } + + for (const auto& p : path) { + cwd += p + "/"; + if (!dir_exists(cwd)) { + int ret = mkdir(cwd.c_str(), S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH); + if (ret != 0) { return false; } + } + } + return true; +} + +inline void make_sure_parent_dir_exists(const std::string& file_path) +{ + const auto pos = file_path.rfind('/'); + if (pos != std::string::npos) { + auto dir = file_path.substr(0, pos); + if (!dir_exists(dir)) { create_dir(dir); } + } +} + +inline auto combine_path(const std::string& dir, const std::string& path) +{ + std::filesystem::path p_dir(dir); + std::filesystem::path p_suf(path); + return (p_dir / p_suf).string(); +} + +template +void log_(const char* level, const 
Ts&... vs) +{ + char buf[20]; + std::time_t now = std::time(nullptr); + std::strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", std::localtime(&now)); + printf("%s [%s] ", buf, level); + if constexpr (sizeof...(Ts) == 1) { + printf("%s", vs...); + } else { + printf(vs...); + } + printf("\n"); + fflush(stdout); +} + +template +void log_info(Ts&&... vs) +{ + log_("info", std::forward(vs)...); +} + +template +void log_warn(Ts&&... vs) +{ + log_("warn", std::forward(vs)...); +} + +template +void log_error(Ts&&... vs) +{ + log_("error", std::forward(vs)...); +} + +} // namespace raft::bench::ann diff --git a/cpp/bench/ann/src/faiss/faiss_benchmark.cu b/cpp/bench/ann/src/faiss/faiss_benchmark.cu index a01702fb1f..619565d107 100644 --- a/cpp/bench/ann/src/faiss/faiss_benchmark.cu +++ b/cpp/bench/ann/src/faiss/faiss_benchmark.cu @@ -97,10 +97,8 @@ template std::unique_ptr> create_algo(const std::string& algo, const std::string& distance, int dim, - float refine_ratio, const nlohmann::json& conf, - const std::vector& dev_list, - const nlohmann::json& index_conf) + const std::vector& dev_list) { // stop compiler warning; not all algorithms support multi-GPU so it may not be used (void)dev_list; @@ -124,7 +122,6 @@ std::unique_ptr> create_algo(const std::string& algo, if (!ann) { throw std::runtime_error("invalid algo: '" + algo + "'"); } - if (refine_ratio > 1.0) {} return ann; } @@ -146,6 +143,11 @@ std::unique_ptr::AnnSearchParam> create_search } // namespace raft::bench::ann -#include "../common/benchmark.hpp" +REGISTER_ALGO_INSTANCE(float); +REGISTER_ALGO_INSTANCE(std::int8_t); +REGISTER_ALGO_INSTANCE(std::uint8_t); +#ifdef ANN_BENCH_BUILD_MAIN +#include "../common/benchmark.hpp" int main(int argc, char** argv) { return raft::bench::ann::run_main(argc, argv); } +#endif diff --git a/cpp/bench/ann/src/faiss/faiss_wrapper.h b/cpp/bench/ann/src/faiss/faiss_wrapper.h index 8cfc26ea5b..7a3f91853f 100644 --- a/cpp/bench/ann/src/faiss/faiss_wrapper.h +++ 
b/cpp/bench/ann/src/faiss/faiss_wrapper.h @@ -16,6 +16,10 @@ #ifndef FAISS_WRAPPER_H_ #define FAISS_WRAPPER_H_ +#include "../common/ann_types.hpp" + +#include + #include #include #include @@ -35,10 +39,6 @@ #include #include -#include "../common/ann_types.hpp" -#include "../common/benchmark_util.hpp" -#include - namespace { faiss::MetricType parse_metric_type(raft::bench::ann::Metric metric) @@ -84,6 +84,7 @@ class FaissGpu : public ANN { }; FaissGpu(Metric metric, int dim, int nlist); + virtual ~FaissGpu() noexcept { RAFT_CUDA_TRY_NO_THROW(cudaEventDestroy(sync_)); } void build(const T* dataset, size_t nrow, cudaStream_t stream = 0) final; @@ -98,13 +99,12 @@ class FaissGpu : public ANN { float* distances, cudaStream_t stream = 0) const final; - AlgoProperty get_property() const override + AlgoProperty get_preference() const override { AlgoProperty property; // to enable building big dataset which is larger than GPU memory - property.dataset_memory_type = MemoryType::Host; - property.query_memory_type = MemoryType::Device; - property.need_dataset_when_search = false; + property.dataset_memory_type = MemoryType::Host; + property.query_memory_type = MemoryType::Device; return property; } @@ -115,11 +115,19 @@ class FaissGpu : public ANN { template void load_(const std::string& file); + void stream_wait(cudaStream_t stream) const + { + RAFT_CUDA_TRY(cudaEventRecord(sync_, faiss_default_stream_)); + RAFT_CUDA_TRY(cudaStreamWaitEvent(stream, sync_)); + } + mutable faiss::gpu::StandardGpuResources gpu_resource_; std::unique_ptr index_; faiss::MetricType metric_type_; int nlist_; int device_; + cudaEvent_t sync_{nullptr}; + cudaStream_t faiss_default_stream_{nullptr}; }; template @@ -128,6 +136,8 @@ FaissGpu::FaissGpu(Metric metric, int dim, int nlist) { static_assert(std::is_same_v, "faiss support only float type"); RAFT_CUDA_TRY(cudaGetDevice(&device_)); + RAFT_CUDA_TRY(cudaEventCreate(&sync_, cudaEventDisableTiming)); + faiss_default_stream_ = 
gpu_resource_.getDefaultStream(device_); } template @@ -135,10 +145,10 @@ void FaissGpu::build(const T* dataset, size_t nrow, cudaStream_t stream) { OmpSingleThreadScope omp_single_thread; - gpu_resource_.setDefaultStream(device_, stream); index_->train(nrow, dataset); // faiss::gpu::GpuIndexFlat::train() will do nothing assert(index_->is_trained); index_->add(nrow, dataset); + stream_wait(stream); } template @@ -159,9 +169,9 @@ void FaissGpu::search(const T* queries, { static_assert(sizeof(size_t) == sizeof(faiss::Index::idx_t), "sizes of size_t and faiss::Index::idx_t are different"); - gpu_resource_.setDefaultStream(device_, stream); index_->search( batch_size, queries, k, distances, reinterpret_cast(neighbors)); + stream_wait(stream); } template diff --git a/cpp/bench/ann/src/ggnn/ggnn_benchmark.cu b/cpp/bench/ann/src/ggnn/ggnn_benchmark.cu index 63ceb5d771..99481c2921 100644 --- a/cpp/bench/ann/src/ggnn/ggnn_benchmark.cu +++ b/cpp/bench/ann/src/ggnn/ggnn_benchmark.cu @@ -84,10 +84,8 @@ template std::unique_ptr> create_algo(const std::string& algo, const std::string& distance, int dim, - float refine_ratio, const nlohmann::json& conf, - const std::vector& dev_list, - const nlohmann::json& index_conf) + const std::vector& dev_list) { // stop compiler warning; not all algorithms support multi-GPU so it may not be used (void)dev_list; @@ -102,7 +100,6 @@ std::unique_ptr> create_algo(const std::string& algo, if (algo == "ggnn") { ann = make_algo(metric, dim, conf); } if (!ann) { throw std::runtime_error("invalid algo: '" + algo + "'"); } - if (refine_ratio > 1.0) {} return ann; } @@ -121,6 +118,11 @@ std::unique_ptr::AnnSearchParam> create_search } // namespace raft::bench::ann -#include "../common/benchmark.hpp" +REGISTER_ALGO_INSTANCE(float); +REGISTER_ALGO_INSTANCE(std::int8_t); +REGISTER_ALGO_INSTANCE(std::uint8_t); -int main(int argc, char** argv) { return raft::bench::ann::run_main(argc, argv); } \ No newline at end of file +#ifdef ANN_BENCH_BUILD_MAIN 
+#include "../common/benchmark.hpp" +int main(int argc, char** argv) { return raft::bench::ann::run_main(argc, argv); } +#endif diff --git a/cpp/bench/ann/src/ggnn/ggnn_wrapper.cuh b/cpp/bench/ann/src/ggnn/ggnn_wrapper.cuh index fd8fe0f2ec..74c7cddc3c 100644 --- a/cpp/bench/ann/src/ggnn/ggnn_wrapper.cuh +++ b/cpp/bench/ann/src/ggnn/ggnn_wrapper.cuh @@ -16,14 +16,14 @@ #pragma once -#include -#include - #include "../common/ann_types.hpp" -#include "../common/benchmark_util.hpp" + #include #include +#include +#include + namespace raft::bench::ann { template @@ -50,6 +50,7 @@ class Ggnn : public ANN { int max_iterations{400}; int cache_size{512}; int sorted_size{256}; + auto needs_dataset() const -> bool override { return true; } }; Ggnn(Metric metric, int dim, const BuildParam& param); @@ -74,7 +75,7 @@ class Ggnn : public ANN { void save(const std::string& file) const override { impl_->save(file); } void load(const std::string& file) override { impl_->load(file); } - AlgoProperty get_property() const override { return impl_->get_property(); } + AlgoProperty get_preference() const override { return impl_->get_preference(); } void set_search_dataset(const T* dataset, size_t nrow) override { @@ -135,12 +136,11 @@ class GgnnImpl : public ANN { void save(const std::string& file) const override; void load(const std::string& file) override; - AlgoProperty get_property() const override + AlgoProperty get_preference() const override { AlgoProperty property; - property.dataset_memory_type = MemoryType::Device; - property.query_memory_type = MemoryType::Device; - property.need_dataset_when_search = true; + property.dataset_memory_type = MemoryType::Device; + property.query_memory_type = MemoryType::Device; return property; } diff --git a/cpp/bench/ann/src/hnswlib/hnswlib_benchmark.cpp b/cpp/bench/ann/src/hnswlib/hnswlib_benchmark.cpp index 885d0e3ba4..be5b72c5f6 100644 --- a/cpp/bench/ann/src/hnswlib/hnswlib_benchmark.cpp +++ b/cpp/bench/ann/src/hnswlib/hnswlib_benchmark.cpp 
@@ -14,6 +14,8 @@ * limitations under the License. */ +#include "../common/ann_types.hpp" + #include #include #include @@ -22,9 +24,6 @@ #include #include -#include "../common/benchmark_util.hpp" - -#include "../common/ann_types.hpp" #undef WARP_SIZE #include "hnswlib_wrapper.h" #define JSON_DIAGNOSTICS 1 @@ -76,10 +75,8 @@ template std::unique_ptr> create_algo(const std::string& algo, const std::string& distance, int dim, - float refine_ratio, const nlohmann::json& conf, - const std::vector& dev_list, - const nlohmann::json& index_conf) + const std::vector& dev_list) { // stop compiler warning; not all algorithms support multi-GPU so it may not be used (void)dev_list; @@ -96,8 +93,6 @@ std::unique_ptr> create_algo(const std::string& algo, } if (!ann) { throw std::runtime_error("invalid algo: '" + algo + "'"); } - - if (refine_ratio > 1.0) {} return ann; } @@ -116,6 +111,12 @@ std::unique_ptr::AnnSearchParam> create_search }; // namespace raft::bench::ann -#include "../common/benchmark.hpp" +REGISTER_ALGO_INSTANCE(float); +REGISTER_ALGO_INSTANCE(std::int8_t); +REGISTER_ALGO_INSTANCE(std::uint8_t); -int main(int argc, char** argv) { return raft::bench::ann::run_main(argc, argv); } \ No newline at end of file +#ifdef ANN_BENCH_BUILD_MAIN +#define CPU_ONLY +#include "../common/benchmark.hpp" +int main(int argc, char** argv) { return raft::bench::ann::run_main(argc, argv); } +#endif diff --git a/cpp/bench/ann/src/hnswlib/hnswlib_wrapper.h b/cpp/bench/ann/src/hnswlib/hnswlib_wrapper.h index c5c3a4a2a6..5cd33ef94d 100644 --- a/cpp/bench/ann/src/hnswlib/hnswlib_wrapper.h +++ b/cpp/bench/ann/src/hnswlib/hnswlib_wrapper.h @@ -188,12 +188,11 @@ class HnswLib : public ANN { void save(const std::string& path_to_index) const override; void load(const std::string& path_to_index) override; - AlgoProperty get_property() const override + AlgoProperty get_preference() const override { AlgoProperty property; - property.dataset_memory_type = MemoryType::Host; - 
property.query_memory_type = MemoryType::Host; - property.need_dataset_when_search = false; + property.dataset_memory_type = MemoryType::Host; + property.query_memory_type = MemoryType::Host; return property; } diff --git a/cpp/bench/ann/src/raft/raft_benchmark.cu b/cpp/bench/ann/src/raft/raft_benchmark.cu index 24f0df4b47..823fa3f2f3 100644 --- a/cpp/bench/ann/src/raft/raft_benchmark.cu +++ b/cpp/bench/ann/src/raft/raft_benchmark.cu @@ -14,6 +14,8 @@ * limitations under the License. */ +#include "../common/ann_types.hpp" + #include #include #include @@ -22,8 +24,6 @@ #include #include -#include "../common/ann_types.hpp" -#include "../common/benchmark_util.hpp" #undef WARP_SIZE #ifdef RAFT_ANN_BENCH_USE_RAFT_BFKNN #include "raft_wrapper.h" @@ -120,6 +120,10 @@ void parse_search_param(const nlohmann::json& conf, // set half as default param.pq_param.lut_dtype = CUDA_R_16F; } + if (conf.contains("refine_ratio")) { + param.refine_ratio = conf.at("refine_ratio"); + if (param.refine_ratio < 1.0f) { throw std::runtime_error("refine_ratio should be >= 1.0"); } + } } #endif @@ -165,31 +169,18 @@ template std::unique_ptr> create_algo(const std::string& algo, const std::string& distance, int dim, - float refine_ratio, const nlohmann::json& conf, - const std::vector& dev_list, - const nlohmann::json& index_conf) + const std::vector& dev_list) { // stop compiler warning; not all algorithms support multi-GPU so it may not be used (void)dev_list; raft::bench::ann::Metric metric = parse_metric(distance); - std::string memtype = conf.at("dataset_memtype"); - - MemoryType dataset_memorytype = MemoryType::Device; - if (memtype == "host") { - dataset_memorytype = MemoryType::Host; - } else if (memtype == "mmap") { - dataset_memorytype = MemoryType::HostMmap; - } - std::unique_ptr> ann; if constexpr (std::is_same_v) { #ifdef RAFT_ANN_BENCH_USE_RAFT_BFKNN - if (algo == "raft_bfknn") { - ann = std::make_unique>(metric, dim, dataset_memorytype); - } + if (algo == "raft_bfknn") { ann = 
std::make_unique>(metric, dim); } #endif } @@ -199,29 +190,25 @@ std::unique_ptr> create_algo(const std::string& algo, if (algo == "raft_ivf_flat") { typename raft::bench::ann::RaftIvfFlatGpu::BuildParam param; parse_build_param(conf, param); - ann = std::make_unique>( - metric, dim, param, dataset_memorytype); + ann = std::make_unique>(metric, dim, param); } #endif #ifdef RAFT_ANN_BENCH_USE_RAFT_IVF_PQ if (algo == "raft_ivf_pq") { typename raft::bench::ann::RaftIvfPQ::BuildParam param; parse_build_param(conf, param); - ann = std::make_unique>( - metric, dim, param, refine_ratio, dataset_memorytype); + ann = std::make_unique>(metric, dim, param); } #endif #ifdef RAFT_ANN_BENCH_USE_RAFT_CAGRA if (algo == "raft_cagra") { typename raft::bench::ann::RaftCagra::BuildParam param; parse_build_param(conf, param); - ann = std::make_unique>( - metric, dim, param, dataset_memorytype); + ann = std::make_unique>(metric, dim, param); } #endif if (!ann) { throw std::runtime_error("invalid algo: '" + algo + "'"); } - if (refine_ratio > 1.0) {} return ann; } @@ -263,6 +250,11 @@ std::unique_ptr::AnnSearchParam> create_search }; // namespace raft::bench::ann -#include "../common/benchmark.hpp" +REGISTER_ALGO_INSTANCE(float); +REGISTER_ALGO_INSTANCE(std::int8_t); +REGISTER_ALGO_INSTANCE(std::uint8_t); +#ifdef ANN_BENCH_BUILD_MAIN +#include "../common/benchmark.hpp" int main(int argc, char** argv) { return raft::bench::ann::run_main(argc, argv); } +#endif diff --git a/cpp/bench/ann/src/raft/raft_cagra_wrapper.h b/cpp/bench/ann/src/raft/raft_cagra_wrapper.h index c8bfe9b401..02aa2ea28b 100644 --- a/cpp/bench/ann/src/raft/raft_cagra_wrapper.h +++ b/cpp/bench/ann/src/raft/raft_cagra_wrapper.h @@ -29,6 +29,7 @@ #include #include #include +#include #include #include #include @@ -48,11 +49,23 @@ class RaftCagra : public ANN { struct SearchParam : public AnnSearchParam { raft::neighbors::experimental::cagra::search_params p; + auto needs_dataset() const -> bool override { return true; } }; 
using BuildParam = raft::neighbors::cagra::index_params; - RaftCagra(Metric metric, int dim, const BuildParam& param, MemoryType dataset_memtype); + RaftCagra(Metric metric, int dim, const BuildParam& param) + : ANN(metric, dim), + index_params_(param), + dimension_(dim), + mr_(rmm::mr::get_current_device_resource(), 1024 * 1024 * 1024ull) + { + rmm::mr::set_current_device_resource(&mr_); + index_params_.metric = parse_metric_type(metric); + RAFT_CUDA_TRY(cudaGetDevice(&device_)); + } + + ~RaftCagra() noexcept { rmm::mr::set_current_device_resource(mr_.get_upstream()); } void build(const T* dataset, size_t nrow, cudaStream_t stream) final; @@ -70,57 +83,43 @@ class RaftCagra : public ANN { cudaStream_t stream = 0) const override; // to enable dataset access from GPU memory - AlgoProperty get_property() const override + AlgoProperty get_preference() const override { AlgoProperty property; - property.dataset_memory_type = dataset_memtype_; - property.query_memory_type = MemoryType::Device; - property.need_dataset_when_search = true; + property.dataset_memory_type = MemoryType::HostMmap; + property.query_memory_type = MemoryType::Device; return property; } void save(const std::string& file) const override; void load(const std::string&) override; - ~RaftCagra() noexcept { rmm::mr::set_current_device_resource(mr_.get_upstream()); } - private: + // `mr_` must go first to make sure it dies last + rmm::mr::pool_memory_resource mr_; raft::device_resources handle_; BuildParam index_params_; raft::neighbors::cagra::search_params search_params_; std::optional> index_; int device_; int dimension_; - MemoryType dataset_memtype_; - rmm::mr::pool_memory_resource mr_; }; -template -RaftCagra::RaftCagra(Metric metric, - int dim, - const BuildParam& param, - MemoryType dataset_memtype) - : ANN(metric, dim), - index_params_(param), - dimension_(dim), - dataset_memtype_(dataset_memtype), - mr_(rmm::mr::get_current_device_resource(), 1024 * 1024 * 1024ull) -{ - 
rmm::mr::set_current_device_resource(&mr_); - index_params_.metric = parse_metric_type(metric); - RAFT_CUDA_TRY(cudaGetDevice(&device_)); -} - template void RaftCagra::build(const T* dataset, size_t nrow, cudaStream_t) { - if (get_property().dataset_memory_type != MemoryType::Device) { - auto dataset_view = - raft::make_host_matrix_view(dataset, IdxT(nrow), dimension_); - index_.emplace(raft::neighbors::cagra::build(handle_, index_params_, dataset_view)); - } else { - auto dataset_view = - raft::make_device_matrix_view(dataset, IdxT(nrow), dimension_); - index_.emplace(raft::neighbors::cagra::build(handle_, index_params_, dataset_view)); + switch (raft::spatial::knn::detail::utils::check_pointer_residency(dataset)) { + case raft::spatial::knn::detail::utils::pointer_residency::host_only: { + auto dataset_view = + raft::make_host_matrix_view(dataset, IdxT(nrow), dimension_); + index_.emplace(raft::neighbors::cagra::build(handle_, index_params_, dataset_view)); + return; + } + default: { + auto dataset_view = + raft::make_device_matrix_view(dataset, IdxT(nrow), dimension_); + index_.emplace(raft::neighbors::cagra::build(handle_, index_params_, dataset_view)); + return; + } } } diff --git a/cpp/bench/ann/src/raft/raft_ivf_flat_wrapper.h b/cpp/bench/ann/src/raft/raft_ivf_flat_wrapper.h index 69057c7a4f..da457e32f1 100644 --- a/cpp/bench/ann/src/raft/raft_ivf_flat_wrapper.h +++ b/cpp/bench/ann/src/raft/raft_ivf_flat_wrapper.h @@ -52,7 +52,19 @@ class RaftIvfFlatGpu : public ANN { using BuildParam = raft::neighbors::ivf_flat::index_params; - RaftIvfFlatGpu(Metric metric, int dim, const BuildParam& param, MemoryType dataset_memtype); + RaftIvfFlatGpu(Metric metric, int dim, const BuildParam& param) + : ANN(metric, dim), + index_params_(param), + dimension_(dim), + mr_(rmm::mr::get_current_device_resource(), 1024 * 1024 * 1024ull) + { + index_params_.metric = parse_metric_type(metric); + index_params_.conservative_memory_allocation = true; + 
rmm::mr::set_current_device_resource(&mr_); + RAFT_CUDA_TRY(cudaGetDevice(&device_)); + } + + ~RaftIvfFlatGpu() noexcept { rmm::mr::set_current_device_resource(mr_.get_upstream()); } void build(const T* dataset, size_t nrow, cudaStream_t stream) final; @@ -68,47 +80,27 @@ class RaftIvfFlatGpu : public ANN { cudaStream_t stream = 0) const override; // to enable dataset access from GPU memory - AlgoProperty get_property() const override + AlgoProperty get_preference() const override { AlgoProperty property; - property.dataset_memory_type = dataset_memtype_; - property.query_memory_type = MemoryType::Device; - property.need_dataset_when_search = false; + property.dataset_memory_type = MemoryType::Device; + property.query_memory_type = MemoryType::Device; return property; } void save(const std::string& file) const override; void load(const std::string&) override; - ~RaftIvfFlatGpu() noexcept { rmm::mr::set_current_device_resource(mr_.get_upstream()); } - private: + // `mr_` must go first to make sure it dies last + rmm::mr::pool_memory_resource mr_; raft::device_resources handle_; BuildParam index_params_; raft::neighbors::ivf_flat::search_params search_params_; std::optional> index_; int device_; int dimension_; - MemoryType dataset_memtype_; - rmm::mr::pool_memory_resource mr_; }; -template -RaftIvfFlatGpu::RaftIvfFlatGpu(Metric metric, - int dim, - const BuildParam& param, - MemoryType dataset_memtype) - : ANN(metric, dim), - index_params_(param), - dimension_(dim), - dataset_memtype_(dataset_memtype), - mr_(rmm::mr::get_current_device_resource(), 1024 * 1024 * 1024ull) -{ - index_params_.metric = parse_metric_type(metric); - index_params_.conservative_memory_allocation = true; - rmm::mr::set_current_device_resource(&mr_); - RAFT_CUDA_TRY(cudaGetDevice(&device_)); -} - template void RaftIvfFlatGpu::build(const T* dataset, size_t nrow, cudaStream_t) { diff --git a/cpp/bench/ann/src/raft/raft_ivf_pq_wrapper.h b/cpp/bench/ann/src/raft/raft_ivf_pq_wrapper.h index 
d03912afc2..0d4bca75cc 100644 --- a/cpp/bench/ann/src/raft/raft_ivf_pq_wrapper.h +++ b/cpp/bench/ann/src/raft/raft_ivf_pq_wrapper.h @@ -25,6 +25,7 @@ #include #include #include +#include #include #include #include @@ -47,15 +48,24 @@ class RaftIvfPQ : public ANN { struct SearchParam : public AnnSearchParam { raft::neighbors::ivf_pq::search_params pq_param; + float refine_ratio = 1.0f; + auto needs_dataset() const -> bool override { return refine_ratio > 1.0f; } }; using BuildParam = raft::neighbors::ivf_pq::index_params; - RaftIvfPQ(Metric metric, - int dim, - const BuildParam& param, - float refine_ratio, - MemoryType dataset_memtype); + RaftIvfPQ(Metric metric, int dim, const BuildParam& param) + : ANN(metric, dim), + index_params_(param), + dimension_(dim), + mr_(rmm::mr::get_current_device_resource(), 1024 * 1024 * 1024ull) + { + rmm::mr::set_current_device_resource(&mr_); + index_params_.metric = parse_metric_type(metric); + RAFT_CUDA_TRY(cudaGetDevice(&device_)); + } + + ~RaftIvfPQ() noexcept { rmm::mr::set_current_device_resource(mr_.get_upstream()); } void build(const T* dataset, size_t nrow, cudaStream_t stream) final; @@ -72,20 +82,19 @@ class RaftIvfPQ : public ANN { cudaStream_t stream = 0) const override; // to enable dataset access from GPU memory - AlgoProperty get_property() const override + AlgoProperty get_preference() const override { AlgoProperty property; - property.dataset_memory_type = dataset_memtype_; - property.query_memory_type = MemoryType::Device; - property.need_dataset_when_search = refine_ratio_ > 1.0; + property.dataset_memory_type = MemoryType::Host; + property.query_memory_type = MemoryType::Device; return property; } void save(const std::string& file) const override; void load(const std::string&) override; - ~RaftIvfPQ() noexcept { rmm::mr::set_current_device_resource(mr_.get_upstream()); } - private: + // `mr_` must go first to make sure it dies last + rmm::mr::pool_memory_resource mr_; raft::device_resources handle_; BuildParam 
index_params_; raft::neighbors::ivf_pq::search_params search_params_; @@ -93,24 +102,8 @@ class RaftIvfPQ : public ANN { int device_; int dimension_; float refine_ratio_ = 1.0; - MemoryType dataset_memtype_; - rmm::mr::pool_memory_resource mr_; raft::device_matrix_view dataset_; }; -template -RaftIvfPQ::RaftIvfPQ( - Metric metric, int dim, const BuildParam& param, float refine_ratio, MemoryType dataset_memtype) - : ANN(metric, dim), - index_params_(param), - dimension_(dim), - refine_ratio_(refine_ratio), - dataset_memtype_(dataset_memtype), - mr_(rmm::mr::get_current_device_resource(), 1024 * 1024 * 1024ull) -{ - rmm::mr::set_current_device_resource(&mr_); - index_params_.metric = parse_metric_type(metric); - RAFT_CUDA_TRY(cudaGetDevice(&device_)); -} template void RaftIvfPQ::save(const std::string& file) const @@ -141,6 +134,7 @@ void RaftIvfPQ::set_search_param(const AnnSearchParam& param) { auto search_param = dynamic_cast(param); search_params_ = search_param.pq_param; + refine_ratio_ = search_param.refine_ratio; assert(search_params_.n_probes <= index_params_.n_lists); } @@ -168,7 +162,8 @@ void RaftIvfPQ::search(const T* queries, raft::runtime::neighbors::ivf_pq::search( handle_, search_params_, *index_, queries_v, candidates.view(), distances_tmp.view()); - if (get_property().dataset_memory_type == MemoryType::Device) { + if (raft::spatial::knn::detail::utils::check_pointer_residency(dataset_.data_handle()) == + raft::spatial::knn::detail::utils::pointer_residency::device_only) { auto queries_v = raft::make_device_matrix_view(queries, batch_size, index_->dim()); auto neighbors_v = raft::make_device_matrix_view((IdxT*)neighbors, batch_size, k); diff --git a/cpp/bench/ann/src/raft/raft_wrapper.h b/cpp/bench/ann/src/raft/raft_wrapper.h index 01f206ab70..c8d98460b7 100644 --- a/cpp/bench/ann/src/raft/raft_wrapper.h +++ b/cpp/bench/ann/src/raft/raft_wrapper.h @@ -65,12 +65,11 @@ class RaftGpu : public ANN { cudaStream_t stream = 0) const final; // to enable 
dataset access from GPU memory - AlgoProperty get_property() const override + AlgoProperty get_preference() const override { AlgoProperty property; - property.dataset_memory_type = MemoryType::Device; - property.query_memory_type = MemoryType::Device; - property.need_dataset_when_search = true; + property.dataset_memory_type = MemoryType::Device; + property.query_memory_type = MemoryType::Device; return property; } void set_search_dataset(const T* dataset, size_t nrow) override; diff --git a/docs/source/ann_benchmarks_low_level.md b/docs/source/ann_benchmarks_low_level.md index f95d01f66f..d7cc2a3310 100644 --- a/docs/source/ann_benchmarks_low_level.md +++ b/docs/source/ann_benchmarks_low_level.md @@ -21,126 +21,204 @@ mv glove-100-angular.groundtruth.distances.fbin glove-100-inner/groundtruth.dist popd # (2) build index -./cpp/build/RAFT_IVF_FLAT_ANN_BENCH -b -i raft_ivf_flat.nlist1024 conf/glove-100-inner.json +./cpp/build/RAFT_IVF_FLAT_ANN_BENCH \ + --data_prefix=cpp/bench/ann/data \ + --build \ + --benchmark_filter="raft_ivf_flat\..*" \ + cpp/bench/ann/conf/glove-100-inner.json # (3) search -./cpp/build/RAFT_IVF_FLAT_ANN_BENCH -s -i raft_ivf_flat.nlist1024 conf/glove-100-inner.json - -# (4) evaluate result -pushd -cd cpp/bench/ann -./scripts/eval.pl \ - -o result.csv \ - data/glove-100-inner/groundtruth.neighbors.ibin \ - result/glove-100-inner/faiss_ivf_flat -popd - -# optional step: plot QPS-Recall figure using data in result.csv with your favorite tool +./cpp/build/RAFT_IVF_FLAT_ANN_BENCH \ + --data_prefix=cpp/bench/ann/data \ + --benchmark_min_time=2s \ + --benchmark_out=ivf_flat_search.csv \ + --benchmark_out_format=csv \ + --benchmark_counters_tabular \ + --search \ + --benchmark_filter="raft_ivf_flat\..*" \ + cpp/bench/ann/conf/glove-100-inner.json + +# optional step: plot QPS-Recall figure using data in ivf_flat_search.csv with your favorite tool ``` -##### Step 1: Prepare Dataset -[Instructions](ann_benchmarks_dataset.md) +##### Step 1: Prepare Dataset 
+A dataset usually has 4 binary files containing database vectors, query vectors, ground truth neighbors and their corresponding distances. For example, the Glove-100 dataset has files `base.fbin` (database vectors), `query.fbin` (query vectors), `groundtruth.neighbors.ibin` (ground truth neighbors), and `groundtruth.distances.fbin` (ground truth distances). The first two files are for index building and searching, while the other two are associated with a particular distance and are used for evaluation. + +The file suffixes `.fbin`, `.f16bin`, `.ibin`, `.u8bin`, and `.i8bin` denote that the vectors stored in the file have data type `float32`, `float16` (a.k.a. `half`), `int`, `uint8`, and `int8`, respectively. +These binary files are little-endian and the format is: the first 8 bytes are `num_vectors` (`uint32_t`) and `num_dimensions` (`uint32_t`), and the following `num_vectors * num_dimensions * sizeof(type)` bytes are vectors stored in row-major order. + +Some implementations can take `float16` database and query vectors as inputs and perform better with them. Use `script/fbin_to_f16bin.py` to convert a dataset from `float32` to `float16` type. + +Commonly used datasets can be downloaded from two websites: +1. Million-scale datasets can be found at the [Data sets](https://github.com/erikbern/ann-benchmarks#data-sets) section of [`ann-benchmarks`](https://github.com/erikbern/ann-benchmarks). + + However, these datasets are in HDF5 format. Use `cpp/bench/ann/scripts/hdf5_to_fbin.py` to transform the format. A few Python packages are required to run it: + ```bash + pip3 install numpy h5py + ``` + The usage of this script is: + ```bash + $ cpp/bench/ann/scripts/hdf5_to_fbin.py + usage: scripts/hdf5_to_fbin.py [-n] .hdf5 + -n: normalize base/query set + outputs: .base.fbin + .query.fbin + .groundtruth.neighbors.ibin + .groundtruth.distances.fbin + ``` + So for an input `.hdf5` file, four output binary files will be produced. 
See the previous section for an example of preprocessing the GloVe dataset. + + Most datasets provided by `ann-benchmarks` use `Angular` or `Euclidean` distance. `Angular` denotes cosine distance. However, computing cosine distance reduces to computing inner product by normalizing vectors beforehand. In practice, we can always do the normalization to decrease computation cost, so it's better to measure the performance of inner product rather than cosine distance. The `-n` option of `hdf5_to_fbin.py` can be used to normalize the dataset. + +2. Billion-scale datasets can be found at [`big-ann-benchmarks`](http://big-ann-benchmarks.com). The ground truth file contains both neighbors and distances, and thus should be split. A script is provided for this: + ```bash + $ cpp/bench/ann/scripts/split_groundtruth.pl + usage: script/split_groundtruth.pl input output_prefix + ``` + Take the Deep-1B dataset as an example: + ```bash + pushd + cd cpp/bench/ann + mkdir -p data/deep-1B && cd data/deep-1B + # download manually "Ground Truth" file of "Yandex DEEP" + # suppose the file name is deep_new_groundtruth.public.10K.bin + ../../scripts/split_groundtruth.pl deep_new_groundtruth.public.10K.bin groundtruth + # two files 'groundtruth.neighbors.ibin' and 'groundtruth.distances.fbin' should be produced + popd + ``` + Besides ground truth files for the whole billion-scale datasets, this site also provides ground truth files for the first 10M or 100M vectors of the base sets. This means we can use these billion-scale datasets as million-scale datasets. To facilitate this, an optional parameter `subset_size` for dataset can be used. See the next step for further explanation. ##### Step 2: Build Index -An index is a data structure to facilitate searching. Different algorithms may use different data structures for their index. We can use `RAFT_IVF_FLAT_ANN_BENCH -b` to build an index and save it to disk. +An index is a data structure to facilitate searching. 
Different algorithms may use different data structures for their index. We can use `RAFT_IVF_FLAT_ANN_BENCH --build` to build an index and save it to disk. To run a benchmark executable, like `RAFT_IVF_FLAT_ANN_BENCH`, a JSON configuration file is required. Refer to [`cpp/bench/ann/conf/glove-100-inner.json`](../../cpp/cpp/bench/ann/conf/glove-100-inner.json) as an example. Configuration file has 3 sections: * `dataset` section specifies the name and files of a dataset, and also the distance in use. Since the `*_ANN_BENCH` programs are for index building and searching, only `base_file` for database vectors and `query_file` for query vectors are needed. Ground truth files are for evaluation thus not needed. - To use only a subset of the base dataset, an optional parameter `subset_size` can be specified. It means using only the first `subset_size` vectors of `base_file` as the base dataset. * `search_basic_param` section specifies basic parameters for searching: - `k` is the "k" in "k-nn", that is, the number of neighbors (or results) we want from the searching. - - `run_count` means how many times we run the searching. A single run of searching will search neighbors for all vectors in `test` set. The total time used for a run is recorded, and the final searching time is the smallest one among these runs. * `index` section specifies an array of configurations for index building and searching: - `build_param` and `search_params` are parameters for building and searching, respectively. `search_params` is an array since we will search with different parameters to get different recall values. - `file` is the file name of index. Building will save built index to this file, while searching will load this file. - - `search_result_file` is the file name prefix of searching results. Searching will save results to these files, and plotting script will read these files to plot results. Note this is a prefix rather than a whole file name. 
Suppose its value is `${prefix}`, then the real file names are like `${prefix}.0.{ibin|txt}`, `${prefix}.1.{ibin|txt}`, etc. Each of them corresponds to an item in `search_params` array. That is, for one searching parameter, there will be some corresponding search result files. - if `multigpu` is specified, multiple GPUs will be used for index build and search. - if `refine_ratio` is specified, refinement, as a post-processing step of search, will be done. It's for algorithms that compress vectors. For example, if `"refine_ratio" : 2` is set, 2`k` results are first computed, then exact distances of them are computed using original uncompressed vectors, and finally top `k` results among them are kept. -The usage of `*_ANN_BENCH` can be found by running `*_ANN_BENCH -h` on one of the executables: +The usage of `*_ANN_BENCH` can be found by running `*_ANN_BENCH --help` on one of the executables: ```bash -$ ./cpp/build/*_ANN_BENCH -h -usage: ./cpp/build/*_ANN_BENCH -b|s [-f] [-i index_names] conf.json - -b: build mode, will build index - -s: search mode, will search using built index - one and only one of -b and -s should be specified - -f: force overwriting existing output files - -i: by default will build/search all the indices found in conf.json - '-i' can be used to select a subset of indices - 'index_names' is a list of comma-separated index names - '*' is allowed as the last character of a name to select all matched indices - for example, -i "hnsw1,hnsw2,faiss" or -i "hnsw*,faiss" -``` -* `-b`: build index. -* `-s`: do the searching with built index. -* `-f`: before doing the real task, the program checks that needed input files exist and output files don't exist. If these conditions are not met, it quits so no file would be overwritten accidentally. To ignore existing output files and force overwrite them, use the `-f` option. -* `-i`: by default, the `-b` flag will build all indices found in the configuration file, and `-s` will search using all the indices. 
To select a subset of indices to build or search, we can use the `-i` option. - -It's easier to describe the usage of `-i` option with an example. Suppose we have a configuration file `a.json`, and it contains: -```json - "index" : [ - { - "name" : "hnsw1", - ... - }, - { - "name" : "hnsw1", - ... - }, - { - "name" : "faiss", - ... - } - ] +$ ./cpp/build/*_ANN_BENCH --help +benchmark [--benchmark_list_tests={true|false}] + [--benchmark_filter=] + [--benchmark_min_time=`x` OR `s` ] + [--benchmark_min_warmup_time=] + [--benchmark_repetitions=] + [--benchmark_enable_random_interleaving={true|false}] + [--benchmark_report_aggregates_only={true|false}] + [--benchmark_display_aggregates_only={true|false}] + [--benchmark_format=] + [--benchmark_out=] + [--benchmark_out_format=] + [--benchmark_color={auto|true|false}] + [--benchmark_counters_tabular={true|false}] + [--benchmark_context==,...] + [--benchmark_time_unit={ns|us|ms|s}] + [--v=] + [--build|--search] + [--overwrite] + [--data_prefix=] + .json + +Note the non-standard benchmark parameters: + --build: build mode, will build index + --search: search mode, will search using the built index + one and only one of --build and --search should be specified + --overwrite: force overwriting existing index files + --data_prefix=: prepend to dataset file paths specified in the .json. + --override_kv=: override a build/search key one or more times multiplying the number of configurations; you can use this parameter multiple times to get the Cartesian product of benchmark configs. ``` -Then, -```bash -# build all indices: hnsw1, hnsw2 and faiss -./cpp/build/HNSWLIB_ANN_BENCH -b a.json - -# build only hnsw1 -./cpp/build/HNSWLIB_ANN_BENCH -b -i hnsw1 a.json +* `--build`: build index. +* `--search`: do the searching with built index. +* `--overwrite`: by default, the building mode skips building an index if it finds that it already exists. 
This is useful when adding more configurations to the config; only new indices are built without the need to specify an elaborate filtering regex. By supplying the `--overwrite` flag, you disable this behavior; all indices are built regardless of whether they are already stored on disk. +* `--data_prefix`: prepend an arbitrary path to the data file paths. By default, it is equal to `data`. Note, this does not apply to index file paths. +* `--override_kv`: override a build/search key one or more times multiplying the number of configurations. -# build hnsw1 and hnsw2 -./cpp/build/HNSWLIB_ANN_BENCH -b -i hnsw1,hnsw2 a.json - -# build hnsw1 and hnsw2 -./cpp/build/HNSWLIB_ANN_BENCH -b -i 'hnsw*' a.json - -# build faiss -./cpp/build/FAISS_IVF_FLAT_ANN_BENCH -b -i 'faiss' a.json -``` -In the last two commands, we use wildcard "`*`" to match both `hnsw1` and `hnsw2`. Note the use of "`*`" is quite limited. It can occur only at the end of a pattern, so both "`*nsw1`" and "`h*sw1`" are interpreted literally and will not match anything. Also note that quotation marks must be used to prevent "`*`" from being interpreted by the shell. +In addition to these ANN-specific flags, you can use all of the standard google benchmark flags. Some of the useful flags: +* `--benchmark_filter`: specify the subset of benchmarks to run +* `--benchmark_out`, `--benchmark_out_format`: store the output to a file +* `--benchmark_list_tests`: check the available configurations +* `--benchmark_min_time`: specify the minimum duration or number of iterations per case to improve accuracy of the benchmarks. +Refer to the google benchmark [user guide](https://github.com/google/benchmark/blob/main/docs/user_guide.md#command-line) for more information about the command-line usage. ##### Step 3: Searching -Use the `-s` flag on any of the `*_ANN_BENCH` executables. Other options are the same as in step 2. - - -##### Step 4: Evaluating Results -Use `cpp/bench/ann/scripts/eval.pl` to evaluate benchmark results. 
The usage is:
-```bash
-$ cpp/bench/ann/scripts/eval.pl
-usage: [-f] [-o output.csv] groundtruth.neighbors.ibin result_paths...
-  result_paths... are paths to the search result files.
-    Can specify multiple paths.
-    For each of them, if it's a directory, all the .txt files found under
-    it recursively will be regarded as inputs.
-
-  -f: force to recompute recall and update it in result file if needed
-  -o: also write result to a csv file
+Use the `--search` flag on any of the `*_ANN_BENCH` executables. Other options are the same as in step 2.
+
+## Adding a new ANN algorithm
+A new algorithm should be implemented as a class that inherits from `class ANN` (defined in `cpp/bench/ann/src/ann.h`) and implements all of its pure virtual functions.
+
+In addition, it should define two `struct`s for the build and search parameters. The search parameter struct should inherit from `struct ANN<T>::AnnSearchParam`. Taking `class HnswLib` as an example, its definition is:
+```c++
+template<typename T>
+class HnswLib : public ANN<T> {
+public:
+  // Parameters used when building the index
+  struct BuildParam {
+    int M;
+    int ef_construction;
+    int num_threads;
+  };
+
+  // Parameters used when searching; extends the common base struct
+  using typename ANN<T>::AnnSearchParam;
+  struct SearchParam : public AnnSearchParam {
+    int ef;
+    int num_threads;
+  };
+
+  // ...
+};
 ```
-Note that there can be multiple arguments for paths of result files. Each argument can be either a file name or a path. If it's a directory, all files found under it recursively will be used as input files.
-An example:
-```bash
-cpp/bench/ann/scripts/eval.pl groundtruth.neighbors.ibin \
-  result/glove-100-angular/10/hnsw/angular_M_24_*.txt \
-  result/glove-100-angular/10/faiss/
+
+The benchmark program uses a JSON configuration file. To add the new algorithm to the benchmark, you need to be able to specify `build_param`, whose value is a JSON object, and `search_params`, whose value is an array of JSON objects, for this algorithm in the configuration file.
Taking the configuration for `HnswLib` as an example again:
+```json
+{
+  "name" : "...",
+  "algo" : "hnswlib",
+  "build_param": {"M":12, "efConstruction":500, "numThreads":32},
+  "file" : "/path/to/file",
+  "search_params" : [
+    {"ef":10, "numThreads":1},
+    {"ef":20, "numThreads":1},
+    {"ef":40, "numThreads":1}
+  ]
+},
 ```
-The search result files used by this command are files matching `result/glove-100-angular/10/hnsw/angular_M_24_*.txt`, and all `.txt` files under directory `result/glove-100-angular/10/faiss/` recursively.
-This script prints recall and QPS for every result file. Also, it outputs estimated "recall at QPS=2000" and "QPS at recall=0.9", which can be used to compare performance quantitatively.
+How these JSON objects are interpreted is left entirely to the implementation and should be specified in `cpp/bench/ann/src/factory.cuh`:
+1. First, add two functions that parse a JSON object into `struct BuildParam` and `struct SearchParam`, respectively:
+   ```c++
+   template<typename T>
+   void parse_build_param(const nlohmann::json& conf,
+                          typename cuann::HnswLib<T>::BuildParam& param) {
+     // Required keys: at() throws if a key is missing
+     param.ef_construction = conf.at("efConstruction");
+     param.M = conf.at("M");
+     // Optional key
+     if (conf.contains("numThreads")) {
+       param.num_threads = conf.at("numThreads");
+     }
+   }
-It saves recall value in result txt file, so avoids to recompute recall if the same command is run again. To force to recompute recall, option `-f` can be used. If option `-o ` is specified, a csv output file will be produced. This file can be used to plot Throughput-Recall curves.
+   template<typename T>
+   void parse_search_param(const nlohmann::json& conf,
+                           typename cuann::HnswLib<T>::SearchParam& param) {
+     param.ef = conf.at("ef");
+     if (conf.contains("numThreads")) {
+       param.num_threads = conf.at("numThreads");
+     }
+   }
+   ```
+
+2. Next, add a corresponding `if` case to the functions `create_algo()` and `create_search_param()` that calls the parsing functions.
The string literal in the `if` condition must match the value of `algo` in the configuration file. For example:
+   ```c++
+   // JSON configuration file contains a line like:  "algo" : "hnswlib"
+   if (algo == "hnswlib") {
+      // ...
+   }
+   ```
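The two steps above can be sketched end to end as follows. This is a minimal, dependency-free illustration rather than the actual code in `factory.cuh`: it assumes a plain `std::map` as a stand-in for `nlohmann::json`, and `Conf`, `HnswSearchParam`, and this simplified `create_search_param()` signature are hypothetical names introduced only for the example.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Assumption: a flat key -> integer map stands in for nlohmann::json.
using Conf = std::map<std::string, int>;

// Hypothetical struct mirroring HnswLib::SearchParam from the text above.
struct HnswSearchParam {
  int ef          = 0;
  int num_threads = 1;  // default used when "numThreads" is absent
};

// Step 1: parse the required "ef" key and the optional "numThreads" key.
inline void parse_search_param(const Conf& conf, HnswSearchParam& param)
{
  param.ef = conf.at("ef");  // std::map::at throws std::out_of_range if missing
  auto it  = conf.find("numThreads");
  if (it != conf.end()) { param.num_threads = it->second; }
}

// Step 2: dispatch on the "algo" string from the configuration file.
inline HnswSearchParam create_search_param(const std::string& algo, const Conf& conf)
{
  if (algo == "hnswlib") {
    HnswSearchParam param;
    parse_search_param(conf, param);
    return param;
  }
  throw std::runtime_error("invalid algo: '" + algo + "'");
}
```

Rejecting an unknown `algo` string with an exception, instead of silently returning defaults, makes a mismatch between the configuration file and the `if` cases visible immediately.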