add tuning guide for cagra, modify build param
divyegala committed Aug 25, 2023
1 parent 0cf1c6f commit 617c60f
Showing 14 changed files with 37 additions and 28 deletions.
4 changes: 2 additions & 2 deletions bench/ann/conf/bigann-100M.json
@@ -172,7 +172,7 @@
"name": "raft_cagra.dim32",
"algo": "raft_cagra",
"dataset_memtype": "host",
"build_param": {"index_dim": 32},
"build_param": {"graph_degree": 32},
"file": "bigann-100M/raft_cagra/dim32",
"search_params": [
{"itopk": 32},
@@ -184,7 +184,7 @@
"name": "raft_cagra.dim64",
"algo": "raft_cagra",
"dataset_memtype":"host",
"build_param": {"index_dim": 64},
"build_param": {"graph_degree": 64},
"file": "bigann-100M/raft_cagra/dim64",
"search_params": [
{"itopk": 32},
8 changes: 4 additions & 4 deletions bench/ann/conf/deep-100M.json
@@ -220,7 +220,7 @@
"name": "raft_cagra.dim32",
"algo": "raft_cagra",
"dataset_memtype":"host",
"build_param": {"index_dim": 32, "intermediate_graph_degree": 48},
"build_param": {"graph_degree": 32, "intermediate_graph_degree": 48},
"file": "deep-100M/raft_cagra/dim32",
"search_params": [
{"itopk": 32, "search_width": 1, "max_iterations": 0, "algo": "single_cta"},
@@ -241,7 +241,7 @@
"name": "raft_cagra.dim32.multi_cta",
"algo": "raft_cagra",
"dataset_memtype":"host",
"build_param": {"index_dim": 32, "intermediate_graph_degree": 48},
"build_param": {"graph_degree": 32, "intermediate_graph_degree": 48},
"file": "deep-100M/raft_cagra/dim32",
"search_params": [
{"itopk": 32, "search_width": 1, "max_iterations": 0, "algo": "multi_cta"},
@@ -262,7 +262,7 @@
"name": "raft_cagra.dim32.multi_kernel",
"algo": "raft_cagra",
"dataset_memtype":"host",
"build_param": {"index_dim": 32, "intermediate_graph_degree": 48},
"build_param": {"graph_degree": 32, "intermediate_graph_degree": 48},
"file": "deep-100M/raft_cagra/dim32",
"search_params": [
{"itopk": 32, "search_width": 1, "max_iterations": 0, "algo": "multi_kernel"},
@@ -283,7 +283,7 @@
"name": "raft_cagra.dim64",
"algo": "raft_cagra",
"dataset_memtype":"host",
"build_param": {"index_dim": 64},
"build_param": {"graph_degree": 64},
"file": "deep-100M/raft_cagra/dim64",
"search_params": [
{"itopk": 32, "search_width": 1, "max_iterations": 0},
4 changes: 2 additions & 2 deletions bench/ann/conf/deep-image-96-angular.json
@@ -992,7 +992,7 @@
"algo" : "raft_cagra",
"dataset_memtype": "device",
"build_param": {
"index_dim" : 32
"graph_degree" : 32
},
"file" : "index/deep-image-96-angular/raft_cagra/dim32",
"search_params" : [
@@ -1008,7 +1008,7 @@
"algo" : "raft_cagra",
"dataset_memtype": "device",
"build_param": {
"index_dim" : 64
"graph_degree" : 64
},
"file" : "index/deep-image-96-angular/raft_cagra/dim64",
"search_params" : [
4 changes: 2 additions & 2 deletions bench/ann/conf/fashion-mnist-784-euclidean.json
@@ -1336,7 +1336,7 @@
"algo" : "raft_cagra",
"dataset_memtype": "device",
"build_param": {
"index_dim" : 32
"graph_degree" : 32
},
"file" : "index/fashion-mnist-784-euclidean/raft_cagra/dim32",
"search_params" : [
@@ -1352,7 +1352,7 @@
"algo" : "raft_cagra",
"dataset_memtype": "device",
"build_param": {
"index_dim" : 64
"graph_degree" : 64
},
"file" : "index/fashion-mnist-784-euclidean/raft_cagra/dim64",
"search_params" : [
4 changes: 2 additions & 2 deletions bench/ann/conf/gist-960-euclidean.json
@@ -1322,7 +1322,7 @@
"name" : "raft_cagra.dim32",
"algo" : "raft_cagra",
"build_param": {
"index_dim" : 32
"graph_degree" : 32
},
"file" : "index/gist-960-euclidean/raft_cagra/dim32",
"search_params" : [
@@ -1337,7 +1337,7 @@
"name" : "raft_cagra.dim64",
"algo" : "raft_cagra",
"build_param": {
"index_dim" : 64
"graph_degree" : 64
},
"file" : "index/gist-960-euclidean/raft_cagra/dim64",
"search_params" : [
4 changes: 2 additions & 2 deletions bench/ann/conf/glove-100-angular.json
@@ -1322,7 +1322,7 @@
"name" : "raft_cagra.dim32",
"algo" : "raft_cagra",
"build_param": {
"index_dim" : 32
"graph_degree" : 32
},
"file" : "index/glove-100-angular/raft_cagra/dim32",
"search_params" : [
@@ -1337,7 +1337,7 @@
"name" : "raft_cagra.dim64",
"algo" : "raft_cagra",
"build_param": {
"index_dim" : 64
"graph_degree" : 64
},
"file" : "index/glove-100-angular/raft_cagra/dim64",
"search_params" : [
4 changes: 2 additions & 2 deletions bench/ann/conf/glove-50-angular.json
@@ -1322,7 +1322,7 @@
"name" : "raft_cagra.dim32",
"algo" : "raft_cagra",
"build_param": {
"index_dim" : 32
"graph_degree" : 32
},
"file" : "index/glove-50-angular/raft_cagra/dim32",
"search_params" : [
@@ -1337,7 +1337,7 @@
"name" : "raft_cagra.dim64",
"algo" : "raft_cagra",
"build_param": {
"index_dim" : 64
"graph_degree" : 64
},
"file" : "index/glove-50-angular/raft_cagra/dim64",
"search_params" : [
4 changes: 2 additions & 2 deletions bench/ann/conf/lastfm-65-angular.json
@@ -1322,7 +1322,7 @@
"name" : "raft_cagra.dim32",
"algo" : "raft_cagra",
"build_param": {
"index_dim" : 32
"graph_degree" : 32
},
"file" : "index/lastfm-65-angular/raft_cagra/dim32",
"search_params" : [
@@ -1337,7 +1337,7 @@
"name" : "raft_cagra.dim64",
"algo" : "raft_cagra",
"build_param": {
"index_dim" : 64
"graph_degree" : 64
},
"file" : "index/lastfm-65-angular/raft_cagra/dim64",
"search_params" : [
4 changes: 2 additions & 2 deletions bench/ann/conf/mnist-784-euclidean.json
@@ -1322,7 +1322,7 @@
"name" : "raft_cagra.dim32",
"algo" : "raft_cagra",
"build_param": {
"index_dim" : 32
"graph_degree" : 32
},
"file" : "index/mnist-784-euclidean/raft_cagra/dim32",
"search_params" : [
@@ -1337,7 +1337,7 @@
"name" : "raft_cagra.dim64",
"algo" : "raft_cagra",
"build_param": {
"index_dim" : 64
"graph_degree" : 64
},
"file" : "index/mnist-784-euclidean/raft_cagra/dim64",
"search_params" : [
4 changes: 2 additions & 2 deletions bench/ann/conf/nytimes-256-angular.json
@@ -1322,7 +1322,7 @@
"name" : "raft_cagra.dim32",
"algo" : "raft_cagra",
"build_param": {
"index_dim" : 32
"graph_degree" : 32
},
"file" : "index/nytimes-256-angular/raft_cagra/dim32",
"search_params" : [
@@ -1337,7 +1337,7 @@
"name" : "raft_cagra.dim64",
"algo" : "raft_cagra",
"build_param": {
"index_dim" : 64
"graph_degree" : 64
},
"file" : "index/nytimes-256-angular/raft_cagra/dim64",
"search_params" : [
4 changes: 2 additions & 2 deletions bench/ann/conf/sift-128-euclidean.json
@@ -475,7 +475,7 @@
{
"name": "raft_cagra.dim32",
"algo": "raft_cagra",
"build_param": {"index_dim": 32},
"build_param": {"graph_degree": 32},
"file": "sift-128-euclidean/raft_cagra/dim32",
"search_params": [
{"itopk": 32},
@@ -486,7 +486,7 @@
{
"name": "raft_cagra.dim64",
"algo": "raft_cagra",
"build_param": {"index_dim": 64},
"build_param": {"graph_degree": 64},
"file": "sift-128-euclidean/raft_cagra/dim64",
"search_params": [
{"itopk": 32},
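The same key rename is applied to every `raft_cagra` entry in the config files above. For configs kept outside the repository, the rename could be scripted along these lines. This is a minimal sketch, not part of the commit: `rename_build_param` is a hypothetical helper, and it takes a plain list of entries because the top-level layout of the config files is not visible in these hunks.

```python
import json


def rename_build_param(entries, old="index_dim", new="graph_degree"):
    """Return a copy of `entries` with `old` renamed to `new` in raft_cagra build_param.

    Hypothetical helper for illustration; the input list is not mutated.
    """
    migrated = []
    for entry in entries:
        entry = dict(entry)  # shallow copy of the entry
        if entry.get("algo") == "raft_cagra" and old in entry.get("build_param", {}):
            params = dict(entry["build_param"])
            # If the new key is already present, keep its value and drop the old key.
            params.setdefault(new, params.pop(old))
            entry["build_param"] = params
        migrated.append(entry)
    return migrated


# Entry shaped like the deep-100M hunk above (pre-rename form).
entries = json.loads("""
[
  {"name": "raft_cagra.dim32",
   "algo": "raft_cagra",
   "dataset_memtype": "host",
   "build_param": {"index_dim": 32, "intermediate_graph_degree": 48},
   "file": "deep-100M/raft_cagra/dim32"}
]
""")

migrated = rename_build_param(entries)
print(json.dumps(migrated[0]["build_param"], sort_keys=True))
```

Entries for other algorithms pass through unchanged, since the sketch only touches entries whose `algo` is `raft_cagra`.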
4 changes: 2 additions & 2 deletions cpp/bench/ann/src/raft/raft_benchmark.cu
@@ -132,8 +132,8 @@ template <typename T, typename IdxT>
void parse_build_param(const nlohmann::json& conf,
typename raft::bench::ann::RaftCagra<T, IdxT>::BuildParam& param)
{
if (conf.contains("index_dim")) {
param.graph_degree = conf.at("index_dim");
if (conf.contains("graph_degree")) {
param.graph_degree = conf.at("graph_degree");
param.intermediate_graph_degree = param.graph_degree * 2;
}
if (conf.contains("intermediate_graph_degree")) {
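The effect of the parsing logic in this hunk can be sketched in Python. Two things here are assumptions rather than facts from the diff: the body of the `intermediate_graph_degree` branch is cut off by the hunk boundary, so the explicit override below is a guess at its intent, and the dataclass defaults are borrowed from the tuning-guide table added in this commit rather than read from the C++ struct.

```python
from dataclasses import dataclass


@dataclass
class BuildParam:
    # Assumed defaults, taken from the tuning-guide table in this commit.
    graph_degree: int = 64
    intermediate_graph_degree: int = 128


def parse_build_param(conf: dict) -> BuildParam:
    """Python sketch of the C++ parse_build_param shown in the hunk above."""
    param = BuildParam()
    if "graph_degree" in conf:
        param.graph_degree = conf["graph_degree"]
        # As in the diff: the intermediate degree follows as twice the graph degree.
        param.intermediate_graph_degree = param.graph_degree * 2
    if "intermediate_graph_degree" in conf:
        # Assumption: an explicit value overrides the derived one
        # (this branch body is not visible in the diff).
        param.intermediate_graph_degree = conf["intermediate_graph_degree"]
    return param


print(parse_build_param({"graph_degree": 32}))
print(parse_build_param({"graph_degree": 32, "intermediate_graph_degree": 48}))
```

Under these assumptions, a config that sets only `graph_degree` gets an intermediate degree of twice that value, while the deep-100M entries that set both keep their explicit 48.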
10 changes: 10 additions & 0 deletions docs/source/ann_benchmarks_param_tuning.md
@@ -36,6 +36,16 @@ IVF-pq is an inverted-file index, which partitions the vectors into a series of


### `raft_cagra`
CAGRA uses a graph-based index. It first builds an intermediate, approximate kNN graph using IVF-PQ, then refines and optimizes that graph to create the final kNN graph, which CAGRA uses as the index for search.

| Parameter | Type | Required | Data Type | Default | Description |
|-----------|----------------|----------|---------------------|---------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `graph_degree` | `build_param` | N | Positive Integer >0 | 64 | Degree of the final kNN graph index. |
| `intermediate_graph_degree` | `build_param` | N | Positive Integer >0 | 128 | Degree of the intermediate kNN graph. |
| `itopk` | `search_param` | N | Positive Integer >0 | 64 | Number of intermediate search results retained during the search. Higher values improve search accuracy at the cost of speed. |
| `search_width` | `search_param` | N | Positive Integer >0 | 1 | Number of graph nodes to select as the starting points for the search in each iteration. |
| `max_iterations` | `search_param` | N | Integer >=0 | 0 | Upper limit of search iterations. Selected automatically when 0. |
| `algo` | `search_param` | N | string | "auto" | Algorithm to use for search. Possible values: {"auto", "single_cta", "multi_cta", "multi_kernel"} |

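The constraints in the table can be checked mechanically before a benchmark run. The following is a minimal sketch, assuming each config entry is a dict shaped like the `raft_cagra` entries in `bench/ann/conf/`; `validate_cagra_entry` is a hypothetical helper, not part of RAFT.

```python
ALGOS = {"auto", "single_cta", "multi_cta", "multi_kernel"}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_cagra_entry(entry):
    """Return a list of problems; an empty list means the entry matches the table."""
    errors = []
    build = entry.get("build_param", {})
    for key in ("graph_degree", "intermediate_graph_degree"):
        if key in build and not (_is_int(build[key]) and build[key] > 0):
            errors.append(f"build_param.{key} must be a positive integer")
    for i, search in enumerate(entry.get("search_params", [])):
        for key in ("itopk", "search_width"):
            if key in search and not (_is_int(search[key]) and search[key] > 0):
                errors.append(f"search_params[{i}].{key} must be a positive integer")
        if "max_iterations" in search:
            value = search["max_iterations"]
            if not (_is_int(value) and value >= 0):
                errors.append(f"search_params[{i}].max_iterations must be an integer >= 0")
        # All parameters are optional; a missing algo falls back to "auto".
        if search.get("algo", "auto") not in ALGOS:
            errors.append(f"search_params[{i}].algo must be one of {sorted(ALGOS)}")
    return errors


# Entry copied from the deep-100M config in this commit.
entry = {
    "name": "raft_cagra.dim32",
    "algo": "raft_cagra",
    "dataset_memtype": "host",
    "build_param": {"graph_degree": 32, "intermediate_graph_degree": 48},
    "file": "deep-100M/raft_cagra/dim32",
    "search_params": [
        {"itopk": 32, "search_width": 1, "max_iterations": 0, "algo": "single_cta"}
    ],
}
print(validate_cagra_entry(entry))
```

Because every parameter in the table is optional, the sketch only checks keys that are present and does not fill in defaults.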

## FAISS Indexes
3 changes: 1 addition & 2 deletions docs/source/raft_ann_benchmarks.md
@@ -118,11 +118,10 @@ options:
--dataset-path DATASET_PATH
path to download dataset (default: ${RAFT_HOME}/bench/ann/data)
--normalize normalize cosine distance to inner product (default: False)

```
When option `normalize` is provided to the script, any dataset that has cosine distances
will be normalized to inner product. So, for example, the dataset `glove-100-angular`
will be written at location `${RAFT_HOME}/bench/ann/data/glove-100-inner/`.
```
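The idea behind `--normalize` can be illustrated with a short sketch: once every vector is scaled to unit length, the inner product of two vectors equals their cosine similarity, so a cosine dataset can be served by an inner-product index. The code below is an assumption-laden illustration, not the script's implementation: `normalize_rows` and `inner_product_name` are hypothetical names, and the `-angular` to `-inner` renaming rule is inferred from the single `glove-100-angular` example above.

```python
import math


def normalize_rows(vectors):
    """Scale each vector to unit L2 norm; all-zero vectors are left unchanged."""
    normalized = []
    for vec in vectors:
        norm = math.sqrt(sum(x * x for x in vec))
        normalized.append([x / norm for x in vec] if norm > 0 else list(vec))
    return normalized


def inner_product_name(dataset):
    """Map e.g. 'glove-100-angular' to 'glove-100-inner' (rule inferred from the text)."""
    suffix = "-angular"
    if dataset.endswith(suffix):
        return dataset[: -len(suffix)] + "-inner"
    return dataset


def inner(a, b):
    return sum(x * y for x, y in zip(a, b))


a, b = [3.0, 4.0], [4.0, 3.0]
cosine = inner(a, b) / (math.sqrt(inner(a, a)) * math.sqrt(inner(b, b)))
na, nb = normalize_rows([a, b])

print(inner_product_name("glove-100-angular"))
print(abs(inner(na, nb) - cosine) < 1e-12)
```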

#### Step 2: Build and Search Index
The script `bench/ann/run.py` will build and search indices for a given dataset and its