Discussion: REST API for sharded indexes #2690

lintool · 2025-01-22T18:58:20Z

Starting a discussion of how we might design retrieval with sharded indexes. For MS MARCO V2.1 with ArcticEmbed-L, we have 10 shards, from shard00 to shard09.

With the current design, once we close #2688, we'll have:

http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented-shard00.arctic-embed-l.hnsw-int8/
http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented-shard01.arctic-embed-l.hnsw-int8/
...
http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented-shard09.arctic-embed-l.hnsw-int8/

One simple solution would be to create a "fake" endpoint, e.g.,

http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented.arctic-embed-l.hnsw-int8/

This endpoint would fan out to all the shards and gather the results.
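The fan-out-and-gather idea could be sketched roughly as follows. This is only an illustration under stated assumptions: `ShardFanOut`, `ShardSearcher`, and `Hit` are hypothetical names, not existing Anserini classes, and the merge assumes that scores from different shards are directly comparable.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: none of these names come from Anserini.
final class ShardFanOut {
  /** One scored hit, reduced to docid and score. */
  record Hit(String docid, double score) {}

  /** Stand-in for whatever searches a single shard: (query, k) -> top-k hits. */
  interface ShardSearcher {
    List<Hit> search(String query, int k);
  }

  /** Queries every shard in parallel, then keeps the k best hits overall. */
  static List<Hit> search(List<ShardSearcher> shards, String query, int k) {
    if (shards.isEmpty()) {
      return List.of();
    }
    // One fan-out thread per shard; each shard may use its own pool internally.
    ExecutorService pool = Executors.newFixedThreadPool(shards.size());
    try {
      List<CompletableFuture<List<Hit>>> futures = new ArrayList<>();
      for (ShardSearcher shard : shards) {
        futures.add(CompletableFuture.supplyAsync(() -> shard.search(query, k), pool));
      }
      // Gather: wait for every shard, then pool all partial results.
      List<Hit> merged = new ArrayList<>();
      for (CompletableFuture<List<Hit>> future : futures) {
        merged.addAll(future.join());
      }
      // Assumption: scores are comparable across shards, so a global sort is valid.
      merged.sort(Comparator.comparingDouble(Hit::score).reversed());
      return new ArrayList<>(merged.subList(0, Math.min(k, merged.size())));
    } finally {
      pool.shutdown();
    }
  }
}
```

Asking each shard for k hits and keeping the best k overall is enough for a correct global top-k, since no document outside a shard's own top-k can appear in the merged top-k.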

@vincent-4 thoughts?


vincent-4 · 2025-01-22T19:58:39Z

Thanks for starting!
Do you think it'd be easier to reuse the thread pool from `src/main/java/io/anserini/search/SearchHnswDenseVectors.java`, even though it's built for a single index, or to start from scratch? I'm leaning towards the former.

lintool · 2025-01-22T21:33:45Z

Actually, I'm thinking that each index would get its own separate underlying searcher instance with its own thread pool. So, all of these would be completely independent...

http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented-shard00.arctic-embed-l.hnsw-int8/
http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented-shard01.arctic-embed-l.hnsw-int8/
...
http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented-shard09.arctic-embed-l.hnsw-int8/

The question is how we specify the configs. Do all indexes share the same config? Say the setting is 4 threads: then every api/v1.0/indexes/xxx would get 4 threads. In that case, api/v1.0/indexes/msmarco-v2.1-doc-segmented.arctic-embed-l.hnsw-int8/ would actually trigger 40 threads in parallel, since it fans out to 10 shards and each shard fires up 4 threads. This is a simple model, but it lacks fine-grained control...
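To make the thread arithmetic concrete, here is a minimal sketch of one way a shared default could coexist with per-index overrides. `ThreadBudget` and its methods are hypothetical and only illustrate the trade-off; they are not part of Anserini.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: class and method names are illustrative only.
final class ThreadBudget {
  private final int defaultThreads;
  private final Map<String, Integer> overrides = new LinkedHashMap<>();

  ThreadBudget(int defaultThreads) {
    this.defaultThreads = defaultThreads;
  }

  /** Optional fine-grained control: give one index its own thread count. */
  ThreadBudget override(String index, int threads) {
    overrides.put(index, threads);
    return this;
  }

  /** Threads used by a single index endpoint. */
  int threadsFor(String index) {
    return overrides.getOrDefault(index, defaultThreads);
  }

  /** Threads triggered when one request fans out to all of the given shards. */
  int fanOutThreads(List<String> shardIndexes) {
    return shardIndexes.stream().mapToInt(this::threadsFor).sum();
  }

  /** Builds names such as prefix + "00" + suffix, prefix + "01" + suffix, and so on. */
  static List<String> shardNames(String prefix, String suffix, int numShards) {
    List<String> names = new ArrayList<>();
    for (int i = 0; i < numShards; i++) {
      names.add(String.format(Locale.ROOT, "%s%02d%s", prefix, i, suffix));
    }
    return names;
  }
}
```

With a default of 4 threads and the 10 shard names from above, `fanOutThreads` returns 40; overriding a single shard to 8 threads raises the total to 44.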

lintool · 2025-01-25T20:55:38Z

I'll write this up in a guide shortly, but just to pass it along, here's how to run the TREC RAG24 test queries with the ArcticEmbed-L shards:

SHARDS=(00 01 02 03 04 05 06 07 08 09); for shard in "${SHARDS[@]}"
do
    bin/run.sh io.anserini.search.SearchHnswDenseVectors -index msmarco-v2.1-doc-segmented-shard${shard}.arctic-embed-l.hnsw-int8 -efSearch 1000 -topics rag24.test.snowflake-arctic-embed-l -output runs/run.rag24.test.arctic-l-msv2.1.shard${shard}.txt -hits 250 -threads 32 > logs/log.run.rag24.test.arctic-l-msv2.1.shard${shard}.txt 2>&1
done

To evaluate:

cat runs/run.rag24.test.arctic-l-msv2.1.shard0* > runs/run.rag24.test.arctic-l-msv2.1.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.rag24.test-umbrela-all.txt runs/run.rag24.test.arctic-l-msv2.1.txt
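Plain concatenation is enough for evaluation because trec_eval orders documents by the score column rather than the rank column. If a clean top-k merged run file is wanted instead, a sketch along these lines could do it. `RunMerger` is a hypothetical helper, not an Anserini tool, and it assumes the standard six-column TREC run format (`qid Q0 docid rank score tag`) and that scores are comparable across shards.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: merges per-shard TREC run lines into one top-k run.
final class RunMerger {
  static List<String> merge(List<String> lines, int k, String tag) {
    record Entry(String qid, String docid, double score) {}

    // Group hits by query id, preserving first-seen query order.
    Map<String, List<Entry>> byQuery = new LinkedHashMap<>();
    for (String line : lines) {
      if (line.isBlank()) {
        continue;
      }
      // Columns: qid Q0 docid rank score tag
      String[] f = line.trim().split("\\s+");
      byQuery.computeIfAbsent(f[0], q -> new ArrayList<>())
          .add(new Entry(f[0], f[2], Double.parseDouble(f[4])));
    }

    List<String> out = new ArrayList<>();
    for (List<Entry> hits : byQuery.values()) {
      // Sort by descending score and reassign ranks from 1.
      hits.sort(Comparator.comparingDouble(Entry::score).reversed());
      int rank = 1;
      for (Entry h : hits.subList(0, Math.min(k, hits.size()))) {
        out.add(String.format(Locale.ROOT, "%s Q0 %s %d %.6f %s",
            h.qid(), h.docid(), rank++, h.score(), tag));
      }
    }
    return out;
  }
}
```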

On orca, a copy of the indexes is at `/mnt/msmarco-v2_1/indexes`. Symlink to the shared copy of the indexes:

cd ~/.cache/pyserini/indexes/
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard00.arctic-embed-l.20250114.4884f5.aab3f8e9aa0563bd0f875584784a0845 .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard01.arctic-embed-l.20250114.4884f5.34ea30fe72c2bc1795ae83e71b191547 .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard02.arctic-embed-l.20250114.4884f5.b6271d6db65119977491675f74f466d5 .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard03.arctic-embed-l.20250114.4884f5.a9cd644eb6037f67d2e9c06a8f60928d .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard04.arctic-embed-l.20250114.4884f5.07b7e451e0525d01c1f1f2b1c42b1bd5 .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard05.arctic-embed-l.20250114.4884f5.2573dce175788981be2f266ebb33c96d .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard06.arctic-embed-l.20250114.4884f5.a644aea445a8b78cc9e99d2ce111ff11 .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard07.arctic-embed-l.20250114.4884f5.402d37deccb44b5fc105049889e8aaea .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard08.arctic-embed-l.20250114.4884f5.89ebcd027f7297b26a1edc8ae5726527 .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard09.arctic-embed-l.20250114.4884f5.5e580bb7eb9ee2bb6bfa492b3430c17d .

(Otherwise, it's ~0.5TB of downloads)

vincent-4 · 2025-02-02T23:58:10Z

I may have to update IndexInfo again so it knows when to call a shardedSearchService, e.g., with an int shardList value ({123}) so that shard suffixes can be appended to the name as needed in the call, e.g., doc-segmented + shard{0 ... 123}. It's getting pretty gnarly now, though, so maybe there's a better way to do it.

Otherwise, this is pretty straightforward, apart from shard index / shard settings / shard search configurations and their validations.

lintool · 2025-02-03T00:01:36Z

What about ShardInfo to parallel IndexInfo?

MSMARCO_V21_DOC_SEGMENTED_ARCTIC_EMBED_L_HNSW_INT8(
    "msmarco-v2.1-doc-segmented.arctic-embed-l.hnsw-int8",
    ....
    new IndexInfo[] {
        MSMARCO_V21_DOC_SEGMENTED_SHARD00_ARCTIC_EMBED_L_HNSW_INT8,
        MSMARCO_V21_DOC_SEGMENTED_SHARD01_ARCTIC_EMBED_L_HNSW_INT8
        ...
    }
    ...
)
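A self-contained sketch of that shape, to show how a logical index name could resolve to its shards. `DemoIndex` and `DemoShardedIndex` are hypothetical stand-ins for IndexInfo and the proposed ShardInfo, only two shards are listed for brevity, and the `lookup` helper is an assumption about how a caller might use it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for per-shard IndexInfo entries.
enum DemoIndex {
  SHARD00("msmarco-v2.1-doc-segmented-shard00.arctic-embed-l.hnsw-int8"),
  SHARD01("msmarco-v2.1-doc-segmented-shard01.arctic-embed-l.hnsw-int8");

  final String indexName;

  DemoIndex(String indexName) {
    this.indexName = indexName;
  }
}

// Hypothetical stand-in for ShardInfo: one logical name mapped to its shards.
enum DemoShardedIndex {
  ARCTIC_EMBED_L(
      "msmarco-v2.1-doc-segmented.arctic-embed-l.hnsw-int8",
      new DemoIndex[] {DemoIndex.SHARD00, DemoIndex.SHARD01});

  final String logicalName;
  final List<DemoIndex> shards;

  DemoShardedIndex(String logicalName, DemoIndex[] shards) {
    this.logicalName = logicalName;
    this.shards = List.of(shards);
  }

  /** Returns the sharded entry for a name, or empty if the name is a plain index. */
  static Optional<DemoShardedIndex> lookup(String name) {
    return Arrays.stream(values())
        .filter(s -> s.logicalName.equals(name))
        .findFirst();
  }
}
```

A request handler could call `lookup` first and fall back to the single-index path when the result is empty, which keeps shard-name string manipulation out of the calling code.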

vincent-4 · 2025-02-03T00:03:05Z

Wait, that's a good strategy. Thanks!

vincent-4 mentioned this issue Jan 26, 2025: ArcticEmbedLEncoder #2694 (merged)

lintool changed the title ~~Discussion: sharded indexes~~ Discussion: REST API for sharded indexes Feb 1, 2025

lintool mentioned this issue Feb 1, 2025: Discussion: Command-line for sharded indexes #2705 (open)
