Discussion: REST API for sharded indexes #2690
Comments
Thanks for starting!
Actually, I'm thinking that each index would get its own separate underlying searcher instance with its own thread pool. So, all of these would be completely independent...
The question is how we specify the configs. Do all indexes share the same config? E.g., say the setting is 4 threads - that'd mean all
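To make the config question concrete, here is a minimal sketch of one possible answer: a shared default thread count with optional per-index overrides, where each index owns an independent pool. Everything here (`IndexSearcher`, `build_searchers`, `DEFAULT_THREADS`) is a hypothetical name for illustration, not an existing Anserini API.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

DEFAULT_THREADS = 4  # hypothetical global default, taken from the "4 threads" example


@dataclass
class IndexSearcher:
    """Hypothetical stand-in for one index's searcher with its own private pool."""
    name: str
    threads: int

    def __post_init__(self):
        # Each index gets a separate pool, so indexes never compete for workers.
        self.pool = ThreadPoolExecutor(max_workers=self.threads)


def build_searchers(index_names, overrides=None):
    """Create one searcher per index; `overrides` maps index name -> thread count."""
    overrides = overrides or {}
    return {
        name: IndexSearcher(name, overrides.get(name, DEFAULT_THREADS))
        for name in index_names
    }


def total_threads(searchers):
    """Upper bound on worker threads across all indexes."""
    return sum(s.threads for s in searchers.values())


def close_all(searchers):
    """Shut down every per-index pool."""
    for s in searchers.values():
        s.pool.shutdown()
```

With a uniform setting of 4 threads and 10 shards, `total_threads` comes to 40, which is the kind of multiplication a per-index config has to account for.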
I will write up in a guide shortly, but just to pass along. How to run TREC RAG24 test queries with ArcticEmbed-L shards:

    SHARDS=(00 01 02 03 04 05 06 07 08 09)
    for shard in "${SHARDS[@]}"
    do
      bin/run.sh io.anserini.search.SearchHnswDenseVectors -index msmarco-v2.1-doc-segmented-shard${shard}.arctic-embed-l.hnsw-int8 -efSearch 1000 -topics rag24.test.snowflake-arctic-embed-l -output runs/run.rag24.test.arctic-l-msv2.1.shard${shard}.txt -hits 250 -threads 32 > logs/log.run.rag24.test.arctic-l-msv2.1.shard${shard}.txt 2>&1
    done

To evaluate:

    cat runs/run.rag24.test.arctic-l-msv2.1.shard0* > runs/run.rag24.test.arctic-l-msv2.1.txt
    tools/eval/trec_eval.9.0.4/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.rag24.test-umbrela-all.txt runs/run.rag24.test.arctic-l-msv2.1.txt

On
(Otherwise, it's ~0.5TB of downloads)
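Concatenating the per-shard runs leaves up to 10 x 250 lines per query with per-shard rank columns. If a single properly ranked run file is wanted, the shards can be merged by score instead. The sketch below assumes the standard six-column TREC run format (`qid Q0 docid rank score tag`) and that scores are comparable across shards because every shard uses the same encoder; `merge_runs` is a hypothetical helper, not part of Anserini.

```python
from collections import defaultdict


def merge_runs(run_lines, k=250, tag="merged"):
    """Merge TREC run lines from several shards into one top-k ranking per query."""
    by_query = defaultdict(dict)
    for line in run_lines:
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        qid, _, docid, _, score, _ = parts
        score = float(score)
        # Keep the best score if a docid somehow appears more than once.
        if docid not in by_query[qid] or score > by_query[qid][docid]:
            by_query[qid][docid] = score

    merged = []
    for qid, docs in by_query.items():
        # Sort by descending score, breaking ties by docid for determinism.
        ranked = sorted(docs.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
        for rank, (docid, score) in enumerate(ranked, start=1):
            merged.append(f"{qid} Q0 {docid} {rank} {score:.6f} {tag}")
    return merged
```

The same top-k-by-score step is what a fan-out endpoint would need to do at query time.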
I may have to update IndexInfo again to know when to call a Otherwise, this is pretty straightforward, apart from shard index / shard settings / shard search configurations and their validations.
What about
Wait, that's a good strategy. Thanks!
Starting a discussion of how we might design retrieval w/ sharded indexes. MS MARCO V2.1, for ArcticEmbed-L, we have 10 shards, from `shard00` to `shard09`.
With the current design, once we close #2688 - we'll have:
One simple solution would be to create a "fake" endpoint, e.g.,
That fans out to all the shards and gathers results.
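A rough sketch of the fan-out-and-gather idea, under stated assumptions: each shard is modeled as a callable `search(query, k)` returning `(docid, score)` pairs, and scores are assumed comparable across shards. `fan_out_search` and the fake shards are hypothetical and only illustrate the pattern, not how the REST API is or would be implemented.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def fan_out_search(shard_searchers, query, k=10):
    """Query every shard concurrently, then keep the global top-k hits by score.

    shard_searchers: dict mapping shard name -> callable(query, k) that returns
    a list of (docid, score) tuples for that shard.
    """
    if not shard_searchers:
        return []
    with ThreadPoolExecutor(max_workers=len(shard_searchers)) as pool:
        # Fan out: one task per shard, each asked for its own top-k.
        futures = [pool.submit(search, query, k) for search in shard_searchers.values()]
        # Gather: flatten all per-shard hits into one candidate list.
        hits = [hit for future in futures for hit in future.result()]
    return heapq.nlargest(k, hits, key=lambda hit: hit[1])


def make_fake_shard(hits):
    """Build a fake shard searcher that returns canned hits (illustration only)."""
    return lambda query, k: hits[:k]


shards = {
    "shard00": make_fake_shard([("a", 0.9), ("b", 0.2)]),
    "shard01": make_fake_shard([("c", 0.8), ("d", 0.1)]),
}
top = fan_out_search(shards, "example query", k=3)
```

Asking each shard for its own top-k is enough to recover the exact global top-k, since no document outside a shard's top-k can be in the merged top-k.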
@vincent-4 thoughts?