index() API does not respect batch_size on vector_store.add_documents() #19415
Labels
🤖:bug
Related to a bug, vulnerability, unexpected error with an existing feature
🔌: qdrant
Primarily related to Qdrant vector store integration
Ɑ: vector store
Related to vector store module
Checked other resources
Example Code
Error Message and Stack Trace (if applicable)
No response
Description
I'm using the `index()` API to support a record manager. I specify a `batch_size` that is larger than the vector store's default `batch_size` on my `index()` call. The `batch_size` I pass is not respected when documents are added to the vector store; the store's default `batch_size` parameter is used instead (Qdrant uses 64).
Running the example code with `DOCUMENT_COUNT` set to 100, you would see two PUTs to Qdrant.
Running the example code with `DOCUMENT_COUNT` set to 64, you would see one PUT to Qdrant.
This is because the `batch_size` is not passed on calls to `vector_store.add_documents()`, which itself calls `add_texts()`: (link)
As a result, the vector store implementation's default `batch_size` parameter is used instead: (link)
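
The behavior described above can be reproduced without LangChain or Qdrant. The following is a minimal sketch using toy stand-ins: `ToyVectorStore`, `toy_index`, and `count_puts` are hypothetical names invented for illustration, and the default of 64 simply mirrors the Qdrant default mentioned in this issue. It is not the library's actual code.

```python
from typing import Any, List


class ToyVectorStore:
    """Hypothetical stand-in for a vector store; not the real Qdrant integration."""

    def __init__(self) -> None:
        # Size of each simulated write ("PUT") sent to the backend.
        self.put_sizes: List[int] = []

    def add_texts(self, texts: List[str], batch_size: int = 64, **kwargs: Any) -> None:
        # One simulated PUT per batch; 64 is the assumed store default.
        for start in range(0, len(texts), batch_size):
            self.put_sizes.append(len(texts[start:start + batch_size]))

    def add_documents(self, documents: List[str], **kwargs: Any) -> None:
        # Forwards whatever kwargs it receives on to add_texts().
        self.add_texts(documents, **kwargs)


def toy_index(docs: List[str], store: ToyVectorStore, batch_size: int = 100) -> None:
    """Toy version of an indexing loop that does NOT forward batch_size."""
    for start in range(0, len(docs), batch_size):
        chunk = docs[start:start + batch_size]
        store.add_documents(chunk)  # batch_size is dropped here


def count_puts(document_count: int, index_batch_size: int = 100) -> List[int]:
    """Return the sizes of the simulated PUTs for a given document count."""
    store = ToyVectorStore()
    docs = [f"doc {i}" for i in range(document_count)]
    toy_index(docs, store, batch_size=index_batch_size)
    return store.put_sizes


if __name__ == "__main__":
    print(count_puts(100))  # two writes, split by the store default of 64
    print(count_puts(64))   # a single write
```

With 100 documents the toy store splits the write at its own default of 64, even though the indexing loop was asked for batches of 100.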
Suggested Fix
Update the `vector_store.add_documents()` call in `index()` to include `batch_size=batch_size`:
https://github.com/langchain-ai/langchain/blob/v0.1.13/libs/langchain/langchain/indexes/_api.py#L333
In doing so, the parameter is passed onward through `kwargs` to the final `add_texts` calls.
If you folks are good with this as a fix, I'm happy to open a PR (since this is my first issue on LangChain, I wanted to make sure I'm not barking up the wrong tree).
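
To illustrate the idea behind the suggested fix, here is a small self-contained sketch. `RecordingStore` and `index_with_forwarding` are hypothetical names made up for this example, and the store's default of 64 is assumed from the Qdrant behavior described in this issue. It only shows how a keyword argument travels through `kwargs`; it is not the actual patch to `_api.py`.

```python
from typing import Any, List


class RecordingStore:
    """Hypothetical store that records the size of each simulated write."""

    def __init__(self) -> None:
        self.put_sizes: List[int] = []

    def add_texts(self, texts: List[str], batch_size: int = 64, **kwargs: Any) -> None:
        # Uses the caller's batch_size when given, else the assumed default of 64.
        for start in range(0, len(texts), batch_size):
            self.put_sizes.append(len(texts[start:start + batch_size]))

    def add_documents(self, documents: List[str], **kwargs: Any) -> None:
        # kwargs (including batch_size, if present) flow through to add_texts().
        self.add_texts(documents, **kwargs)


def index_with_forwarding(
    docs: List[str], store: RecordingStore, batch_size: int = 100
) -> None:
    """Toy indexing loop that forwards batch_size to the store."""
    for start in range(0, len(docs), batch_size):
        chunk = docs[start:start + batch_size]
        store.add_documents(chunk, batch_size=batch_size)


def put_sizes_after_fix(document_count: int, batch_size: int) -> List[int]:
    """Return the simulated write sizes when batch_size is forwarded."""
    store = RecordingStore()
    docs = [f"doc {i}" for i in range(document_count)]
    index_with_forwarding(docs, store, batch_size=batch_size)
    return store.put_sizes


if __name__ == "__main__":
    print(put_sizes_after_fix(100, batch_size=100))
```

In this sketch, 100 documents indexed with `batch_size=100` produce a single write instead of being split at the store's default.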
System Info