Trying to use Chroma vectorstore with default embedding_function results in an error #18291

Closed
5 tasks done
VladMstv opened this issue Feb 29, 2024 · 2 comments · Fixed by #19277
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature 🔌: chroma Primarily related to ChromaDB integrations Ɑ: vector store Related to vector store module

Comments

VladMstv commented Feb 29, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

The following code raises the first exception: "ValueError: You must provide an embedding function to compute embeddings."

import chromadb
from chromadb.utils import embedding_functions
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_openai import ChatOpenAI
from langchain_community.vectorstores import Chroma

metadata_field_info = [
    AttributeInfo(
        name="date",
        description="Date description",
        type="integer",
    ),
]
document_content_description = "Content description"
# `key` is the OpenAI API key, defined elsewhere; it must be passed by keyword.
llm = ChatOpenAI(temperature=0, api_key=key)


class ChromaDbInstance:
    def __init__(self) -> None:
        self._client = chromadb.HttpClient(
            host=f"http://localhost:{CHROMA_DB_PORT}",
        )
        self._collection_texts = self._client.get_or_create_collection(
            name=CHROMA_MAIN_COLLECTION_NAME
        )
        self._retriever = SelfQueryRetriever.from_llm(
            llm,
            Chroma(
                client=self._client,
                collection_name=CHROMA_MAIN_COLLECTION_NAME
            ),
            document_content_description,
            metadata_field_info,
        )

    @property
    def count(self):
        return self._collection_texts.count()

    def add_text_to_db(self, text):
        try:
            self._collection_texts.add(
                documents=[text], ids=[str(self.count + 1)]
            )
        except Exception as e:
            print("-- Error adding new text to db --", e)

    def query_db(self, query):
        return self._retriever.invoke(query)

If I pass embedding_function to the Chroma constructor, I get a different error: "AttributeError: 'ONNXMiniLM_L6_V2' object has no attribute 'embed_query'"

Chroma(
    client=self._client,
    collection_name=CHROMA_MAIN_COLLECTION_NAME,
    embedding_function=embedding_functions.DefaultEmbeddingFunction(),
),
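
A possible workaround for this second error, sketched under assumptions: the traceback shows LangChain calling embed_query on the object it was given, while chromadb invokes its embedding functions as fn(input=[...]). A thin adapter can bridge the two call conventions. The class name ChromaEmbeddingAdapter and the stand-in fake_chroma_embedding_function are hypothetical and exist only for illustration; nothing here is part of LangChain or chromadb.

```python
from typing import Callable, List, Sequence


class ChromaEmbeddingAdapter:
    """Hypothetical adapter: exposes embed_documents/embed_query on top of a
    Chroma-style embedding function that is called as fn(input=[...])."""

    def __init__(
        self, chroma_embedding_function: Callable[..., Sequence[Sequence[float]]]
    ) -> None:
        self._fn = chroma_embedding_function

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        vectors = self._fn(input=list(texts))
        # Convert to plain Python floats in case the function returns NumPy values.
        return [[float(x) for x in vector] for vector in vectors]

    def embed_query(self, text: str) -> List[float]:
        # A query is embedded the same way as a single document.
        return self.embed_documents([text])[0]


def fake_chroma_embedding_function(input: List[str]) -> List[List[float]]:
    """Stand-in for a real Chroma embedding function (assumption, for the demo
    only): maps each text to [character count, word count]."""
    return [[len(text), len(text.split())] for text in input]


adapter = ChromaEmbeddingAdapter(fake_chroma_embedding_function)
print(adapter.embed_query("hello world"))
```

In the setup above, the idea would be to pass ChromaEmbeddingAdapter(embedding_functions.DefaultEmbeddingFunction()) as embedding_function. This has not been verified against the versions listed under System Info, and the collection would need to have been populated with the same embedding function for the results to be meaningful.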

Error Message and Stack Trace (if applicable)

Initial case when not providing any embedding_function to langchain_community.vectorstores.Chroma:

Traceback (most recent call last):
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 419, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/routing.py", line 758, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/routing.py", line 778, in app
    await route.handle(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/routing.py", line 299, in handle
    await self.app(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/routing.py", line 79, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/server.py", line 20, in echo_endpoint
    return {"response": db.query_db(req.prompt)}
                        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/modules/vector_db.py", line 72, in query_db
    return self._retriever.invoke(query)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_core/retrievers.py", line 141, in invoke
    return self.get_relevant_documents(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_core/retrievers.py", line 244, in get_relevant_documents
    raise e
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_core/retrievers.py", line 237, in get_relevant_documents
    result = self._get_relevant_documents(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain/retrievers/self_query/base.py", line 186, in _get_relevant_documents
    docs = self._get_docs_with_query(new_query, search_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain/retrievers/self_query/base.py", line 160, in _get_docs_with_query
    docs = self.vectorstore.search(query, self.search_type, **search_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_core/vectorstores.py", line 159, in search
    return self.similarity_search(query, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_community/vectorstores/chroma.py", line 348, in similarity_search
    docs_and_scores = self.similarity_search_with_score(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_community/vectorstores/chroma.py", line 429, in similarity_search_with_score
    results = self.__query_collection(
              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_core/utils/utils.py", line 35, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_community/vectorstores/chroma.py", line 155, in __query_collection
    return self._collection.query(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/chromadb/api/models/Collection.py", line 327, in query
    valid_query_embeddings = self._embed(input=valid_query_texts)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/chromadb/api/models/Collection.py", line 629, in _embed
    raise ValueError(
ValueError: You must provide an embedding function to compute embeddings.https://docs.trychroma.com/embeddings

Case when providing the default embedding_function to the Chroma initializer:

Traceback (most recent call last):
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 419, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/routing.py", line 758, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/routing.py", line 778, in app
    await route.handle(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/routing.py", line 299, in handle
    await self.app(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/routing.py", line 79, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/server.py", line 20, in echo_endpoint
    return {"response": db.query_db(req.prompt)}
                        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/modules/vector_db.py", line 75, in query_db
    return self._retriever.invoke(query)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_core/retrievers.py", line 141, in invoke
    return self.get_relevant_documents(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_core/retrievers.py", line 244, in get_relevant_documents
    raise e
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_core/retrievers.py", line 237, in get_relevant_documents
    result = self._get_relevant_documents(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain/retrievers/self_query/base.py", line 186, in _get_relevant_documents
    docs = self._get_docs_with_query(new_query, search_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain/retrievers/self_query/base.py", line 160, in _get_docs_with_query
    docs = self.vectorstore.search(query, self.search_type, **search_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_core/vectorstores.py", line 159, in search
    return self.similarity_search(query, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_community/vectorstores/chroma.py", line 348, in similarity_search
    docs_and_scores = self.similarity_search_with_score(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/heithvald/Documents/development/projects/project21/python/.venv/lib/python3.11/site-packages/langchain_community/vectorstores/chroma.py", line 437, in similarity_search_with_score
    query_embedding = self._embedding_function.embed_query(query)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'ONNXMiniLM_L6_V2' object has no attribute 'embed_query'

Description

  • I'm trying to use a SelfQueryRetriever with Chroma vector store.
  • I expect it to work either without the embedding_function argument or when I explicitly pass embedding_function=embedding_functions.DefaultEmbeddingFunction() to the Chroma constructor.
  • Instead I get errors when trying to call retriever.invoke(text)

While debugging, I found that the problem is most likely in this line: https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/vectorstores/chroma.py#L128
If nothing were passed as embedding_function, the store would initialize normally and simply query the Chroma collection, and the collection would then call its embedding function correctly inside the chromadb source code: return self._embedding_function(input=input). That would at least work for the default embedding_function provided by chromadb. Please fix it.
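
The behaviour requested above can be sketched as a small dispatch: with no embedding function, hand the raw query text to the collection and let it embed; otherwise embed the query first and pass the vector. This is an illustration under assumptions, not the LangChain implementation. FakeCollection, FakeEmbedder and query_collection are hypothetical names, and only the query_texts, query_embeddings and n_results keywords mirror the real chromadb Collection.query call.

```python
from typing import Any, Dict, List, Optional


class FakeCollection:
    """Hypothetical stand-in for a Chroma collection; reports how it was queried."""

    def query(
        self,
        query_texts: Optional[List[str]] = None,
        query_embeddings: Optional[List[List[float]]] = None,
        n_results: int = 4,
    ) -> Dict[str, Any]:
        # A real collection would embed query_texts with its own embedding function.
        mode = "texts" if query_texts is not None else "embeddings"
        return {"mode": mode, "n_results": n_results}


class FakeEmbedder:
    """Hypothetical LangChain-style embedder exposing embed_query."""

    def embed_query(self, text: str) -> List[float]:
        return [float(len(text))]


def query_collection(collection, query: str, embedding_function=None, k: int = 4):
    """Let the collection embed the query unless an embedder is supplied."""
    if embedding_function is None:
        return collection.query(query_texts=[query], n_results=k)
    query_embedding = embedding_function.embed_query(query)
    return collection.query(query_embeddings=[query_embedding], n_results=k)


without_embedder = query_collection(FakeCollection(), "some prompt")
with_embedder = query_collection(FakeCollection(), "some prompt", FakeEmbedder(), k=2)
```

For the first path to work against a real server, the collection itself must have been created with an embedding function, which is the part the ValueError above complains about.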

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 23.1.0: Mon Oct 9 21:27:24 PDT 2023; root:xnu-10002.41.9~6/RELEASE_ARM64_T6000
Python Version: 3.11.8 (v3.11.8:db85d51d3e, Feb 6 2024, 18:02:37) [Clang 13.0.0 (clang-1300.0.29.30)]

Package Information

langchain_core: 0.1.27
langchain: 0.1.9
langchain_community: 0.0.24
langsmith: 0.1.10
langchain_openai: 0.0.8
chromadb: 0.4.24

@dosubot dosubot bot added Ɑ: vector store Related to vector store module 🔌: chroma Primarily related to ChromaDB integrations 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Feb 29, 2024
eyurtsev pushed a commit that referenced this issue Mar 27, 2024
gkorland pushed a commit to FalkorDB/langchain that referenced this issue Mar 30, 2024
hinthornw pushed a commit that referenced this issue Apr 26, 2024
lourot commented Jul 15, 2024

This fix has been reverted because it introduced the regression #19848, so this problem is back. It's not possible to use DefaultEmbeddingFunction() in this context.

@widarlein

Can confirm, getting this error still
