fix: Update Qdrant support post-refactor #1022

Merged: 1 commit merged into meta-llama:main on Feb 10, 2025

Conversation

jwm4 (Contributor) commented on Feb 9, 2025

What does this PR do?

I tried running the Qdrant provider and found some bugs. See #1021 for details. @terrytangyuan wrote there:

> Please feel free to submit your changes in a PR. I fixed similar issues for pgvector provider. This might be an issue introduced from a refactoring.

So I am submitting this PR.

Closes #1021

Test Plan

Here are the highlights of what I did to test this:

References:
- https://llama-stack.readthedocs.io/en/latest/getting_started/index.html
- https://github.com/meta-llama/llama-stack-apps/blob/main/examples/agents/rag_with_vector_db.py
- https://github.com/meta-llama/llama-stack/blob/main/docs/zero_to_hero_guide/README.md#build-configure-and-run-llama-stack

Install and run Qdrant server:

podman pull qdrant/qdrant
mkdir qdrant-data
podman run -p 6333:6333 -v $(pwd)/qdrant-data:/qdrant/storage qdrant/qdrant
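As a quick sanity check (not part of the PR; it assumes the default REST port is in use), you can confirm the server responds before wiring it into Llama Stack:

```python
# Hypothetical sanity check: hit Qdrant's REST API on the default port (6333).
# A fresh server should report an empty list of collections.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:6333/collections") as resp:
    print(json.load(resp))  # e.g. {"result": {"collections": []}, "status": "ok", ...}
```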

Install and run Llama Stack from the venv-support PR (mainly because I didn't want to install conda):

brew install cmake # Should just need this once

git clone https://github.com/meta-llama/llama-models.git
gh repo clone cdoern/llama-stack
cd llama-stack
gh pr checkout 1018 # This is the checkout that introduces venv support for build/run.  Otherwise you have to use conda.  Eventually this will be part of main, hopefully.

uv sync --extra dev
uv pip install -e .
source .venv/bin/activate
uv pip install qdrant_client

LLAMA_STACK_DIR=$(pwd) LLAMA_MODELS_DIR=../llama-models llama stack build --template ollama --image-type venv
edit llama_stack/templates/ollama/run.yaml

in that editor under:

  vector_io:

add:

  - provider_id: qdrant
    provider_type: remote::qdrant
    config: {}

See https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/vector_io/qdrant/config.py#L14 for config options (I didn't need any).

LLAMA_STACK_DIR=$(pwd) LLAMA_MODELS_DIR=../llama-models llama stack run ollama --image-type venv \
   --port $LLAMA_STACK_PORT \
   --env INFERENCE_MODEL=$INFERENCE_MODEL \
   --env SAFETY_MODEL=$SAFETY_MODEL \
   --env OLLAMA_URL=$OLLAMA_URL

Then I tested it out in a notebook. Key highlights included:

qdrant_provider = None
for provider in client.providers.list():
    if provider.api == "vector_io" and provider.provider_id == "qdrant":
        qdrant_provider = provider
qdrant_provider
assert qdrant_provider is not None, "QDrant is not a provider.  You need to edit the run yaml file you use in your `llama stack run` call"

vector_db_id = f"test-vector-db-{uuid.uuid4().hex}"
client.vector_dbs.register(
    vector_db_id=vector_db_id,
    embedding_model="all-MiniLM-L6-v2",
    embedding_dimension=384,
    provider_id=qdrant_provider.provider_id,
)
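As an optional follow-up check (not from the PR; it assumes the client exposes `vector_dbs.list()` as in the getting-started guide and that the listed entries carry an `identifier` field), you can confirm the registration took effect:

```python
# Hypothetical verification that the vector DB registered above is now visible.
# The `identifier` attribute name is an assumption about the returned objects.
registered_ids = [db.identifier for db in client.vector_dbs.list()]
assert vector_db_id in registered_ids, f"{vector_db_id} was not registered with the qdrant provider"
```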

Other than that, I just followed what was in https://llama-stack.readthedocs.io/en/latest/getting_started/index.html

It would be good to have automated tests for this in the future, but that would be a big undertaking.

terrytangyuan (Collaborator) left a comment:

Thank you! Can you run tests/vector_io/test_vector_io.py and paste the results? See the top of the file for an example command (you'll need to modify the provider name and env var though).

jwm4 (Contributor, Author) commented on Feb 10, 2025

I am looking at https://github.com/jwm4/llama-stack/blob/main/tests/client-sdk/vector_io/test_vector_io.py and I don't see a command at the top of the file. However, this seems to work:

I replace all occurrences of "faiss" in "tests/client-sdk/vector_io/test_vector_io.py" with "qdrant". I still have Qdrant running and I relaunch the Llama Stack server. Then I run:

LLAMA_STACK_BASE_URL=http://localhost:5001 INFERENCE_MODEL=llama3.2:3b-instruct-fp16 pytest tests/client-sdk/vector_io/test_vector_io.py

And I get:

/Users/bmurdock/llamastack/cdoern-venv-supprt/llama-stack/.venv/lib/python3.13/site-packages/pytest_asyncio/plugin.py:207: PytestDeprecationWarning: The configuration option "asyncio_default_fixture_loop_scope" is unset.
The event loop scope for asynchronous fixtures will default to the fixture caching scope. Future versions of pytest-asyncio will default the loop scope for asynchronous fixtures to function scope. Set the default fixture loop scope explicitly in order to avoid unexpected behavior in the future. Valid fixture loop scopes are: "function", "class", "module", "package", "session"

  warnings.warn(PytestDeprecationWarning(_DEFAULT_FIXTURE_LOOP_SCOPE_UNSET))
========================================================================================================================================== test session starts ==========================================================================================================================================
platform darwin -- Python 3.13.1, pytest-8.3.4, pluggy-1.5.0
rootdir: /Users/bmurdock/llamastack/cdoern-venv-supprt/llama-stack
configfile: pyproject.toml
plugins: asyncio-0.25.3, anyio-4.8.0, nbval-0.11.0
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=None
collected 4 items

tests/client-sdk/vector_io/test_vector_io.py ....                                                                                                                                                                                                                                                 [100%]

===================================================================================================================================== 4 passed, 1 warning in 0.06s ======================================================================================================================================

Note the `4 passed, 1 warning in 0.06s` at the end.

terrytangyuan (Collaborator) commented.
jwm4 (Contributor, Author) commented on Feb 10, 2025

Ah, I had not seen that there was another test directory! Anyway, I run:

uv pip install pytest_html
pytest llama_stack/providers/tests/vector_io/test_vector_io.py -m "qdrant" --env EMBEDDING_DIMENSION=384 -v -s --tb=short --disable-warnings

and I get a bunch of errors of the form:

fixture 'vector_io_qdrant' not found

I will work on adding this fixture to llama_stack/providers/tests/vector_io/fixtures.py.

terrytangyuan (Collaborator) commented:
Sounds good. Thanks!

jwm4 (Contributor, Author) commented on Feb 10, 2025

I haven't had much success so far. I will keep hacking away at it, but I wanted to post an update here.

I add the following fixture to llama_stack/providers/tests/vector_io/fixtures.py:

@pytest.fixture(scope="session")
def vector_io_qdrant() -> ProviderFixture:
    url = os.getenv("QDRANT_URL")
    if url:
        config = QdrantConfig(url=url)
        provider_type = "remote::qdrant"
    else:
        raise ValueError("QDRANT_URL must be set")
    return ProviderFixture(
        providers=[
            Provider(
                provider_id="qdrant",
                provider_type=provider_type,
                config=config.model_dump(),
            )
        ]
    )
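For the qdrant mark to resolve to this fixture, the provider name presumably also has to be registered in the module's fixture list. A minimal sketch, assuming fixtures.py follows the same pattern as the other vector_io providers (the VECTOR_IO_FIXTURES name and its existing entries are assumptions, not confirmed from the file):

```python
# Hypothetical: register the new fixture name alongside the existing ones so that
# conftest.py can construct the "vector_io_qdrant" fixture from the provider name.
VECTOR_IO_FIXTURES = ["faiss", "pgvector", "weaviate", "chroma", "qdrant"]
```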

I run the tests and get the following error:

.venv/lib/python3.13/site-packages/pytest_asyncio/plugin.py:375: in _async_fixture_wrapper
    result = event_loop.run_until_complete(setup_task)
../../../.local/share/uv/python/cpython-3.13.1-macos-aarch64-none/lib/python3.13/asyncio/base_events.py:720: in run_until_complete
    return future.result()
.venv/lib/python3.13/site-packages/pytest_asyncio/plugin.py:370: in setup
    res = await func(**_add_kwargs(func, kwargs, event_loop, request))
llama_stack/providers/tests/vector_io/fixtures.py:149: in vector_io_stack
    test_stack = await construct_stack_for_test(
llama_stack/providers/tests/resolver.py:67: in construct_stack_for_test
    impls = await construct_stack(run_config, get_provider_registry())
llama_stack/distribution/stack.py:202: in construct_stack
    impls = await resolve_impls(run_config, provider_registry or get_provider_registry(), dist_registry)
llama_stack/distribution/resolver.py:230: in resolve_impls
    impl = await instantiate_provider(
llama_stack/distribution/resolver.py:317: in instantiate_provider
    impl = await fn(*args)
llama_stack/providers/remote/inference/bedrock/__init__.py:14: in get_adapter_impl
    impl = BedrockInferenceAdapter(config)
llama_stack/providers/remote/inference/bedrock/bedrock.py:71: in __init__
    self._client = create_bedrock_client(config)
llama_stack/providers/utils/bedrock/client.py:72: in create_bedrock_client
    .refreshable_session()
llama_stack/providers/utils/bedrock/refreshable_boto_session.py:103: in refreshable_session
    metadata=self.__get_session_credentials(),
llama_stack/providers/utils/bedrock/refreshable_boto_session.py:85: in __get_session_credentials
    session_credentials = session.get_credentials().get_frozen_credentials()
E   AttributeError: 'NoneType' object has no attribute 'get_frozen_credentials'

So why do I get an error in Bedrock when the other vector DB tests do not? That seems to be because DEFAULT_PROVIDER_COMBINATIONS in conftest.py has:

    pytest.param(
        {
            "inference": "bedrock",
            "vector_io": "qdrant",
        },
        id="qdrant",
        marks=pytest.mark.qdrant,
    ),

None of the other vector DBs are paired with Bedrock. FWIW, I tried replacing this with ollama, sentence_transformers, or fireworks, which are used by the other vector DBs, and none of those worked either (see the sketch at the end of this comment). ollama fails with:

ValueError: Model 'all-minilm:l6-v2' is not available in Ollama. Available models:

sentence_transformers fails with:

fixture 'inference_sentence_transformers' not found

and fireworks fails with:

E   Missing FIREWORKS_API_KEY in environment. Please set it using one of these methods:
E   1. Export in shell: export FIREWORKS_API_KEY=your-key
E   2. Create .env file in project root with: FIREWORKS_API_KEY=your-key
E   3. Pass directly to pytest: pytest --env FIREWORKS_API_KEY=your-key

I think I will try adding a fixture for sentence_transformers since it seems the closest to being viable, but I really don't understand the fundamentals of what these tests are doing (e.g., why were these pairs of inference and vector DBs chosen?). So I feel like I may be just thrashing around here.
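For reference, the ollama substitution mentioned above amounts to swapping the inference key in that DEFAULT_PROVIDER_COMBINATIONS entry, roughly as follows (a sketch of the attempted edit, not merged code):

```python
# Hypothetical variant of the conftest.py entry: pair the qdrant vector_io provider
# with ollama for inference instead of bedrock. As described above, this pairing
# initially failed with the all-minilm model error.
pytest.param(
    {
        "inference": "ollama",
        "vector_io": "qdrant",
    },
    id="qdrant",
    marks=pytest.mark.qdrant,
),
```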

jwm4 (Contributor, Author) commented on Feb 10, 2025

OK, I couldn't get anywhere with inference_sentence_transformers. I added the fixture, but it just fails with ValueError: Provider 'inline::sentence_transformers' is not available for API 'Api.inference', which makes sense: why would we expect sentence_transformers to do inference anyway? I don't understand what the existing sentence_transformers/chroma pairing is intended to do, so I went back to ollama as the inference provider and tried:

ollama pull all-minilm:l6-v2
curl http://localhost:11434/api/embeddings -d '{"model": "all-minilm", "prompt": "Hello world"}'

Now I get:

ValueError: Model 'all-minilm:l6-v2' is not available in Ollama. Available models: llama3.2:3b-instruct-fp16, all-minilm:latest

This seems like a defect since all-minilm:l6-v2 and all-minilm:latest seem to be synonyms. But I can work around it by running:

EMBEDDING_DIMENSION=384 QDRANT_URL=http://localhost pytest llama_stack/providers/tests/vector_io/test_vector_io.py -m "qdrant" -v -s --tb=short --embedding-model all-minilm:latest --disable-warnings

That gives me:

/Users/bmurdock/llamastack/cdoern-venv-supprt/llama-stack/.venv/lib/python3.13/site-packages/pytest_asyncio/plugin.py:207: PytestDeprecationWarning: The configuration option "asyncio_default_fixture_loop_scope" is unset.
The event loop scope for asynchronous fixtures will default to the fixture caching scope. Future versions of pytest-asyncio will default the loop scope for asynchronous fixtures to function scope. Set the default fixture loop scope explicitly in order to avoid unexpected behavior in the future. Valid fixture loop scopes are: "function", "class", "module", "package", "session"

  warnings.warn(PytestDeprecationWarning(_DEFAULT_FIXTURE_LOOP_SCOPE_UNSET))
==================================================================================================================================================================== test session starts ====================================================================================================================================================================
platform darwin -- Python 3.13.1, pytest-8.3.4, pluggy-1.5.0 -- /Users/bmurdock/llamastack/cdoern-venv-supprt/llama-stack/.venv/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.13.1', 'Platform': 'macOS-15.3-arm64-arm-64bit-Mach-O', 'Packages': {'pytest': '8.3.4', 'pluggy': '1.5.0'}, 'Plugins': {'html': '4.1.1', 'metadata': '3.1.1', 'asyncio': '0.25.3', 'anyio': '4.8.0', 'nbval': '0.11.0'}}
rootdir: /Users/bmurdock/llamastack/cdoern-venv-supprt/llama-stack
configfile: pyproject.toml
plugins: html-4.1.1, metadata-3.1.1, asyncio-0.25.3, anyio-4.8.0, nbval-0.11.0
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=None
collected 18 items / 15 deselected / 3 selected

llama_stack/providers/tests/vector_io/test_vector_io.py::TestVectorIO::test_banks_list[-qdrant] key_fixture_dict[key] inference_ollama
key_fixture_dict[key] vector_io_qdrant
*** providers: {'inference': [Provider(provider_id='ollama', provider_type='remote::ollama', config={'url': 'http://localhost:11434'})], 'vector_io': [Provider(provider_id='qdrant', provider_type='remote::qdrant', config={'location': None, 'url': 'http://localhost', 'port': 6333, 'grpc_port': 6334, 'prefer_grpc': False, 'https': None, 'api_key': None, 'prefix': None, 'timeout': None, 'host': None, 'path': None})]}
*** provider_data: {}
PASSED
llama_stack/providers/tests/vector_io/test_vector_io.py::TestVectorIO::test_banks_register[-qdrant] PASSED
llama_stack/providers/tests/vector_io/test_vector_io.py::TestVectorIO::test_query_documents[-qdrant] The scores are: [0.25060147, 0.23422563, 0.22863364]
PASSED

======================================================================================================================================================= 3 passed, 15 deselected, 2 warnings in 1.02s ========================================================================================================================================================

So 3 tests pass and 15 are deselected. @terrytangyuan, is that what is intended here?

terrytangyuan (Collaborator) left a comment:

This should be good for now. Thanks! Feel free to start a separate issue to track issues with running the tests.

terrytangyuan merged commit 3856927 into meta-llama:main on Feb 10, 2025 (6 checks passed).
kaushik-himself pushed a commit to fiddlecube/llama-stack that referenced this pull request Feb 10, 2025
hardikjshah pushed a commit that referenced this pull request Feb 13, 2025
# What does this PR do?

This is a follow-on to #1022. It includes the changes I needed to be
able to test the Qdrant support as requested by @terrytangyuan.

I uncovered a lot of bigger, more systemic issues with the vector DB
testing and I will open a new issue for those. For now, I am just
delivering the work I already did on that.

## Test Plan

As discussed on #1022:

```
podman pull qdrant/qdrant
mkdir qdrant-data
podman run -p 6333:6333 -v $(pwd)/qdrant-data:/qdrant/storage qdrant/qdrant
```


```
ollama pull all-minilm:l6-v2
curl http://localhost:11434/api/embeddings -d '{"model": "all-minilm", "prompt": "Hello world"}'
```

```
EMBEDDING_DIMENSION=384 QDRANT_URL=http://localhost pytest llama_stack/providers/tests/vector_io/test_vector_io.py -m "qdrant" -v -s --tb=short --embedding-model all-minilm:latest --disable-warnings
```

These show 3 tests passing and 15 deselected, which is presumably working
as intended.

---------

Signed-off-by: Bill Murdock <[email protected]>
franciscojavierarceo pushed a commit to franciscojavierarceo/llama-stack that referenced this pull request Feb 14, 2025
Labels: CLA Signed (managed by the Meta Open Source bot)
Linked issue: Qdrant vector db support seems to be broken (#1021)