Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow fastembed embedding model params configuration in parity with fastembed APIs #416

Merged
merged 12 commits into from
Jan 7, 2024

Conversation

praveen-palanisamy
Copy link
Contributor

This PR allows the user to configure the following parameters for the fastembed Embeddings via the qdrant_client.set_model_params(...) method:

max_length (int, optional): The maximum number of tokens.
cache_dir (str, optional): The path to the cache directory.
threads (int, optional): The number of threads single onnxruntime session can use.

in addition to the embedding_model_name parameter. This brings parity with the init args of fastembed's Embedding classes: DefaultEmbedding, FlagEmbedding and JinaEmbedding. This would allow a clean migration for existing fastembed users and switch to the simpler qdrant_client.add(...) API.

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you installed pre-commit with pip3 install pre-commit and set up hooks with pre-commit install?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully ran tests with your changes locally?

Copy link

netlify bot commented Jan 1, 2024

Deploy Preview for poetic-froyo-8baba7 ready!

Name Link
🔨 Latest commit 76509bf
🔍 Latest deploy log https://app.netlify.com/sites/poetic-froyo-8baba7/deploys/659adbc8dc8c950009fa6be1
😎 Deploy Preview https://deploy-preview-416--poetic-froyo-8baba7.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@DiTo97
Copy link

DiTo97 commented Jan 5, 2024

@praveen-palanisamy, when will it be merged?

@joein
Copy link
Member

joein commented Jan 5, 2024

Hi @praveen-palanisamy , thanks for your contribution!

Hi, @DiTo97 I'll try to review it today and see whether we can merge it or not

@praveen-palanisamy
Copy link
Contributor Author

@DiTo97 : Thanks for your interest and the ping. I am waiting for the review/approval before this can be merged. I'll follow-it up.

@joein : Do you have any feedback/review comments to finalize & merge this PR?

@joein
Copy link
Member

joein commented Jan 6, 2024

Hi @praveen-palanisamy

Yeap, I have some comments

Firstly, I'd like to avoid changing names of the existing API

Secondly, I would like to have **kwargs in the method's signature

Tbh, I am not even sure at the moment if we actually need to have max_length, cache_dir and threads as explicit params in the signature, or not. I need to get some context from the team.

P.S. We use python3.10 to generate async client (there is a difference with parentheses starting from python3.11, CI will fail if you generate it with python != 3.10)

@joein
Copy link
Member

joein commented Jan 6, 2024

Related #360

@praveen-palanisamy
Copy link
Contributor Author

@joein Thanks for your comments.

Firstly, I'd like to avoid changing names of the existing API

  1. Okay. I proposed to change the name from set_model to set_model_params because it was more appropriate since this method is not only setting the model but also the associated config params. This new name also matched with the other existing (internal) method naming: _get_model_params. I can revert the name change if you prefer to keep it unchanged.

Secondly, I would like to have **kwargs in the method's signature

  1. Okay. I will update the signature. I am keeping the positional args as is for now with the **kwargs unused. Please see response below:

Tbh, I am not even sure at the moment if we actually need to have max_length, cache_dir and threads as explicit params in the signature, or not. I need to get some context from the team.

  1. Are you suggesting to just fold them all into **kwargs? They are kept as positional arguments in the same order as the underlying init args of fastembed's Embedding classes: DefaultEmbedding, FlagEmbedding and JinaEmbedding. Keeping this parity is preferred from a user point of view. I guess the fastembed API signature may change as it is early in its development cycle. If we want to avoid updating this signature frequently and bring back the parity at a later stage, it makes sense to me. Please confirm the change request after your sync with the team.

P.S. We use python3.10 to generate async client (there is a difference with parentheses starting from python3.11, CI will fail if you generate it with python != 3.10)

  1. Thanks for the tip! I just noticed that the CI runs this check only with Python 3.10.x while the test matrix includes Python versions >=3.8 <=3.12. This check passes with Python 3.11 as well. Any particular reason why Python 3.10.x alone was chosen to run that check/test?

Thanks for the pointer to # 360. That was not enough to allow me as a user to use qdrant_client.add(...) with the desired ONNX session option.

qdrant_client/qdrant_fastembed.py Show resolved Hide resolved
qdrant_client/qdrant_fastembed.py Show resolved Hide resolved
qdrant_client/qdrant_fastembed.py Show resolved Hide resolved
@joein
Copy link
Member

joein commented Jan 7, 2024

@praveen-palanisamy

  1. Yes, I prefer to avoid changes to the existing interface, if there is no necessity.
    2-3. I think that we can have these parameters as keyword arguments, it is not hard to hide them inside **kwargs later if there is a need for this.
  2. As I've told, python3.10 and python3.11 generates different code (3.10 adds parentheses, 3.11 does not). We can't have this check turned on for 3.10 and 3.11 simultaneously, since only one version can pass at the same moment. 3.10 was chosen due to the problems with gRPC generator.

@praveen-palanisamy
Copy link
Contributor Author

praveen-palanisamy commented Jan 7, 2024

@joein : Thanks for the comments and responses! I have resolved the remaining items and also re-run the pre-commit hooks to reformat using Python 3.10.x.
Please let me know if any more changes are required before final CI check & merge.

@joein
Copy link
Member

joein commented Jan 7, 2024

@praveen-palanisamy

Fine by me, thank you for the contribution!

@joein joein self-requested a review January 7, 2024 17:48
@joein joein merged commit 70754db into qdrant:dev Jan 7, 2024
9 checks passed
joein pushed a commit that referenced this pull request Jan 19, 2024
…astembed APIs (#416)

* Update set_model -> set_model_params;
Allow setting all fastembed.DefaultEmbedding init args for full pass-through

* Add fastembed test for setting custom embedding model params

* Update async client gen to include set_model_params

* Update Async client APIs

* Add explicit Optional since PEP 484 is still in effect

* Regenerate async client APIs

* Revert name change: set_model_params -> set_model

* Add **kwargs to set_model method sig

* Update name set_model_params -> set_model

* Regenerate async client APIs

* Use and pass **kwargs

* Regenerate async client APIs using Py 3.10.x
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants