Upgrade FastEmbed Version #493

Merged
merged 13 commits into from
Mar 5, 2024

Conversation

NirantK
Contributor

@NirantK NirantK commented Feb 15, 2024

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you installed pre-commit with pip3 install pre-commit and set up hooks with pre-commit install?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?


netlify bot commented Feb 15, 2024

Deploy Preview for poetic-froyo-8baba7 ready!

Name Link
🔨 Latest commit 28881be
🔍 Latest deploy log https://app.netlify.com/sites/poetic-froyo-8baba7/deploys/65e778d0e802e10008677e55
😎 Deploy Preview https://deploy-preview-493--poetic-froyo-8baba7.netlify.app

"BAAI/bge-small-en-v1.5": (384, models.Distance.COSINE),
"BAAI/bge-base-en-v1.5": (768, models.Distance.COSINE),
"intfloat/multilingual-e5-large": (1024, models.Distance.COSINE),
}


class QdrantFastembedMixin(QdrantBase):
-    DEFAULT_EMBEDDING_MODEL = "BAAI/bge-small-en"
+    DEFAULT_EMBEDDING_MODEL = "BAAI/bge-small-en-v1.5"
Member


I don't think we can just silently replace the default model

Member


We need to make at least one major release which warns that we are going to change the model
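One way to announce a default change ahead of time is to emit a `DeprecationWarning` whenever a caller relies on the default. The sketch below assumes hypothetical names (`CURRENT_DEFAULT_MODEL`, `UPCOMING_DEFAULT_MODEL`, `resolve_embedding_model`); it is not the code that was merged in this PR.

```python
import warnings
from typing import Optional

# Hypothetical constants for illustration; not the actual qdrant-client attributes.
CURRENT_DEFAULT_MODEL = "BAAI/bge-small-en"
UPCOMING_DEFAULT_MODEL = "BAAI/bge-small-en-v1.5"


def resolve_embedding_model(model_name: Optional[str] = None) -> str:
    """Return the model to use, warning only when the caller relies on the default."""
    if model_name is not None:
        # An explicit choice is unaffected by the planned default change.
        return model_name
    warnings.warn(
        f"The default embedding model will change from {CURRENT_DEFAULT_MODEL!r} "
        f"to {UPCOMING_DEFAULT_MODEL!r} in a future release; pass a model name "
        "explicitly to keep the current behaviour.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return CURRENT_DEFAULT_MODEL
```

Users who already pass a model name see no warning, so only code that depends on the default is notified.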


SUPPORTED_EMBEDDING_MODELS: Dict[str, Tuple[int, models.Distance]] = {
"BAAI/bge-base-en": (768, models.Distance.COSINE),
Member


We can't just silently remove models that users may already be using.
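A possible way to address this is to keep removed model names resolvable for a deprecation period. The sketch below is an assumption, not the merged change: `DEPRECATED_EMBEDDING_MODELS` and `get_model_params` are hypothetical names, and plain strings stand in for `models.Distance` so that it runs without qdrant-client installed.

```python
import warnings
from typing import Dict, Tuple

# Models listed in the new version of the mapping (dimensions taken from the diff).
SUPPORTED_EMBEDDING_MODELS: Dict[str, Tuple[int, str]] = {
    "BAAI/bge-small-en-v1.5": (384, "Cosine"),
    "BAAI/bge-base-en-v1.5": (768, "Cosine"),
    "intfloat/multilingual-e5-large": (1024, "Cosine"),
}

# Hypothetical: names scheduled for removal remain usable, but warn on use.
DEPRECATED_EMBEDDING_MODELS: Dict[str, Tuple[int, str]] = {
    "BAAI/bge-base-en": (768, "Cosine"),
}


def get_model_params(model_name: str) -> Tuple[int, str]:
    """Look up (vector size, distance) for a model, warning on deprecated names."""
    if model_name in SUPPORTED_EMBEDDING_MODELS:
        return SUPPORTED_EMBEDDING_MODELS[model_name]
    if model_name in DEPRECATED_EMBEDDING_MODELS:
        warnings.warn(
            f"Model {model_name!r} is deprecated and will be removed in a future release.",
            DeprecationWarning,
            stacklevel=2,
        )
        return DEPRECATED_EMBEDDING_MODELS[model_name]
    raise ValueError(f"Unsupported embedding model: {model_name!r}")
```

Existing collections built with an older model keep working, while the warning gives users time to migrate.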

@joein joein force-pushed the nirant/upgrade-fastembed-version branch from 0803fb5 to 34f6037 March 2, 2024 22:32

Member

joein commented Mar 2, 2024

We need to fix our tests to run them without fastembed installed as well
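A minimal sketch of how a test suite could cover both situations, with and without the optional dependency. It uses only the standard library; `FASTEMBED_INSTALLED`, `require_fastembed`, and the test class are hypothetical names, not the project's actual test code.

```python
import importlib.util
import unittest

# Detect the optional dependency without importing it.
FASTEMBED_INSTALLED = importlib.util.find_spec("fastembed") is not None


def require_fastembed() -> None:
    """Raise a clear error when an embedding feature is used without fastembed."""
    if not FASTEMBED_INSTALLED:
        raise ImportError(
            "fastembed is not installed. Install it with `pip install fastembed`."
        )


class EmbeddingFeatureTests(unittest.TestCase):
    @unittest.skipUnless(FASTEMBED_INSTALLED, "fastembed is not installed")
    def test_with_fastembed(self) -> None:
        require_fastembed()  # must not raise when the package is available

    @unittest.skipIf(FASTEMBED_INSTALLED, "fastembed is installed")
    def test_without_fastembed(self) -> None:
        # The missing dependency should surface as an ImportError, not a crash at import time.
        with self.assertRaises(ImportError):
            require_fastembed()
```

Exactly one of the two tests runs in any given environment, so the suite passes whether or not fastembed is present.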

Member

joein commented Mar 3, 2024

#522

@NirantK NirantK changed the title from "Nirant/upgrade-fastembed-version" to "Upgrade FastEmbed Version" Mar 4, 2024
@joein joein force-pushed the nirant/upgrade-fastembed-version branch from 510efbc to 9323c0a March 4, 2024 15:40
"BAAI/bge-small-en-v1.5": (384, models.Distance.COSINE),
"BAAI/bge-base-en-v1.5": (768, models.Distance.COSINE),
"intfloat/multilingual-e5-large": (1024, models.Distance.COSINE),
}


class AsyncQdrantFastembedMixin(AsyncQdrantBase):
DEFAULT_EMBEDDING_MODEL = "BAAI/bge-small-en"
embedding_models: Dict[str, "DefaultEmbedding"] = {}
DEFAULT_EMBEDDING_MODEL = "BAAI/bge-small-en-v1.5"
Member


I don't feel so good about this change. We should deprecate things, not suddenly disable defaults

@generall generall requested a review from joein March 5, 2024 18:39
@joein joein merged commit fc4b3cf into dev Mar 5, 2024
14 checks passed
joein added a commit that referenced this pull request Mar 5, 2024
* Update fastembed to v0.2.1

* chore(qdrant_fastembed.py): update DEFAULT_EMBEDDING_MODEL

* fix(fastembed integration): upgrade to latest version

* Prefer black over ruff

* Prefer black over ruff

* Remove hardcoded directory structure from Qdrant Client checks

* new: deprecate current default model, deprecate max token length, update fastembed

* fix: make embedding_model_name method sync

* fix: update poetry lock

* refactor: use list_supported_models() (#501)

* fix: fix fastembed check

* fix: fix fastembed class var assignment

* fix: remove fastembed deprecation from qdrant client (#524)

---------

Co-authored-by: George Panchuk <[email protected]>
Co-authored-by: Anush <[email protected]>
@NirantK NirantK deleted the nirant/upgrade-fastembed-version branch March 6, 2024 09:05

4 participants