-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade FastEmbed Version #493
Conversation
✅ Deploy Preview for poetic-froyo-8baba7 ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
qdrant_client/qdrant_fastembed.py
Outdated
"BAAI/bge-small-en-v1.5": (384, models.Distance.COSINE), | ||
"BAAI/bge-base-en-v1.5": (768, models.Distance.COSINE), | ||
"intfloat/multilingual-e5-large": (1024, models.Distance.COSINE), | ||
} | ||
|
||
|
||
class QdrantFastembedMixin(QdrantBase): | ||
DEFAULT_EMBEDDING_MODEL = "BAAI/bge-small-en" | ||
DEFAULT_EMBEDDING_MODEL = "BAAI/bge-small-en-v1.5" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can just silently replace the default model
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to make at least one major release which warns that we are going to change the model
|
||
SUPPORTED_EMBEDDING_MODELS: Dict[str, Tuple[int, models.Distance]] = { | ||
"BAAI/bge-base-en": (768, models.Distance.COSINE), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can't just silently remove models which could have already been used by users
0803fb5
to
34f6037
Compare
We need to fix our tests to run them without fastembed installed as well |
510efbc
to
9323c0a
Compare
"BAAI/bge-small-en-v1.5": (384, models.Distance.COSINE), | ||
"BAAI/bge-base-en-v1.5": (768, models.Distance.COSINE), | ||
"intfloat/multilingual-e5-large": (1024, models.Distance.COSINE), | ||
} | ||
|
||
|
||
class AsyncQdrantFastembedMixin(AsyncQdrantBase): | ||
DEFAULT_EMBEDDING_MODEL = "BAAI/bge-small-en" | ||
embedding_models: Dict[str, "DefaultEmbedding"] = {} | ||
DEFAULT_EMBEDDING_MODEL = "BAAI/bge-small-en-v1.5" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel so good about this change. We should deprecate things, not suddenly disable defaults
* Update fastembed to v0.2.1 * chore(qdrant_fastembed.py): update DEFAULT_EMBEDDING_MODEL * fix(fastembed integration): upgrade to latest version * Prefer black over ruff * Prefer black over ruff * Remove hardcoded directory structure from Qdrant Client checks * new: deprecate current default model, deprecate max token length, update fastembed * fix: make embedding_model_name method sync * fix: update poetry lock * refactor: use list_supported_models() (#501) * fix: fix fastembed check * fix: fix fastembed class var assignment * fix: remove fastembed deprecation from qdrant client (#524) --------- Co-authored-by: George Panchuk <[email protected]> Co-authored-by: Anush <[email protected]>
All Submissions:
dev
branch. Did you create your branch fromdev
?New Feature Submissions:
pre-commit
withpip3 install pre-commit
and set up hooks withpre-commit install
?Changes to Core Features: