-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for multiple spaces #457
Conversation
@@ -128,7 +132,7 @@ def add(self, ids, embeddings, update=False): | |||
|
|||
labels = [] | |||
for id in ids: | |||
if id in self._id_to_label: | |||
if hexid(id) in self._id_to_label: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the bugfix
@@ -141,7 +145,9 @@ def add(self, ids, embeddings, update=False): | |||
labels.append(next_label) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is weird that sometimes we store the UUID vs the id but leaving this for now to minimize the bugs i might introduce.
@@ -71,7 +75,7 @@ class Hnswlib(Index): | |||
_index_metadata: IndexMetadata | |||
_params: HnswParams | |||
_id_to_label: Dict[str, int] | |||
_label_to_id: Dict[int, str] | |||
_label_to_id: Dict[int, UUID] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See below comment on how its odd that we store the string and the uuid in opposite direction. Updating the typing.
@@ -172,7 +176,7 @@ def persist_generated_data_with_old_version( | |||
coll = api.create_collection( | |||
name=collection_strategy.name, | |||
metadata=collection_strategy.metadata, | |||
embedding_function=collection_strategy.embedding_function, | |||
embedding_function=lambda x: None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes the tests faster because we pass none across the process boundary because we can't pickle a lambda
Description of changes
Add support for multiple distance metrics in tests.
In the process of writing this test I discovered a bug with our implementation of update that was revealed by the inner product space. Since the inner product is not a true metric, a point may not be a neighbor to itself. Our update code was strictly appending to the index due to the a bug with how we manage string UUID vs UUID objects. In l2 and cosine spaces, this usually was fine in the eyes of tests since the results returned were correct with the updated data. But IP exacerbated the issue by making the results not always be the same point.
Test plan
These are tests!
Documentation Changes
None required!