-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes for collections property tests #328
Fixes for collections property tests #328
Conversation
@@ -143,6 +143,9 @@ def create_collection( | |||
|
|||
if len(dupe_check) > 0: | |||
if get_or_create: | |||
if dupe_check[0][2] != metadata: | |||
self.update_collection(name, new_name=name, new_metadata=metadata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to update the documentation with this behavior? (get_or_create will update metadata)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this is strictly more intuitive than the current behavior of just dropping it, so I am not sure we need it in the tutorials or overview docs.
I just pushed a change to add it ito the docstrings so it will be present in generated API docs when we have those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to add to this https://docs.trychroma.com/usage-guide#creating-inspecting-and-deleting-collections a note on the behavior of metadata. Better than leaving it implicit IMO - wdyt? We already have an example for modify() at the bottom of that section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will add that before merging this.
chromadb/api/models/Collection.py
Outdated
@@ -29,6 +29,9 @@ | |||
from chromadb.api import API | |||
|
|||
|
|||
default_embedding_function = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be cleaner to push the singleton down to the EF level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean have a separate singleton for each possible EF? We certainly could: but I'm not sure it's applicable in all cases. For example, in the multi-tenant hosted version, users might entrust us with their OpenAI API keys and we would definitely not want those in a singleton. This is probably only appropriate for the default embedding function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No - just for the default we'd change the class so it only has a singleton model, rather than making the singleton live at this level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, make sense, will do.
@@ -82,6 +82,10 @@ def create_collection( | |||
dupe_check = self.get_collection(name) | |||
if len(dupe_check) > 0: | |||
if get_or_create is True: | |||
if dupe_check[0][2] != metadata: | |||
self.update_collection(name, new_name=name, new_metadata=metadata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally - will this be the same behavior as upsert with embedding metadata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope so. That won't get merged until after this, so I think we should argue that "replace" makes more sense than "merge" across the board.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok great. Let's make sure that what we do.
This reverts commit f48a07a. Going to use a different approach.
Saves a lot of time during testing by not re-constructing them all the time.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I think @HIMANSHUSINGHYANIA is referencing above some of the areas we reference collection.modify and create, my suggestion was to add the metadata to those calls as a quick clarification. Documentation should always be in scope if we are modifying code and it is undocumented - otherwise we'll rarely get to it.
Description of changes
get_or_create
), update the metadata if it's different from the previous value.Test plan
These changes are to satisfy the new property-based tests for Collections, which now pass.
Documentation Changes
No user facing changes; documented behavior was not changed.