Validate add to prevent duplicate IDs. #363

Merged: levand merged 10 commits into lukev/embeddings-stateful-tests from lukev/validate_add on Apr 17, 2023

levand (Contributor) commented Apr 16, 2023

Description of changes

Raise an error if the user attempts to add duplicate IDs, whether the duplicates arrive in separate calls to Collection.add or within a single call.
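
A minimal sketch of the two cases described above, assuming a hypothetical in-memory `SketchCollection` class. It is not the chromadb implementation, and the message wording is invented for illustration. `add` rejects IDs repeated within one call and IDs already stored by an earlier call, and it validates before storing anything so that a failed call leaves the collection unchanged.

```python
from collections import Counter
from typing import List, Set


class SketchCollection:
    """Hypothetical stand-in for a collection; not the chromadb class."""

    def __init__(self) -> None:
        self.ids: Set[str] = set()

    def add(self, ids: List[str]) -> None:
        # Case 1: the same ID appears more than once in this call.
        repeated = sorted(i for i, n in Counter(ids).items() if n > 1)
        if repeated:
            raise ValueError(f"duplicate IDs within one add call: {repeated}")

        # Case 2: an ID was already stored by an earlier call.
        existing = sorted(self.ids.intersection(ids))
        if existing:
            raise ValueError(f"IDs already exist in collection: {existing}")

        # Only mutate state once both checks have passed.
        self.ids.update(ids)
```

With this sketch, `add(["a", "b"])` followed by `add(["b", "c"])` raises on the second call, and so does a single `add(["x", "x"])`.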

Test plan

This branch is based on #355, which contains failing tests for this functionality.

Documentation Changes

chroma-core/docs#42

levand requested review from atroyn and jeffchuber, April 16, 2023 16:47
levand changed the title from "Validate add" to "Validate add to prevent duplicate IDs.", Apr 16, 2023
levand requested a review from HammadB, April 16, 2023 16:47
atroyn (Contributor) commented Apr 17, 2023

@levand let's find time to review in person

chromadb/api/local.py (outdated, resolved)
levand force-pushed the lukev/validate_add branch from 15643da to 25d451f, April 17, 2023 18:49
metadatas: List[types.Metadata]
documents: List[types.Document]


class EmbeddingStateMachine(RuleBasedStateMachine):
Contributor Author

Rename TestCollectionContents

Contributor Author

Can do as a separate PR.

self._add_embeddings(embedding_set)
return multiple(*embedding_set["ids"])
if set(embedding_set["ids"]).intersection(set(self.embeddings["ids"])):
    with pytest.raises(ValueError):
Contributor Author

Test that the value error is correct.
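
One way to act on this, sketched with only the standard library: assert on the exception's message as well as its type. The `add_ids` helper, the `raises_with_message` checker, and the message wording are assumptions made up for illustration and are not taken from the PR.

```python
import re
from typing import Callable, List, Set


def add_ids(existing: Set[str], new_ids: List[str]) -> None:
    """Hypothetical helper: reject IDs that are already present."""
    clashes = sorted(existing.intersection(new_ids))
    if clashes:
        raise ValueError(f"duplicate IDs: {clashes}")
    existing.update(new_ids)


def raises_with_message(func: Callable[[], None], pattern: str) -> bool:
    """Return True only if func raises ValueError whose text matches pattern."""
    try:
        func()
    except ValueError as exc:
        return re.search(pattern, str(exc)) is not None
    return False
```

In a pytest suite the same check is usually written as `pytest.raises(ValueError, match=...)`, which applies the pattern to the exception text with `re.search`.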

@@ -129,8 +160,8 @@ def _add_embeddings(self, embeddings: strategies.EmbeddingSet):
         else:
             documents = [None] * len(embeddings["ids"])
 
-        self.embeddings["metadatas"].extend(metadatas) # type: ignore
-        self.embeddings["documents"].extend(documents) # type: ignore
+        self.embeddings["metadatas"] += metadatas # type: ignore
Contributor Author

Revert this bit.

def test_dup_add(api):
    api.reset()
    coll = api.create_collection(name="foo")
    with pytest.raises(ValueError):
Contributor Author

Create more specific error type.
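
A minimal sketch of what a more specific error type could look like. The name `DuplicateIDError`, its `ids` attribute, and the `check_unique` helper are hypothetical and not taken from the PR. Subclassing `ValueError` is a design choice in this sketch: existing tests written as `pytest.raises(ValueError)` would keep passing while callers gain a narrower type to catch.

```python
from typing import Iterable, List


class DuplicateIDError(ValueError):
    """Hypothetical error raised when an ID is supplied more than once."""

    def __init__(self, ids: Iterable[str]) -> None:
        # Keep the offending IDs so callers can inspect them programmatically.
        self.ids: List[str] = list(ids)
        super().__init__(f"duplicate IDs: {self.ids}")


def check_unique(ids: Iterable[str]) -> None:
    """Raise DuplicateIDError listing each repeated ID once, in first-repeat order."""
    seen = set()
    dups: List[str] = []
    for i in ids:
        if i in seen and i not in dups:
            dups.append(i)
        seen.add(i)
    if dups:
        raise DuplicateIDError(dups)
```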

levand added 5 commits April 17, 2023 15:54
This reverts commit c5b096e.

We will handle TypeScript changes in a separate PR.
This reverts commit 1a46040.

We will handle TypeScript changes in a separate PR
levand force-pushed the lukev/validate_add branch from 35f93c2 to 693a53a, April 17, 2023 23:19

levand (Contributor Author) commented Apr 17, 2023

Reviewed in-person with @atroyn and incorporated feedback prior to merging.

levand merged commit 252e56d into lukev/embeddings-stateful-tests, Apr 17, 2023

3 participants