Team/hypothesis tests #474

HammadB · 2023-05-06T00:25:47Z

Description of changes

Merges Team Hypothesis Tests

Test plan

These are tests.

Documentation Changes

None

* Progress toward coverage
* Updated assumptions
* temporarily generate less data
* use full-dimension records
* Use recall thresholds for ann_invariant
* Tests passing
* Address comment
* Fix buggy regex in index name check
* Enable tests in vscode
* Nit
---------
Co-authored-by: Luke VanderHart <[email protected]>
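For readers unfamiliar with the "recall thresholds for ann_invariant" item, here is a minimal sketch of what a recall-based check can look like. It is not the code from this PR: the helper names (`brute_force_knn`, `recall`, `check_ann_recall`) and the 0.95 default threshold are assumptions made for illustration, and it uses only the standard library rather than Hypothesis.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_knn(points, query, k, distance=l2):
    """Exact k nearest neighbours, found by scanning every point."""
    order = sorted(range(len(points)), key=lambda i: distance(points[i], query))
    return order[:k]


def recall(approx_ids, exact_ids):
    """Fraction of the exact neighbours that the approximate result contains."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


def check_ann_recall(points, query, approx_ids, k, min_recall=0.95):
    """Hypothetical invariant: fail only when recall drops below the threshold."""
    exact_ids = brute_force_knn(points, query, k)
    observed = recall(approx_ids, exact_ids)
    assert observed >= min_recall, f"recall {observed:.2f} below {min_recall}"
    return observed
```

Comparing against a threshold instead of demanding exact equality lets an approximate index miss an occasional neighbour without failing the test.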

Add support for multiple distance metrics in tests. We coin-flip and sometimes add a space (the distance metric) when using hnsw_params. Added the distance functions to the invariant and use them when needed.

In the process of writing this test I discovered a bug in our implementation of update that was revealed by the inner product space. Since the inner product is not a true metric, a point may not be its own nearest neighbor. Our update code was strictly appending to the index due to a bug in how we manage string UUIDs vs. UUID objects. In l2 and cosine spaces this usually passed the tests, since the results returned were still correct with the updated data. But IP exposed the issue, because the results were not always the same point.
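The claim that a point may not be its own nearest neighbor under inner product can be checked with a tiny example. This is a sketch that assumes the inner-product distance is defined as 1 minus the dot product; the helper names are hypothetical and are not taken from the test suite.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def ip_distance(a, b):
    # Assumed definition of inner-product "distance": 1 - <a, b>
    return 1.0 - dot(a, b)


def l2_squared(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(points, query, distance):
    """Index of the point with the smallest distance to the query."""
    return min(range(len(points)), key=lambda i: distance(points[i], query))


points = [[1.0, 0.0], [2.0, 0.0]]
query = points[0]

# Under squared L2 the query is its own nearest neighbour.
nearest_l2 = nearest(points, query, l2_squared)

# Under inner product the longer vector scores "closer" than the query itself.
nearest_ip = nearest(points, query, ip_distance)
```

Here `ip_distance(query, points[1])` is -1.0 while `ip_distance(query, query)` is 0.0, so a stale copy of a record left in the index can change which ID comes back first.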

HammadB · 2023-05-06T00:36:02Z

chromadb/db/clickhouse.py

@@ -143,6 +153,9 @@ def create_collection(

        if len(dupe_check) > 0:
            if get_or_create:
+                if dupe_check[0][2] != metadata:


This behavior is confusing

HammadB · 2023-05-06T00:37:57Z

chromadb/db/duckdb.py

@@ -82,6 +82,10 @@ def create_collection(
        dupe_check = self.get_collection(name)
        if len(dupe_check) > 0:
            if get_or_create is True:
+                if dupe_check[0][2] != metadata:


This behavior is confusing
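To make the concern in the two review comments concrete, here is one possible reading of the flagged branch. It assumes that `get_or_create` with a differing `metadata` argument overwrites the stored metadata, in line with the "update metadata when doing 'upsert' on collection" commit. The `InMemoryCollections` class is a hypothetical stand-in, not chromadb's ClickHouse or DuckDB code.

```python
from typing import Dict, Optional

Metadata = Optional[Dict[str, str]]


class InMemoryCollections:
    """Hypothetical stand-in for a DB backend, for illustration only."""

    def __init__(self) -> None:
        self._collections: Dict[str, Metadata] = {}

    def create_collection(
        self, name: str, metadata: Metadata = None, get_or_create: bool = False
    ) -> Metadata:
        if name in self._collections:
            if not get_or_create:
                raise ValueError(f"Collection {name} already exists")
            # Assumed behaviour: a differing metadata argument replaces the
            # stored metadata, so a "get" can silently act as an update.
            if self._collections[name] != metadata:
                self._collections[name] = metadata
            return self._collections[name]
        self._collections[name] = metadata
        return metadata


db = InMemoryCollections()
db.create_collection("docs", {"owner": "a"})
after_update = db.create_collection("docs", {"owner": "b"}, get_or_create=True)
after_default = db.create_collection("docs", get_or_create=True)
```

Under this reading, the last call, which looks like a plain lookup, clears the metadata because the default `None` differs from the stored value. That is one way the behavior could surprise a caller.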

levand and others added 30 commits April 6, 2023 17:27

initial cut of hypothesis-based property tests

0f80148

Allow capital letters in collection names

d3e17ea

WIP on collection state machine test

b45e2cf

add clean failing minimal examples

217502b

fix incorrect test logic

86888f9

Fix collection name validation

785d3c1

only construct default embedding function once

f48a07a

update metadata when doing 'upsert' on collection

69c6822

re-enable all test api fixtures

9e9e97c

Merge pull request #302 from chroma-core/lukev/fix-collection-name-va…

470c0cb

…lidation Allow capital letters in collection names

Update docstrings to reflect metadata upsert behavior

02bb481

Revert "only construct default embedding function once".

f23ba4b

This reverts commit f48a07a. Going to use a different approach.

Use class var to store SentenceTransformer instances

34006bc

Saves a lot of time during testing by not re-constructing them all the time.

Merge pull request #328 from chroma-core/lukev/fix-collections

2558481

Fixes for collections property tests

Merge branch 'team/hypothesis-tests' into lukev/collection-state-mach…

d6a308a

…ine-tests

Merge pull request #324 from chroma-core/lukev/collection-state-machi…

214c06c

…ne-tests Collection state machine tests

state machine tests for embeddings

6e20759

remember to reset before each unit test

7fd4233

if creation fails, finish step

3707a35

temporarily generate IDs that we know won't cause SQL issues

7b213ac

Merge branch 'lukev/hypothesis-test-fixes' into lukev/embeddings-stat…

682445d

…eful-tests

add failing tests for duplicate embeddings

5e7940d

add update to embedding stateful tests

f7e3874

validation to prevent dup ID inserts

d036595

add JS validation & tests

1a46040

use unique IDs in unit tests

8671504

fix js test to handle local validation

c5b096e

ensure that documents are populated for updates

25d451f

clean unused code

87802f0

levand and others added 24 commits May 2, 2023 11:35

Split apart tests to match what's currently in main

5cd22fb

Per PR review feedback

factor out upsert tests to their own file

016315c

Merge pull request #398 from chroma-core/lukev/hypothesis-integration…

f732ede

…-tests Apply Hypothesis tests as integration tests

Merge branch 'team/hypothesis-tests' into lukev/test-unwrapped-values

746ce9e

Merge pull request #451 from chroma-core/lukev/test-unwrapped-values

7ca1e7b

Test unwrapped values

Merge branch 'team/hypothesis-tests' into lukev/validate-add-js

a1b9347

Merge branch 'team/hypothesis-tests' into lukev/upsert-js

f39f049

fix bug with intended test partition; actually exclude prop tests

caa03d0

poke CI

824a406

Merge pull request #454 from chroma-core/lukev/fix-glob-in-ci-config

5c929c9

Fix glob syntax in CI config

Merge branch 'team/hypothesis-tests' into lukev/upsert-js

e1de81f

Merge branch 'team/hypothesis-tests' into lukev/validate-add-js

5904549

python version matrix (#448)

487a48e

Add in multiple versions of python during CI. Add typed_extensions which lets us patch back in types not supported in older versions

Query filtering (#453)

9500e2a

Adds a hypothesis test for filtering the query.

Merge pull request #377 from chroma-core/lukev/validate-add-js

3fdd908

Validate duplicate IDs for JS client

Merge branch 'team/hypothesis-tests' into lukev/upsert-js

dec24e7

Merge pull request #399 from chroma-core/lukev/upsert-js

be04dca

Upsert support for JS

PR checklist (#459)

833b89a

Add PR checklist

Fix PR review checklist

cfdf89c

Test embedding functions (#466)

891f637

* Hashing EF
* Draw from EF strategy
* debug
* Remove test
* Finalized tests
* Restore logging message
* Log accuracy threshold
* Remove normalization, TODOs
* Address comments
* Fix list wrapping to pass docs to EF
* Address comments

merge main into team hypothesis test

8dfb223

Merge branch 'team/merge_team_hypothesis' into team/hypothesis-tests

2f52675

merge main

a19af5c

HammadB added 2 commits May 5, 2023 17:49

inaccurate log

fefab56

Add epsilon for norms in cosine per hnswlib

08d4fc0
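As a rough illustration of the epsilon commit above, a reference cosine distance can add a small constant to each norm so that zero vectors do not cause a division by zero. The constant `1e-30` and the function names are assumptions made for this sketch, not values confirmed by this page.

```python
import math

NORM_EPS = 1e-30  # assumed value, chosen to be negligible for non-zero norms


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def cosine_distance(a, b, eps=NORM_EPS):
    """1 - cosine similarity, with eps added to each norm to avoid 0/0."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / ((norm(a) + eps) * (norm(b) + eps))
```

With the epsilon in place, `cosine_distance([0.0, 0.0], [0.0, 0.0])` evaluates to 1.0 instead of raising `ZeroDivisionError`. Keeping the reference distance consistent with the index's own normalization is what lets the test invariant compare the two.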

HammadB merged commit f9b8f7c into main May 6, 2023

schwebke mentioned this pull request May 8, 2023

check for none instead of truth for optional array arguments #408

Merged
