Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Support for Pydantic v2 #1045

Closed
wants to merge 7 commits into from
Closed

Add Support for Pydantic v2 #1045

wants to merge 7 commits into from

Conversation

srhinos
Copy link

@srhinos srhinos commented Aug 25, 2023

Description of changes

Summarize the changes made by this PR.

Test plan

How are these changes tested?

  • Tests pass locally with pytest for python, yarn test for js

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs repository?

Yes, no changes necessary.

@github-actions
Copy link

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of a unexpectedly high quality (Readbility, Modularity, Intuitiveness)

@tazarov
Copy link
Contributor

tazarov commented Aug 25, 2023

@srhinos, Thank you for this. I think part of it is included in - #1033. But let's have a look at actually supporting 2.x. Have you run local tests and are they all passing?

@srhinos
Copy link
Author

srhinos commented Aug 25, 2023

@srhinos, Thank you for this. I think part of it is included in - #1033.

That PR is doing the opposite of what mine is, its further ensuring Pydantic is pinned below 2 where as this PR enables support for both 1.x and 2.x depending on the needs of all the libraries a project is importing.

But let's have a look at actually supporting 2.x.

outright support for v2 is probably not recommended right off the bat. Several other large projects like langchain, llama index, etc are all implementing this same strategy of unpinning <2 and enabling cross compatibility.

This would unblock projects in the short term that are (reasonably) frustrated they can't upgrade several core libraries because of a few packages. Right now, the five lines of chromadb we use in our code base for a niche use case is blocking us from upgrading fastapi and several libraries that have minor CVEs so this is a pretty important upgrade.

In the long term, full v2-only support will require auditing all dependancies of chromadb and submitting PRs on those repos where open source to ensure you can safely install pydantic 2.x.x with them.

Have you run local tests and are they all passing?

Was having trouble getting my env running, would appreciate help fixing tests if my recent commit doesn't fix them!

@cachho
Copy link

cachho commented Aug 25, 2023

Thanks, I've been waiting for this.

@tazarov
Copy link
Contributor

tazarov commented Aug 26, 2023

@srhinos unfortunately the pinned version of fastapi does not work with pydantic 2.x. What do you think should your change also include unpinning fastapi? What consequences will that have to other parts of the code?

@jeffchuber
Copy link
Contributor

@tazarov afaik we pinned to the current version of fastapi mostly because of pydantic v2 as well. we'd like to support it bidirectionally!

@tazarov
Copy link
Contributor

tazarov commented Aug 28, 2023

@jeffchuber gets sketchy when I unpin things:

pip freeze | grep -E 'fastapi|numpy|pydantic'
fastapi==0.103.0
numpy==1.25.2
pydantic==2.3.0
pydantic_core==2.6.3

I just stopped it when most tests started failing:

=================================== 142 failed, 212 passed, 4 xfailed, 4 xpassed, 12069 warnings in 533.74s (0:08:53) ===================================

Most errors are like this:

FAILED chromadb/test/property/test_filtering.py::test_empty_filter[fastapi_persistent] - Exception: {"detail":[{"type":"missing","loc":["query","collection"],"msg":"Field required","input":null,"url":"https://errors.pydantic.dev/2.3/v/mi...
FAILED chromadb/test/property/test_filtering.py::test_boolean_metadata[fastapi_persistent] - Exception: {"detail":[{"type":"missing","loc":["query","collection"],"msg":"Field required","input":null,"url":"https://errors.pydantic.dev/2.3/v/mi...

Ref: https://errors.pydantic.dev/2.3/v/missing

@srhinos
Copy link
Author

srhinos commented Aug 31, 2023

@jeffchuber gets sketchy when I unpin things:

pip freeze | grep -E 'fastapi|numpy|pydantic'
fastapi==0.103.0
numpy==1.25.2
pydantic==2.3.0
pydantic_core==2.6.3

I just stopped it when most tests started failing:

=================================== 142 failed, 212 passed, 4 xfailed, 4 xpassed, 12069 warnings in 533.74s (0:08:53) ===================================

Most errors are like this:

FAILED chromadb/test/property/test_filtering.py::test_empty_filter[fastapi_persistent] - Exception: {"detail":[{"type":"missing","loc":["query","collection"],"msg":"Field required","input":null,"url":"https://errors.pydantic.dev/2.3/v/mi...
FAILED chromadb/test/property/test_filtering.py::test_boolean_metadata[fastapi_persistent] - Exception: {"detail":[{"type":"missing","loc":["query","collection"],"msg":"Field required","input":null,"url":"https://errors.pydantic.dev/2.3/v/mi...

Ref: https://errors.pydantic.dev/2.3/v/missing

I didn't get these same errors locally, instead got some flakey testing errors:

FAILED chromadb/test/property/test_persist.py::test_persist[settings0] - OSError: [Errno 24] Too many open files: '/Users/mac/Documents/working-dir/ema-repos/chroma/.hypothesis/examples/bcece959dcd16896/43feee2dfdd8d84d.5cd9f8278778c61bf14b8188416586bb'
FAILED chromadb/test/property/test_persist.py::test_persist_embeddings_state[settings0] - OSError: [Errno 24] Too many open files: '/Users/mac/Documents/working-dir/ema-repos/chroma/chromadb/migrations/embeddings_queue'
FAILED chromadb/test/segment/test_metadata.py::test_insert_and_count[sqlite-produce_n_single] - OSError: [Errno 24] Too many open files: '/Users/mac/Documents/working-dir/ema-repos/chroma/chromadb/migrations/embeddings_queue'
FAILED chromadb/test/segment/test_metadata.py::test_get[sqlite-produce_n_single] - OSError: [Errno 24] Too many open files: '/Users/mac/Documents/working-dir/ema-repos/chroma/chromadb/migrations/embeddings_queue'
FAILED chromadb/test/segment/test_metadata.py::test_fulltext[sqlite-produce_n_single] - OSError: [Errno 24] Too many open files: '/Users/mac/Documents/working-dir/ema-repos/chroma/chromadb/migrations/embeddings_queue'
FAILED chromadb/test/segment/test_metadata.py::test_delete[sqlite-produce_n_single] - OSError: [Errno 24] Too many open files: '/Users/mac/Documents/working-dir/ema-repos/chroma/chromadb/migrations/embeddings_queue'
FAILED chromadb/test/segment/test_metadata.py::test_upsert[sqlite-produce_n_single] - OSError: [Errno 24] Too many open files: '/Users/mac/Documents/working-dir/ema-repos/chroma/chromadb/migrations/embeddings_queue'
FAILED chromadb/test/segment/test_metadata.py::test_insert_and_count[sqlite_persistent-produce_n_single] - sqlite3.OperationalError: unable to open database file
FAILED chromadb/test/segment/test_metadata.py::test_get[sqlite_persistent-produce_n_single] - sqlite3.OperationalError: unable to open database file
FAILED chromadb/test/segment/test_metadata.py::test_fulltext[sqlite_persistent-produce_n_single] - sqlite3.OperationalError: unable to open database file
FAILED chromadb/test/segment/test_metadata.py::test_delete[sqlite_persistent-produce_n_single] - sqlite3.OperationalError: unable to open database file
FAILED chromadb/test/segment/test_metadata.py::test_upsert[sqlite_persistent-produce_n_single] - sqlite3.OperationalError: unable to open database file
FAILED chromadb/test/segment/test_metadata.py::test_insert_and_count[sqlite_persistent-produce_n_batch] - sqlite3.OperationalError: unable to open database file
FAILED chromadb/test/segment/test_metadata.py::test_get[sqlite_persistent-produce_n_batch] - sqlite3.OperationalError: unable to open database file
FAILED chromadb/test/segment/test_metadata.py::test_fulltext[sqlite_persistent-produce_n_batch] - sqlite3.OperationalError: unable to open database file
FAILED chromadb/test/segment/test_metadata.py::test_delete[sqlite_persistent-produce_n_batch] - sqlite3.OperationalError: unable to open database file
FAILED chromadb/test/segment/test_metadata.py::test_upsert[sqlite_persistent-produce_n_batch] - sqlite3.OperationalError: unable to open database file
FAILED chromadb/test/segment/test_metadata.py::test_insert_and_count[sqlite-produce_n_batch] - OSError: [Errno 24] Too many open files: '/Users/mac/Documents/working-dir/ema-repos/chroma/chromadb/migrations/embeddings_queue'
FAILED chromadb/test/segment/test_metadata.py::test_get[sqlite-produce_n_batch] - OSError: [Errno 24] Too many open files: '/Users/mac/Documents/working-dir/ema-repos/chroma/chromadb/migrations/embeddings_queue'
FAILED chromadb/test/segment/test_metadata.py::test_update[sqlite_persistent] - sqlite3.OperationalError: unable to open database file
FAILED chromadb/test/stress/test_many_collections.py::test_many_collections[sqlite_persistent] - OSError: [Errno 24] Too many open files: '/var/folders/jy/k1q5shdx5_sdyk7ykxp39ryw0000gn/T/tmphop7dnbu/e4d85e72-eaaa-4f0a-8930-ab8193b6256e/index_metadata.pickle'

but rerunning all these tests had them pass so locally, saw a fully passing test suite.

EDIT: Noticed I wasn't running pydantic 2 and FastAPI 0.100.0 so getting same errors now :(

@samuelcolvin
Copy link

LMK if there's anything we (the pydantic team) can do to get this over the line.

Seems a shame that chromadb is preventing people from adopting pydantic v2.

@srhinos
Copy link
Author

srhinos commented Sep 17, 2023

LMK if there's anything we (the pydantic team) can do to get this over the line.

Seems a shame that chromadb is preventing people from adopting pydantic v2.

I tried digging into this issue for ~20 mins but couldn't properly decipher chroma's implementation of FastAPI. The test calls a consumer of a backend api thats causing the failure but I just can't find where the backend is actually being hit at.

ex: this test is calling this function which should be making a call to the backend which is ingested @ here but I'm not seeing that function actually trigger while running tests.

if I could find the actual code location of where the pydantic model is being parsed without the proper fields, I could easily correct it but I gave it like an hr of digging before I cut my losses and went back to work on something else.

@HammadB
Copy link
Collaborator

HammadB commented Sep 21, 2023

I don't think this is working due to how fastapi switches support for v1 fastapi/fastapi#9966

It seems like it does not support pydantic.v1 imports.

@HammadB
Copy link
Collaborator

HammadB commented Sep 21, 2023

I added some color here, would love people's thoughts.

#893 (comment)

@HammadB
Copy link
Collaborator

HammadB commented Sep 25, 2023

Closing in favor of #1174 - thanks for helping us think this one through @srhinos

@HammadB HammadB closed this Sep 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants