-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: improve support for dataclasses #3142
refactor: improve support for dataclasses #3142
Conversation
Current code need refactoring due to Optional arg and mypy. I'll push new commit. |
Hey @danielbichuetti, wow thank you! 🤩 Can I get started with the review already or you prefer me to wait for your fixes for mypy and typing? |
Hi @ZanSara. I will revert the initialization changes, so mypy results are ok with pydantic, and code can get checked by other tests. I've started to work on some refactoring, looking for a better mypy integration. As a side note, am I the only one getting issues with VS Code and pre-commits? |
Ok, beyond the mypy checking, there are differences when running tests on Python 3.9 and the CI which uses 3.7. I'm looking at what causes the errors in the different environments. As a complement for further testing:
And on CI, Python 3.7:
|
Which kind of issues? We've had in the past but I hoped they were mostly solved by now. Let's open an issue for this topic: even though it might be a simple misconfiguration, it's always helpful to leave an issue with the discussion for other users |
Alright! Given that there are still some fixes to be done, I'll set this one to draft until the CI looks good again |
TODO: investigate another method to improve 3.7 compatibility.
Tests are failing due to timeouts into haystack CI. At the same time, I have run them using my profile. All have run ok: https://github.com/danielbichuetti/haystack/actions/runs/2982709078 I'll force one extra commit tomorrow, maybe CI run correctly. |
May someone check CI please ? It has skipped some tests consistently. |
The HF models caching fails quite often, besides the timeouts on the download process the problem is the cache shouldn't expire so often, I wonder if we're hitting the 10Gb size limit. Opened #3146 to track that, in the meantime re-running the job did the trick here ✅ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @danielbichuetti! Thanks a ton for this PR! Sorry if I've been a bit overzealous with the review, but this is a delicate change, so it's better to be extra-careful. Good news is, there are no big issues with the changes, it's just many small improvements.
Let me know if you have any question or if I got anything wrong 🙂
@@ -50,7 +50,7 @@ dependencies = [ | |||
"importlib-metadata; python_version < '3.8'", | |||
"torch>1.9,<1.13", | |||
"requests", | |||
"pydantic==1.9.2", | |||
"pydantic", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe worth pinning pydantic<2
to avoid having the same problems all over again when Pydantic 2.0 comes out? @masci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO, I don't think it's necessary.
Here you can see what will be removed: https://pydantic-docs.helpmanual.io/blog/pydantic-v2/#removed-features-limitations.
The dataclasses will only have internal changes, not on the external interface. https://pydantic-docs.helpmanual.io/blog/pydantic-v2/#features-remaining
Furthermore, pydantic v2 will have a performance increase of at least 4x. So, its worth using it.
5edadf3
to
6626e1c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! Thanks again 😊
@ZanSara I think you approved and forgot to merge 😆 |
Thank you for the ping!! 😄 I did forget to merge it 😄 |
* refactor: improve support for dataclasses * refactor: refactor class init * refactor: remove unused import * refactor: testing 3.7 diffs * refactor: checking meta where is Optional * refactor: reverting some changes on 3.7 * refactor: remove unused imports * build: manual pre-commit run * doc: run doc pre-commit manually * refactor: post initialization hack for 3.7-3.10 compat. TODO: investigate another method to improve 3.7 compatibility. * doc: force pre-commit * refactor: refactored for both Python 3.7 and 3.9 * docs: manually run pre-commit hooks * docs: run api docs manually * docs: fix wrong comment * refactor: change no type-checked test code * docs: update primitives * docs: api documentation * docs: api documentation * refactor: minor test refactoring * refactor: remova unused enumeration on test * refactor: remove unneeded dir in gitignore * refactor: exclude all private fields and change meta def * refactor: add pydantic comment * refactor : fix for mypy on Python 3.7 * refactor: revert custom init * docs: update docs to new pydoc-markdown style * Update test/nodes/test_generator.py Co-authored-by: Sara Zan <[email protected]>
Related Issues
Proposed Changes:
The latest update on pydantic caused some tests to fail. The main reason is related to some initialization pydantic do into dataclasses for his own internal logic. These extra fields were being mutated to Document.meta in reason of its logic. To avoid this error, I have filtered the new pydantic current initialization variable and also the old one (<1.10). This will make it more consistent.
Furthermore, I've refactored class initialization to start the usage of the dataclasses features. The custom logic has been moved to private post_init method.UPDATE: because of some differences at dataclasses initialization in Python 3.7 and 3.9, code had to be refatored and the exclusive usage of post_init has been dropped. Using custom init and InitVar worked on supported environments.
How did you test it?
Currently making complete tests
Notes for the reviewer
Checklist