feat: introduce lazy_import
#5084
Pull Request Test Coverage Report for Build 5209189448
💛 - Coveralls
I agree `lazy_import`'s API isn't that bad after all; I'm +1 to use it instead of `generalimport`. Left a couple of comments, but overall it's good.
Co-authored-by: Massimiliano Pippi <[email protected]>
haystack/nodes/file_converter/pdf.py
Outdated
from haystack.nodes.file_converter.base import BaseConverter
from haystack.schema import Document
from haystack.lazy_imports import LazyImport

with LazyImport(message="Run 'pip install farm-haystack[pdf]' or 'pip install pymupdf'.") as fitz_import:
I don't think we need this: if the execution path gets here, it means `import fitz` is working, as we already check it in the package init: https://github.com/deepset-ai/haystack/pull/5084/files#diff-768944ba2fc553e20ba101bc350057d7287308033b4637aa117dcf4f09150b84R21
Right, but there's still the chance someone imports this class directly with `from haystack.nodes.file_converter.pdf import PDFToTextConverter`. While this is a weird way to import it, why not give a meaningful error message in this corner case? It doesn't cost us much and removes some "magic" from the failure.
`from haystack.nodes.file_converter.pdf import PDFToTextConverter` would still trigger the code in `haystack.nodes.file_converter.__init__`.
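The point about the package init can be checked in isolation. The sketch below is not Haystack code: it builds a throwaway package called `demo_pkg` (a hypothetical name) in a temporary directory, imports its `pdf` submodule directly, and reports whether the package's `__init__.py` ran anyway. It relies only on standard Python import behavior, where importing `a.b` first imports `a`.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_submodule_directly():
    """Import demo_pkg.pdf without importing demo_pkg first; report what ran."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package, not Haystack
        pkg.mkdir()
        # The package init sets a flag so we can tell whether it executed.
        (pkg / "__init__.py").write_text("INIT_RAN = True\n")
        (pkg / "pdf.py").write_text("class PDFToTextConverter:\n    pass\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            # Equivalent to: from demo_pkg.pdf import PDFToTextConverter
            module = importlib.import_module("demo_pkg.pdf")
            parent = sys.modules.get("demo_pkg")
            init_ran = bool(parent is not None and getattr(parent, "INIT_RAN", False))
            return init_ran, module.PDFToTextConverter.__name__
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg.pdf", None)
            sys.modules.pop("demo_pkg", None)


if __name__ == "__main__":
    print(import_submodule_directly())
```

If the first element of the returned tuple is true, the parent package's init code executed even though only the submodule was requested, which is why a check placed in `__init__` also covers the direct-import case.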
Ok, so I had a better look and we're both right and wrong 😄 We most likely don't need those lines and I'll remove them, but I have to rework the import in `__init__` to try-catch the first import. I was trying to implement it in basically the opposite way, but this seems to work better.
Btw, I would like to take half a day to work through #4836 once and for all... having all those messy imports is, well, a mess 😅 There is plenty of low-hanging fruit there that can be cleaned up fast.
One leftover and we 🚢
* generalimport -> lazy-imports
* remove generalimport
* fix pdftotextconverter import check
* customize error messages
* pylint
* fix sql.py
* pylint
* Update haystack/document_stores/sql.py

Co-authored-by: Massimiliano Pippi <[email protected]>

* make contextmanager less verbose
* do not catch syntax errors
* review feedback
* Update haystack/nodes/file_converter/pdf.py

---------

Co-authored-by: Massimiliano Pippi <[email protected]>
Related Issues
`lazy_import` #5085
`generalimport` v0.5.0 is breaking #5075
Proposed Changes:
Replace `generalimport` with `lazy-imports`
How did you test it?
Notes for the reviewer
`lazy-imports` is not as "magic" as `generalimport`, because it requires the optional imports to be checked explicitly and does not catch imports globally by default. On the other hand, the library is much smaller, less magic, and more stable. If needed, it could easily be included in Haystack itself.
Checklist
`fix:`, `feat:`, `build:`, `chore:`, `ci:`, `docs:`, `style:`, `refactor:`, `perf:`, `test:`.
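To illustrate the "checked explicitly" behavior described in the notes for the reviewer, here is a minimal, self-contained sketch of a deferred-import context manager. It is a stand-in written for this illustration, not the implementation of `lazy-imports` or of Haystack's `LazyImport`: the class name `DeferredImport`, the module name `a_module_that_is_not_installed`, and the `Converter` class are all hypothetical. The sketch assumes the pattern is "record an `ImportError` raised inside the `with` block, then raise it with a helpful message only when `check()` is called".

```python
from typing import Optional


class DeferredImport:
    """Hypothetical stand-in: swallow ImportError in `with`, re-raise on check()."""

    def __init__(self, message: str) -> None:
        self.message = message
        self.error: Optional[ImportError] = None

    def __enter__(self) -> "DeferredImport":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if isinstance(exc, ImportError):
            self.error = exc
            return True  # suppress the failed import for now
        return False  # anything else (e.g. SyntaxError) propagates immediately

    def check(self) -> None:
        """Raise a descriptive ImportError if the guarded import failed."""
        if self.error is not None:
            raise ImportError(f"{self.message} (original error: {self.error})") from self.error


# The import fails here, but the failure is only recorded, not raised.
with DeferredImport("Run 'pip install some-missing-package'.") as missing_import:
    import a_module_that_is_not_installed  # hypothetical module name


class Converter:
    def __init__(self) -> None:
        # The explicit check: fail with the custom message only when the
        # optional dependency is actually needed.
        missing_import.check()
```

With this shape, importing the module that defines `Converter` succeeds even when the optional dependency is absent, and the custom message surfaces only when `Converter()` is instantiated. Nothing is intercepted globally, which matches the trade-off the notes describe.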