PyPDF2 should not overwrite warnings.formatwarning. #67
Comments
Hello, I'm assuming this issue is the same as #39. This does seem like a major flaw, so we'll make rewriting the warning system a priority. I'm not too familiar with how warning systems are implemented; should PyPDF2 just use the default system? |
Hi, |
I ran into this issue the other day. In case you were entertaining the idea of a stopgap hotfix, you might be able to avoid logging errors by changing this line:
to
Obviously there are less hasty, preferable ways of dealing with the problem, but this is the hotfix we have used on our deployments. "Nasty side-effects" is right: completely unrelated code that normally emits a warning can end up breaking the warning machinery itself, obscuring the original warning and turning a silent failure into a loud one. On our Django application, we started getting 500s from Apache instead of a nice log entry and email from Django. |
When pyPdf was created, it wasn't used within large applications - it had fewer features and less PDF compatibility. So, the custom warning overwrite was probably not a big deal. Now that PyPDF2's being used as a part of large applications, I can agree that the custom warning implementation is a problem. I'll work on reverting to the default system, though I need to become familiar with Python's warning/logging system. |
@CrimsonZen I can feel the pain. We had exactly the same problem, but somewhat nastier: Django tried to trigger DeprecationWarning while processing a response. This gave a nice HTTP status code of 200 with a 0-byte response. @mstamy2 We use PyPDF2 in a very limited scope but it was (and probably still is) the most efficient tool for the job. I do not even consider this bug/misbehavior to be a big deal as it is easy to spot and can be circumvented. |
@podloucky-init Thanks for the feedback. On my end, I deleted the overwrites for formatwarning and showwarning. I am not sure how useful the custom implementations even were. So, would it be best to remove the overwrites for all users? |
The custom warning implementations have been removed. The potential issues (3 separate cases reported here) aren't worth the extra information they provide. It would be possible to re-implement the features in a way that doesn't overwrite any methods of |
Sorry for the late reply: You could provide a formatter that formats warnings from PyPDF2. Users can then choose whether they want to use your formatter. EDIT: sorry, clicked on close by mistake. |
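A minimal sketch of the opt-in formatter idea suggested above. The names `PdfWarning`, `format_pdf_warning` and `app_formatwarning` are hypothetical and not part of PyPDF2; the sketch only assumes the standard `warnings.formatwarning(message, category, filename, lineno, line=None)` signature. The library only ships the formatter, and the application decides whether to use it.

```python
import warnings


class PdfWarning(UserWarning):
    """Hypothetical warning category a PDF library could define."""


def format_pdf_warning(message, category, filename, lineno, line=None):
    """Hypothetical compact formatter with the same signature as warnings.formatwarning."""
    return f"{category.__name__}: {message} [{filename}:{lineno}]\n"


# Keep a reference to whatever formatter is currently installed.
default_formatwarning = warnings.formatwarning


def app_formatwarning(message, category, filename, lineno, line=None):
    """Application-side dispatcher: custom format for PdfWarning, default for the rest."""
    if issubclass(category, PdfWarning):
        return format_pdf_warning(message, category, filename, lineno, line)
    return default_formatwarning(message, category, filename, lineno, line)


# An application that wants the compact format opts in explicitly:
#     warnings.formatwarning = app_formatwarning
# The library itself never performs this assignment.
```

Because the dispatcher falls back to the previous formatter, warnings from Django, pandas or numpy keep their normal format.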
I re-implemented warnings with flag The warnings print out code segments when overwriteWarnings is False; we should probably fix that in the future (it looks a bit confusing). Anyway, I hope this resolves the issue successfully. I tried to create a subclass, but I'm not familiar with making subclasses out of modules (is that even possible?). Using a flag should be just as effective, as long as users are aware of it. |
Strong Opinions Time: No library should ever assign to attributes of another library's module. This is a huge violation of abstraction: let the warnings module handle warnings, and let PyPDF2 handle PDFs.
It may be confusing to some, but it's the normal behavior. Just leave it out. |
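For illustration, a sketch of what "let the warnings module handle warnings" could look like in practice. `PdfParseWarning` and `read_object` are hypothetical names, not PyPDF2 API: the library only calls `warnings.warn` with its own category, and the application uses the standard filters to record, silence or escalate.

```python
import warnings


class PdfParseWarning(UserWarning):
    """Hypothetical category; a dedicated class lets applications filter by type."""


def read_object(raw):
    """Hypothetical parser step that reports a recoverable problem."""
    if not raw.endswith(b"endobj"):
        # stacklevel=2 attributes the warning to the caller of read_object.
        warnings.warn("object is missing 'endobj'", PdfParseWarning, stacklevel=2)
    return raw


# Application side: record the warnings without touching the library.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    read_object(b"1 0 obj << >>")

print([str(w.message) for w in caught])

# Application side: escalate only this library's warnings to exceptions.
with warnings.catch_warnings():
    warnings.simplefilter("error", PdfParseWarning)
    try:
        read_object(b"1 0 obj << >>")
        escalated = False
    except PdfParseWarning:
        escalated = True
```

Nothing global is modified outside the `catch_warnings` blocks, so other libraries' warnings behave as usual.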
@CrimsonZen I think that the author of pyPdf felt justified in overwriting the warnings module because warnings.py seemingly advocates such an action (I don't have the module available right now, but it does mention something similar). You're right, though, and I agree that it is a strong violation of abstraction/OOP in general. Perhaps warnings.py advocates some type of extension, not a direct overwrite, and pyPdf's author misinterpreted it. I'll go ahead and permanently remove the overwrites then. I was hesitant to do so at first because PyPDF2 aims to be a direct successor to pyPdf (adding/correcting features rather than removing them), but it's probably for the best. I'm also assuming that whatever features were provided by the custom implementations of showwarning and formatwarning weren't overly useful. I think the intent was to
Anyway, thanks for your input - I'll re-remove the warnings overwrite tomorrow (I'm currently on a mobile device). |
Hello, Thank you for the great software; apart from this issue, it works nicely for us. |
Hi @pavelbrych I've fixed it here (I hope): opendesk@47b3e6d I've done a PR to the library. Cheers |
Fyi, it interferes with pandas as well. |
Please fix that ...
Same as @Kiza: it crashes when pandas wants to show a warning. @mstamy2 I saw your
|
Same as @Kiza and @CartierPierre: it breaks with Pandas warnings |
Still not fixed. I was going crazy trying to understand how numpy could cause PyPDF2 errors... |
I use PdfFileMerger in a multithreaded application and this has major effects (it turns warnings into exceptions). I strongly suggest implementing overwriteWarnings for PdfFileMerger as per https://github.com/mstamy2/PyPDF2/pull/243/files |
PyPDF2 must not overwrite showwarning because it breaks ERP5. ERP5 already overwrites it properly py-pdf/pypdf#67 (comment)
PyPDF2 overwrites `warnings.showwarning`, and this is already confirmed as a major flaw: py-pdf/pypdf#67 (comment) However, the fix was never applied in PyPDF2. To fix this issue in ERP5, we will copy the patch to our stack and apply it. See merge request !746
In the following issues and PRs, PyPDF2 users have identified and offered to fix a weird bug where the library overrides normal logging in a way that breaks when you log certain things: py-pdf/pypdf#67 py-pdf/pypdf#641 py-pdf/pypdf#452 py-pdf/pypdf#243 Not great, but there was probably a reason. Unfortunately, the maintainers aren't merging any of the fixes people have provided over the years, and when I upgraded to Python 3.10 one of our tests changed in a way that triggered this bug. So, since the maintainers don't seem inclined to fix their own package, this commit yanks it from CourtListener. It's good timing, really, since we now have the microservice available, but it was disappointing to see bugs and PRs related to this going back to 2014. Most of the fixes are one- or two-line PRs too. Bummer.
I think this issue got solved with 89a29ef (February 2014, v1.25.1):
|
Uh, I've just seen |
This helps users who run into issue #67
Let's connect the issues:
|
This was fixed with PyPDF2 2.0.0. Thank you all for pointing out the issues! |
Hello,
PyPDF2 1.2.0 overwrites warnings.formatwarning with its own implementation (utils._formatwarning) in pdf.py line 74:
Unfortunately this may cause severe side-effects if PyPDF2 is imported in a larger application. In our case the PyPDF2 implementation of formatwarning caused IndexErrors whenever a warning was raised somewhere else (and the filename argument was not to the formatter's liking).
Personally, I do not think that it is a good idea for a library to interfere with the global logging/warning infrastructure.
P.S.: Apart from this problem, we have been using PyPDF2 successfully for some time now. Nice piece of software!