BUG: Format floats using their intrinsic decimal precision #1267
Codecov Report
Base: 94.63% // Head: 94.63% // Increases project coverage by
Additional details and impacted files
@@ Coverage Diff @@
## main #1267 +/- ##
=======================================
Coverage 94.63% 94.63%
=======================================
Files 30 30
Lines 5140 5141 +1
Branches 1058 1058
=======================================
+ Hits 4864 4865 +1
Misses 164 164
Partials 112 112
…ng to 5 decimal places Explicitly format floats in outline color test so they can be compared
rather than adding a precision property to FloatObject
Rebased this PR so tests are passing, and I believe all the changes requested in the last review have been addressed. Could you take another look, @MasterOdin?
It looks good from my side - thank you for writing a unit test 🤗
@MasterOdin You were a lot more involved in this PR than I was. What do you think?
Looks good. Thanks for the work here @programmarchy! 👍
Thank you both for all the work you put into it 🙏 I've just merged the PR and I will release it today to PyPI :-)
@programmarchy If you want, I can add you to https://pypdf2.readthedocs.io/en/latest/meta/CONTRIBUTORS.html :-)
New Features (ENH):
- Add rotation property and transfer_rotate_to_content (#1348)

Performance Improvements (PI):
- Avoid string concatenation with large embedded base64-encoded images (#1350)

Bug Fixes (BUG):
- Format floats using their intrinsic decimal precision (#1267)

Robustness (ROB):
- Fix merge_page for pages without resources (#1349)

Full Changelog: 2.10.8...2.10.9
That would be very cool, thank you @MartinThoma!
When you use |
Interesting, would you mind sharing how you came to find out 20 digits is the tipping point for Acrobat, @mrknwk?

One way I was thinking of to make this configurable would be to adopt context vars, as implemented in decimal.Context for example. The context provides sane defaults with a central point for changing behavior. It would allow us to write something like:

    import PyPDF2
    from PyPDF2 import PdfReader, PdfWriter
    from PyPDF2.context import Context, StripExtraTrailingZeros, QuantizeInteger

    ctx = Context()
    ctx.max_prec = 5  # specify maximum precision
    ctx.flags = [
        StripExtraTrailingZeros,
        QuantizeInteger,
    ]  # could also specify additional format flags
    PyPDF2.setcontext(ctx)

    reader = PdfReader("./path/to/file.pdf")
    reader.pages[0].scale_by(0.5)
    writer = PdfWriter()
    writer.add_page(reader.pages[0])
    ...

Or like this:

    with PyPDF2.localcontext() as ctx:
        ctx.max_prec = 5  # specify maximum precision
        ctx.flags = [
            StripExtraTrailingZeros,
            QuantizeInteger,
        ]  # could also specify additional format flags
        ...

Or maybe this:

    PyPDF2.setcontext(AdobeAcrobatContext)

Might make sense to open a separate issue to discuss further.
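For anyone weighing the context idea, here is a rough, standalone sketch of how such a context could be built on the standard library's `contextvars` module. Nothing in it exists in PyPDF2: `FormatContext`, `getcontext`, `setcontext`, `localcontext`, `format_number` and the `max_prec` field are all hypothetical names chosen to mirror the proposal, and the format flags are left out to keep it short.

```python
import contextvars
from contextlib import contextmanager
from dataclasses import dataclass, replace
from decimal import Decimal
from typing import Iterator, Optional


@dataclass
class FormatContext:
    """Hypothetical settings object; not part of PyPDF2."""

    max_prec: Optional[int] = None  # None keeps the number's own precision


_current: contextvars.ContextVar = contextvars.ContextVar(
    "format_context", default=FormatContext()
)


def getcontext() -> FormatContext:
    return _current.get()


def setcontext(ctx: FormatContext) -> None:
    _current.set(ctx)


@contextmanager
def localcontext() -> Iterator[FormatContext]:
    # Work on a copy so changes made inside the block do not leak out.
    ctx = replace(_current.get())
    token = _current.set(ctx)
    try:
        yield ctx
    finally:
        _current.reset(token)


def format_number(value: Decimal) -> str:
    """Render a Decimal in fixed-point form, honouring the active context."""
    ctx = getcontext()
    exponent = value.as_tuple().exponent
    if ctx.max_prec is not None and isinstance(exponent, int):
        if -exponent > ctx.max_prec:
            value = round(value, ctx.max_prec)  # returns a Decimal
    text = format(value, "f")  # "f" avoids scientific notation
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text
```

With this sketch, code outside a `with localcontext()` block keeps the intrinsic precision, while code inside it can set `ctx.max_prec = 5` and get rounded output for the duration of the block only.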
@programmarchy It really was just trial and error. 😊 But 20 digits is also the limit that one of the maintainers of PDF Arranger found in a test. He contacted Adobe about it and apparently it is an Acrobat "implementation level limitation". So maybe the third option would be a nice way to go then.
@mrknwk I'd be happy to take a stab at implementing the above. Could you please create a GitHub issue with a corresponding sample PDF, and tag me?
Since FloatObject is represented as a decimal, format numbers using their intrinsic precision instead of reducing the precision to 5 decimal places.

This fixes rendering issues for PDFs that contain coordinates, transformations, etc. with real numbers containing more than 5 decimal places of precision. For example, PDFs exported from Microsoft PowerPoint contain numbers with up to 11 decimal places.
Fixes: #1266
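
To illustrate the difference described above, here is a minimal sketch using only the standard `decimal` module. It is not the code from this PR: `format_fixed` and `format_intrinsic` are hypothetical helpers that only contrast rounding to a fixed 5 places with keeping whatever precision the value already carries.

```python
from decimal import Decimal


def format_fixed(value: Decimal, places: int = 5) -> str:
    """Round to a fixed number of decimal places (the lossy behaviour)."""
    text = f"{value:.{places}f}"
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text


def format_intrinsic(value: Decimal) -> str:
    """Keep every digit the Decimal already has, in fixed-point form."""
    text = format(value, "f")  # "f" avoids scientific notation
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text


coordinate = Decimal("0.12345678901")  # 11 decimal places
print(format_fixed(coordinate))      # 0.12346
print(format_intrinsic(coordinate))  # 0.12345678901
```

The second form round-trips the value exactly, which is the property the description relies on for coordinates and transformation matrices.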