PageObject.transfer_rotation_to_content() hides some content since pypdf 4.3.0 #2927
I managed to create a standalone example in the meantime: test_clean.pdf
Please note that this might show further issues due to the cleanup done by me.
After running the above code with pypdf version 4.2.0 and 4.3.0, I get the following diff:
diff --git a/result_4.2.0.pdf b/result_4.3.0.pdf
index 04d3347..72ec47e 100644
--- a/result_4.2.0.pdf
+++ b/result_4.3.0.pdf
@@ -72,7 +72,7 @@ endstream
endobj
8 0 obj
<<
-/Length 992
+/Length 990
>>
stream
q
@@ -122,7 +122,6 @@ BI
ID /221̎215346^PT^PBS377377377377377377377377377377377377377377377377377377377377377377377377377377377377377377377377377377377377377377377377377360^A^@^P
EI
Q
-Q
q
110 170 5520 7850 re
W
@@ -177,8 +176,8 @@ xref
0000000576 00000 n
0000000845 00000 n
0000001785 00000 n
-0000002828 00000 n
-0000002865 00000 n
+0000002826 00000 n
+0000002863 00000 n
trailer
<<
/Size 11
@@ -186,5 +185,5 @@ trailer
/Info 10 0 R
>>
startxref
-2929
+2927
%%EOF
The most apparent change seems to be that there is one
The output files: result_4.2.0.pdf result_4.3.0.pdf
You can already see that the "abc" text disappeared. When rendering this as PNG through Ghostscript, we can see that the white circles disappear as well.
For 4.2.0:
For 4.3.0:
The offending commit appears to be 23a81ba, which makes sense as the offending image is an inline image (although it is never requested explicitly).
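Since the diff shows a `Q` being dropped right after an inline image (`BI` ... `ID` ... `EI`), one rough way to confirm the imbalance is to count the `q` (save graphics state) and `Q` (restore graphics state) operators in the decoded content stream. The following is only a sketch: `graphics_state_balance` is a hypothetical helper and not part of pypdf, it assumes the stream has already been decoded to bytes, and it skips inline image data with a simple regular expression that would fail if the binary data itself contained a whitespace-delimited `EI`.

```python
import re

# One inline image: BI <dict> ID <binary data> EI, with EI delimited by whitespace.
# Simplification: real image data may itself contain " EI "; a robust parser
# would have to derive the data length from the image dictionary instead.
_INLINE_IMAGE = re.compile(rb"(?<![^\s])BI\s.*?\sID\s.*?\sEI(?=\s|$)", re.DOTALL)

# Literal strings "( ... )" without nested parentheses (another simplification).
_STRING = re.compile(rb"\((?:\\.|[^\\()])*\)", re.DOTALL)


def graphics_state_balance(content: bytes) -> int:
    """Return (number of q) - (number of Q) in a decoded content stream.

    0 means every saved graphics state is restored; a positive value means
    some state (for example a clipping path) is still active at the end.
    """
    stripped = _INLINE_IMAGE.sub(b" ", content)  # drop binary image payloads
    stripped = _STRING.sub(b" ", stripped)       # drop text that may contain q/Q
    tokens = stripped.split()
    return tokens.count(b"q") - tokens.count(b"Q")


# Shape modelled on the diff above: two nested states around an inline image,
# followed by a clipped block.
balanced = (
    b"q\nq\nBI\n/W 1 /H 1 /BPC 8 /CS /G\nID \xff\nEI\nQ\nQ\n"
    b"q\n110 170 5520 7850 re\nW\nn\nQ\n"
)
one_q_dropped = balanced.replace(b"EI\nQ\nQ\n", b"EI\nQ\n")

print(graphics_state_balance(balanced))
print(graphics_state_balance(one_q_dropped))
```

A non-zero result for the 4.3.0 output would be consistent with an earlier graphics state, including any clipping path, staying active for the content that follows, which might explain why later content is hidden.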
Further debugging shows the following behavior:
From this, some questions arise for me regarding the new implementation:
Partially answering my own questions after changing the input stream position on
Calling page.transfer_rotation_to_content() changes the visibility of some content after upgrading from version 4.2.0 to 4.3.0 for some PDF files. The corresponding text layer is invisible, but can be selected.
When viewing the diff, two Q operators are missing in version 4.3.0.
Environment
Which environment were you using when you encountered the problem?
Code + PDF
This is a minimal, complete example that shows the issue:
I do not have a suitable PDF file at the moment, but I am working on getting one.
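A reproduction along these lines could look like the sketch below. It is not the reporter's original script: `transfer_rotation` is a hypothetical helper name and the file paths are placeholders. It relies only on `PdfReader`, `PdfWriter`, and `PageObject.transfer_rotation_to_content()` from pypdf.

```python
from pathlib import Path


def transfer_rotation(src: Path, dst: Path) -> None:
    """Copy src to dst, moving each page's /Rotate value into its content."""
    # Imported here so that pypdf is only required when the function runs.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src)
    writer = PdfWriter()
    for page in reader.pages:
        # The call under test; per the report, its output differs between
        # pypdf 4.2.0 and 4.3.0 for some files.
        page.transfer_rotation_to_content()
        writer.add_page(page)
    with dst.open("wb") as fh:
        writer.write(fh)
```

Running it once per installed pypdf version, for example `transfer_rotation(Path("test_clean.pdf"), Path("result_4.3.0.pdf"))`, and then diffing the two output files should make a difference in the content streams visible.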