Broken characters when merging pages #1640

Closed
raphaelm opened this issue Feb 17, 2023 · 7 comments · Fixed by #1641
Labels
is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF

Comments

@raphaelm

raphaelm commented Feb 17, 2023

We are using PyPDF to implement an "n-up" feature in our application. With the upgrade from PyPDF 2.12.x to PyPDF 3.x and the fix for #1601, this now generally works again, but it breaks in funny ways with text.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-6.0.9-arch1-1-x86_64-with-glibc2.36

$ python -c "import pypdf;print(pypdf.__version__)"
3.4.1

Code + PDF

This is a minimal, complete example that shows the issue:

from decimal import Decimal

from pypdf import PdfReader, PdfWriter, Transformation
from pypdf.generic import RectangleObject

# One millimetre in PDF points (72 points per inch, 2.54 cm per inch).
mm = 72.0 / 2.54 * 0.1

# Blank A4 output page that both input pages are merged onto.
out_pdf = PdfWriter()
page_1 = out_pdf.add_blank_page(210 * mm, 297 * mm)

# The first input page is merged at its original position.
in_pdf_1 = PdfReader('badges_3vjrh_7LXDZ_1-1.pdf')
in_page_1 = in_pdf_1.pages[0]
page_1.merge_page(in_page_1)

# The second input page is shifted up by 150 mm before merging.
in_pdf_2 = PdfReader('badges_3vjrh_7LXDZ_2-1.pdf')
in_page_2 = in_pdf_2.pages[0]
in_page_2.add_transformation(Transformation().translate(0, +150 * mm))
# Move the media box by the same offset so the shifted content is not clipped.
in_page_2.mediabox = RectangleObject((
    Decimal('%.5f' % (in_page_2.mediabox.left.as_numeric())),
    Decimal('%.5f' % (in_page_2.mediabox.bottom.as_numeric() + 150 * mm)),
    Decimal('%.5f' % (in_page_2.mediabox.right.as_numeric())),
    Decimal('%.5f' % (in_page_2.mediabox.top.as_numeric() + 150 * mm))
))
in_page_2.trimbox = in_page_2.mediabox
page_1.merge_page(in_page_2)

out_pdf.write('merge.pdf')

Input files:
badges_3vjrh_7LXDZ_1-1.pdf
badges_3vjrh_7LXDZ_2-1.pdf
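
For readers unfamiliar with the `mm` factor in the script: it converts millimetres to PDF points, assuming the default user space of 72 points per inch. The sketch below is only an illustration of that arithmetic; the helper name `mm_to_pt` is hypothetical and not part of pypdf.

```python
# Illustrative only: mm_to_pt is a hypothetical helper, not a pypdf function.
# Assumes the default PDF user space of 72 points per inch (1 inch = 25.4 mm).
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def mm_to_pt(millimetres: float) -> float:
    """Convert a length in millimetres to PDF points."""
    return millimetres * POINTS_PER_INCH / MM_PER_INCH


# Sizes used in the script above: an A4 page and the 150 mm vertical offset.
a4_width_pt = mm_to_pt(210)
a4_height_pt = mm_to_pt(297)
offset_pt = mm_to_pt(150)

print(f"A4: {a4_width_pt:.2f} x {a4_height_pt:.2f} pt, offset: {offset_pt:.2f} pt")
```

Under that assumption, 210 mm x 297 mm corresponds to roughly 595.28 x 841.89 points, and the second page is moved up by about 425.20 points.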

Expected output

The expected output is this. We can obtain this output by using e.g. PyPDF 2.12.1:
pypdf2.pdf

Actual output

The actual output from PyPDF3 is this:
pypdf3.pdf

Note that the first line now reads "Hans-Jörgen" instead of "Hans-Jürgen".

raphaelm added a commit to pretix/pretix that referenced this issue Feb 17, 2023
@MartinThoma
Member

> The actual output from PyPDF3 is this:
> pypdf3.pdf

I guess you mean pypdf==3.4.1, right?
I'm asking because PyPDF3 is a completely different project.

@pubpub-zz
Collaborator

Error found: the code was not referencing the right object. A funny effect, and quite tricky to locate.
I've opened a PR if you want to try it.

@raphaelm
Author

> I guess you mean pypdf==3.4.1, right?
> I'm asking because PyPDF3 is a completely different project.

Yes! Sorry.

> Error found: the code was not referencing the right object. A funny effect, and quite tricky to locate.
> I've opened a PR if you want to try it.

It's amazing how quick you are ❤️ Happy to test the PR on Monday!

@raphaelm
Author

Yup, PR seems to work for me! :)

@MartinThoma
Member

The fix was just merged and will be included in the first release after pypdf 3.4.1 (expected on PyPI this weekend).
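
For anyone wanting to guard against the affected release in their own code, here is a minimal sketch of a version check. It assumes only what is stated above, namely that the fix ships in releases newer than 3.4.1; the helper names `parse_version` and `released_after_3_4_1` are hypothetical, and the parsing is deliberately simplistic (numeric dotted versions only).

```python
from typing import Tuple

# Assumption taken from the comment above: releases newer than 3.4.1 contain the fix.
LAST_AFFECTED = (3, 4, 1)


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a dotted version string."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            # Stop at the first non-numeric component (e.g. "post1").
            break
        parts.append(int(digits))
    return tuple(parts)


def released_after_3_4_1(version: str) -> bool:
    """Return True if the given version string is newer than 3.4.1."""
    return parse_version(version) > LAST_AFFECTED


# Typical use (requires pypdf to be installed):
#   import pypdf
#   released_after_3_4_1(pypdf.__version__)
```

A dedicated version-parsing library would be more robust for pre-releases and local version tags; the tuple comparison here is only meant to show the idea.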

@MartinThoma
Member

Thank you for reporting it! If you want I can add you as a contributor: https://pypdf.readthedocs.io/en/latest/meta/CONTRIBUTORS.html

@raphaelm
Author

Nah, that's fine, but thanks! :)

@MartinThoma MartinThoma added the is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF label Mar 25, 2023