page1.merge_page(page2, expand=True) doesn't seem to work #1035

Closed
GeoTN opened this issue Jun 28, 2022 · 10 comments · Fixed by #1208
Comments

@GeoTN

GeoTN commented Jun 28, 2022

I am trying to merge two PDF pages into one page. One complication: the PDFs are quite long (3 meters each, so 6 meters in the end).
When I try to merge them, the second PDF is not drawn past a certain area.

Code

import PyPDF2 as pdf
from PyPDF2.generic import RectangleObject
from PyPDF2 import PageObject, PdfReader, PdfWriter, Transformation

reader = PdfReader("E036_000_Tralongpdf2.0.pdf")
page = reader.pages[0]

x1, y1, x2, y2 = page.cropbox
op1 = Transformation().translate(tx=0, ty=0)
page.add_transformation(op1)
reader2 = PdfReader("E036_000_Tralongpdf1.0.pdf")
page2 = reader2.pages[0]

x21, y21, x22, y22 = page2.cropbox
assert x21 == 0
assert y21 == 0

op2 = Transformation().translate(tx=0, ty=y22 - 2000)  # ty=y2)
page2.add_transformation(op2)
page3 = PageObject.createBlankPage(reader, width=845, height=19000)
rect = RectangleObject((0.0, 0.0, 845.0, 25000.0))
print(page3.mediabox)
page3.cropbox = rect
page3.merge_page(page2)
page3.merge_page(page)

writer = PdfWriter()
page3.trimBox = rect
print(page3.cropbox)
writer.add_page(page3)
with open("output.pdf", "wb") as fp:
    writer.write(fp)

Environment

PyPDF2==2.4.0

@MartinThoma
Member

Pdf are kinda long (3 meters each, so 6 meters in the end)

Ok 😄 Can you share those?

@MartinThoma changed the title from "PdfObject.merge_page( pageX,Expand) doesn't seem to work" to "page1.merge_page(page2, expand=True) doesn't seem to work" on Jun 28, 2022
@GeoTN
Author

GeoTN commented Jun 28, 2022

Here

[main.txt](https://github.com/py-pdf/PyPDF2/files/9002217/main.txt)
I'd like to fuse those pages as:
| 1 | 2 |

@MartinThoma
Member

Thanks for sharing! It looks to me as if you're merging them as

| 1 |
| 2 |

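The x-axis runs horizontally (left to right) and the y-axis runs vertically (in PDF user space it increases from bottom to top), so translating along y stacks the pages instead of placing them side by side.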
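To make the difference between the two layouts concrete, here is a minimal sketch of how the translation offset for the second page could be chosen. The helper name `placement_offset` and the layout labels are hypothetical (not part of PyPDF2), and the sketch assumes both pages' boxes start at (0, 0), as asserted in the original report.

```python
def placement_offset(first_box, layout):
    """Return (tx, ty) for the second page, given the first page's box.

    first_box is (left, bottom, right, top) in PDF points.
    Assumption: both pages have their lower-left corner at (0, 0).
    """
    left, bottom, right, top = first_box
    if layout == "side_by_side":  # | 1 | 2 |
        return (right, 0)
    if layout == "stacked":  # page 2 placed above page 1
        return (0, top)
    raise ValueError(f"unknown layout: {layout!r}")


# Hypothetical page size: 845 pt wide, 8500 pt tall.
tx, ty = placement_offset((0, 0, 845, 8500), "side_by_side")
# With PyPDF2 the result would then be passed on as:
#   Transformation().translate(tx=tx, ty=ty)
```

Translating along x gives the `| 1 | 2 |` layout; translating along y gives the stacked one.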

@GeoTN
Author

GeoTN commented Jun 28, 2022

Yep, but even with that, the | 1 | is still missing or partly written...

@MartinThoma
Member

Here is an interim workaround:

from PyPDF2.generic import RectangleObject
from PyPDF2 import PdfReader, PdfWriter, Transformation

# Get the original data
reader = PdfReader("1.pdf")
page1 = reader.pages[0]
print(page1.cropbox)

reader2 = PdfReader("2.pdf")
page2 = reader2.pages[0]

# Merge them into one page
offset = page1.cropbox.right
print(offset)
op = Transformation().translate(tx=offset, ty=0)
page2.add_transformation(op)
cb = page2.cropbox
page2.mediabox = RectangleObject((cb.left + offset, cb.bottom, cb.right + offset, cb.top))
page2.cropbox = RectangleObject((cb.left + offset, cb.bottom, cb.right + offset, cb.top))
page2.trimbox = RectangleObject((cb.left + offset, cb.bottom, cb.right + offset, cb.top))
page2.bleedbox = RectangleObject((cb.left + offset, cb.bottom, cb.right + offset, cb.top))
page2.artbox = RectangleObject((cb.left + offset, cb.bottom, cb.right + offset, cb.top))

page1.merge_page(page2, expand=True)
# After the merge, copy the expanded media box to all other boxes of page 1
mb = page1.mediabox
page1.mediabox = RectangleObject((mb.left, mb.bottom, mb.right, mb.top))
page1.cropbox = RectangleObject((mb.left, mb.bottom, mb.right, mb.top))
page1.trimbox = RectangleObject((mb.left, mb.bottom, mb.right, mb.top))
page1.bleedbox = RectangleObject((mb.left, mb.bottom, mb.right, mb.top))
page1.artbox = RectangleObject((mb.left, mb.bottom, mb.right, mb.top))
print(page1.mediabox)

# Write the result back
writer = PdfWriter()
writer.add_page(page1)

with open("output.pdf", "wb") as fp:
    writer.write(fp)
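The repeated box assignments in the workaround could be folded into a small helper. This is only a sketch: `set_all_boxes` and `BOX_NAMES` are hypothetical names, not PyPDF2 API, and the demo uses a plain stand-in object so it runs without any PDF files.

```python
from types import SimpleNamespace

# The five page boxes that the workaround above sets by hand.
BOX_NAMES = ("mediabox", "cropbox", "trimbox", "bleedbox", "artbox")


def set_all_boxes(page, rect_factory, corners):
    """Assign a freshly built rectangle to every box attribute of `page`.

    rect_factory builds one rectangle object from `corners`
    (left, bottom, right, top). A new object is created per box so that
    the boxes never share state.
    """
    for name in BOX_NAMES:
        setattr(page, name, rect_factory(corners))


# Stand-in demo: `list` plays the role of RectangleObject.
demo_page = SimpleNamespace()
set_all_boxes(demo_page, list, (0, 0, 1690, 8500))
```

With PyPDF2, the equivalent call would presumably be `set_all_boxes(page1, RectangleObject, (mb.left, mb.bottom, mb.right, mb.top))`.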

@MartinThoma
Member

It might be related to #272 and #879

@MartinThoma
Member

There are two issues in PyPDF2:

  1. When doing the transformation, one/multiple of the boxes don't get adjusted
  2. When doing the merge with expand=True, one/multiple of the boxes don't get adjusted.
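For the second point, the box one would expect after an expanding merge is the smallest rectangle that contains both pages' boxes. The following is a sketch of that idea in plain Python, not PyPDF2's actual implementation; `union_box` is a hypothetical helper and boxes are plain (left, bottom, right, top) tuples.

```python
def union_box(box_a, box_b):
    """Return the smallest (left, bottom, right, top) box containing both."""
    return (
        min(box_a[0], box_b[0]),  # leftmost edge
        min(box_a[1], box_b[1]),  # lowest edge
        max(box_a[2], box_b[2]),  # rightmost edge
        max(box_a[3], box_b[3]),  # highest edge
    )


# Page 1 at the origin, page 2 shifted right by page 1's width
# (hypothetical sizes: 845 pt wide, 8500 pt tall).
merged = union_box((0, 0, 845, 8500), (845, 0, 1690, 8500))
```

If page 2's boxes are not shifted along with its content (the first point), the union stays at page 1's size and the translated content falls outside the visible area.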

@GeoTN
Author

GeoTN commented Jun 28, 2022

Ok, thanks a lot!
I will close this topic tomorrow once I have tried this :)
Have a good day.

@MartinThoma
Member

Please leave the issue open, even if the quick-fix works for you. I want the library itself to be correct. At the very least I feel like we need more documentation here, but I'm pretty certain we need to adjust PyPDF2.

@MartinThoma
Member

  1. When doing the transformation, one/multiple of the boxes don't get adjusted

On 10.07.2022 I merged #1066.

Transformations done via the Transformation class, e.g.

>>> from PyPDF2 import Transformation
>>> op = Transformation().scale(sx=2, sy=3).translate(tx=10, ty=20)
>>> page.add_transformation(op)

only affect the content of a page, not the page itself: the boxes (mediabox, cropbox, and so on) stay unchanged.

I've just noticed that this is not documented (e.g. in https://pypdf2.readthedocs.io/en/latest/user/cropping-and-transforming.html ). I'll fix that.
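Since the boxes are left untouched, they have to be recomputed by hand if they should follow the content. A minimal sketch of that calculation, assuming the chained scale and translate above corresponds to the PDF matrix `(a, b, c, d, e, f) = (2, 0, 0, 3, 10, 20)`; `transform_box` is a hypothetical helper, not a PyPDF2 function.

```python
def transform_box(box, ctm):
    """Apply a PDF transformation matrix to a box and return its new bounds.

    box is (left, bottom, right, top); ctm is (a, b, c, d, e, f), mapping
    (x, y) to (a*x + c*y + e, b*x + d*y + f).
    """
    a, b, c, d, e, f = ctm
    left, bottom, right, top = box
    corners = [(left, bottom), (right, bottom), (left, top), (right, top)]
    xs = [a * x + c * y + e for x, y in corners]
    ys = [b * x + d * y + f for x, y in corners]
    # Take the bounding box of all four corners, so the result stays an
    # axis-aligned rectangle even if the matrix rotates or mirrors.
    return (min(xs), min(ys), max(xs), max(ys))


# Scale by (2, 3), then translate by (10, 20), on a 100 x 200 pt box.
new_box = transform_box((0, 0, 100, 200), (2, 0, 0, 3, 10, 20))
```

The resulting tuple could then be written back to the page's boxes, as the workaround earlier in the thread does with `RectangleObject`.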

MartinThoma added a commit that referenced this issue Aug 6, 2022