Support for merging a page in the background (watermark) #307

Closed
hvbtup opened this issue Dec 6, 2016 · 9 comments · Fixed by #1082
Labels
is-feature A feature request

Comments

@hvbtup

hvbtup commented Dec 6, 2016

I propose adding an optional argument background=False to the various merge...Page methods.
All these methods eventually call _mergePage. There, the following small change should do the trick:

    newContentArray = ArrayObject()

    originalContent = self.getContents()
    if originalContent is not None:
        newContentArray.append(PageObject._pushPopGS(
              originalContent, self.pdf))

    page2Content = page2.getContents()
    if page2Content is not None:
        if page2transformation is not None:
            page2Content = page2transformation(page2Content)
        page2Content = PageObject._contentStreamRename(
            page2Content, rename, self.pdf)
        page2Content = PageObject._pushPopGS(page2Content, self.pdf)
        if background:
            newContentArray.insert(0, page2Content)
        else:
            newContentArray.append(page2Content)
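
To illustrate the ordering logic in isolation, here is a minimal sketch that models a page's content array as a plain Python list. The `merge_streams` function and the string stand-ins for content streams are hypothetical and are not PyPDF2 API. The assumption is that content streams are painted in array order, so an entry inserted at index 0 is drawn first and ends up underneath everything else.

```python
from typing import List, Optional


def merge_streams(
    original: Optional[str],
    other: Optional[str],
    background: bool = False,
) -> List[str]:
    """Return the merged content array; earlier entries are painted first."""
    merged: List[str] = []
    if original is not None:
        merged.append(original)
    if other is not None:
        if background:
            # Painted first, so it ends up underneath the original content.
            merged.insert(0, other)
        else:
            # Painted last, so it ends up on top of the original content.
            merged.append(other)
    return merged


# Hypothetical stand-ins for real content streams.
stamp_on_top = merge_streams("page text", "COPY stamp")
watermark_below = merge_streams("page text", "COPY stamp", background=True)
print(stamp_on_top)
print(watermark_below)
```

With `background=False` the behavior matches the existing append, so callers that do not pass the argument would see no change.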
@hvbtup
Author

hvbtup commented Dec 6, 2016

This is the patch for PyPDF2/pdf.py.
pdf.patch.txt

@MartinThoma MartinThoma added the is-feature A feature request label Apr 8, 2022
MartinThoma added a commit that referenced this issue Jul 9, 2022
MartinThoma added a commit that referenced this issue Jul 9, 2022
@hvbtup
Author

hvbtup commented Jul 10, 2022

I have not tested your code, but I think your example is probably too simple. The typical use case is:
You have a multi-page source PDF (say, 5 pages) and a stamp/watermark PDF (for example, containing a big text "for internal use" or "copy").
Now you want to create a destination PDF which consists of the source PDF plus the stamp above or the watermark below the content of each page.

Since your example code for creating a watermark modifies the image page in-place, I don't think it will work for this use case.

@MartinThoma
Member

What do you think would go wrong?

(As an important side-note: You need to copy the watermark or re-load it every time, of course)
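
The trap behind that side-note can be shown with a toy model. This is only a sketch: `ToyPage` is a hypothetical stand-in for a PDF page (an ordered list of content streams), not a PyPDF2 class, and its `merge_page` merely imitates an in-place merge.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyPage:
    """Hypothetical stand-in for a PDF page: an ordered list of content streams."""

    streams: List[str] = field(default_factory=list)

    def merge_page(self, other: "ToyPage") -> None:
        # Mutates self in place: other's content is drawn on top of self's.
        self.streams.extend(other.streams)


source = [ToyPage(["page 1 text"]), ToyPage(["page 2 text"])]

# Buggy: the same watermark object is reused, so it keeps accumulating content.
watermark = ToyPage(["INTERNAL"])
buggy = []
for page in source:
    watermark.merge_page(page)
    buggy.append(list(watermark.streams))  # snapshot of the current state

# Fixed: start from a fresh copy of the pristine watermark for every page.
pristine = ToyPage(["INTERNAL"])
fixed = []
for page in source:
    under = copy.deepcopy(pristine)
    under.merge_page(page)
    fixed.append(under.streams)

print(buggy[1])  # second output page also carries page 1's content
print(fixed[1])  # second output page carries only its own content
```

`copy.deepcopy` works here only because `ToyPage` is a plain dataclass; it says nothing about whether real page objects can be copied that way.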

@hvbtup
Author

hvbtup commented Jul 10, 2022

> You need to copy the watermark or re-load it every time, of course

Would that be cumbersome and counter-intuitive?

I think specifying whether the "image page" should go above or below the existing content with an optional argument is much more user-friendly.

BTW, the well-known Java iText PDF library uses the words "OverContent" and "UnderContent" for this, but people don't understand that (https://stackoverflow.com/questions/66029258/difference-between-getundercontent-and-getovercontent-of-itext). IMHO the wording "background" for a boolean argument is clearer, but I'm biased, of course.

@MartinThoma
Member

An image says more than 1000 words: https://pypdf2.readthedocs.io/en/latest/user/add-watermark.html 😄

I want to keep the public interface of PyPDF2 small, and hence I hesitate to add features that are easy to implement on the user's side. But I get your point that users might not understand that they need to copy the page.

I think I'll wait and see how many people open Stack Overflow questions or issues here. If this is really an issue, adding this feature is pretty easy.

@MartinThoma
Member

Oh, and I should adjust the docs to make a copy so that people who simply copy and paste don't fall into this trap.

@MartinThoma
Member

#1095

@hvbtup
Author

hvbtup commented Jul 10, 2022

It's good that you show a complete example in add-watermark.md.

Of course, parsing the stamp file inside the loop with

reader_stamp = PdfReader(stamp_pdf)

will cost some performance. I hope this isn't really an issue.

If it turns out to be an issue, one could still modify PyPDF2 to use my approach instead.
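
For reference, the re-loading approach discussed here looks roughly like the following sketch. It assumes the PyPDF2 2.x names (`PdfReader`, `PdfWriter`, `merge_page`, `add_page`, `mediabox`); the function name `add_watermark` and its path parameters are illustrative, and the code may differ in detail from the example in the docs.

```python
def add_watermark(content_pdf: str, stamp_pdf: str, result_pdf: str) -> None:
    """Put the first page of stamp_pdf underneath every page of content_pdf."""
    # Imported here so that merely defining the function does not require PyPDF2.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(content_pdf)
    writer = PdfWriter()
    for content_page in reader.pages:
        # Re-parse the stamp for every page: merge_page() modifies the
        # stamp page in place, so an unmodified page is needed each time.
        # This repeated parsing is the performance cost mentioned above.
        reader_stamp = PdfReader(stamp_pdf)
        stamp_page = reader_stamp.pages[0]
        # The content is drawn on top of the stamp, making the stamp a watermark.
        stamp_page.merge_page(content_page)
        # Keep the page size of the original content page.
        stamp_page.mediabox = content_page.mediabox
        writer.add_page(stamp_page)

    with open(result_pdf, "wb") as fp:
        writer.write(fp)
```

A call such as `add_watermark("source.pdf", "stamp.pdf", "out.pdf")` (hypothetical file names) would parse the stamp file once per source page.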

@MartinThoma
Member

I completely agree that this is not ideal. I actually thought it would be simpler: using copy.deepcopy, or maybe a method of PageObject that creates a copy of itself (e.g. page.clone() or similar).

It turns out that page.clone() does not exist and deepcopy does not work :-/

MartinThoma added a commit that referenced this issue Jul 11, 2022
mtd91429 pushed a commit to mtd91429/PyPDF2 that referenced this issue Jul 15, 2022