HTML links to document page broken after merge #471
Comments
I second this issue. It occurs even when merging via the `pageObject.mergePage` method.
I'll jump on the train - having the same issue here!
Can anybody share a PDF which shows this issue? Is it still an issue with the latest PyPDF2 version?
@MartinThoma Not sure if you're still looking for an example, but you can find one below: I have generated
from PyPDF2 import PdfReader, PdfWriter

PDF = "./doc/_build/pdf/book.pdf"
OUT_PDF = "./doc/_build/pdf/book_out.pdf"
reader = PdfReader(PDF)
writer = PdfWriter()
for page in reader.pages:
    writer.addPage(page)
with open(OUT_PDF, "wb") as f:
    writer.write(f)
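To compare the input and output of a script like the one above, one option is to list the link annotations on each page and note whether each one points inside the document or to an external URI. The sketch below is not part of the thread: `resolve` and `summarize_links` are hypothetical helper names, and the code only assumes that pages behave like dictionaries with the standard PDF keys `/Annots`, `/Subtype`, `/Dest`, and `/A`. It reports which links exist; it does not prove that an internal destination still resolves.

```python
def resolve(obj):
    """Follow an indirect reference if the object offers get_object().

    Plain dicts and lists (used in the demo below) are returned unchanged.
    """
    getter = getattr(obj, "get_object", None)
    return getter() if callable(getter) else obj


def summarize_links(pages):
    """Return a list of (page_index, kind) tuples for every /Link annotation.

    kind is "internal" (has /Dest or a /GoTo action), "external" (a /URI
    action), or "other".
    """
    summary = []
    for number, page in enumerate(pages):
        page = resolve(page)
        annots = resolve(page.get("/Annots")) or []
        for annot in annots:
            annot = resolve(annot)
            if annot.get("/Subtype") != "/Link":
                continue  # skip comments, highlights, and so on
            action = resolve(annot.get("/A")) or {}
            if "/Dest" in annot or action.get("/S") == "/GoTo":
                kind = "internal"
            elif action.get("/S") == "/URI":
                kind = "external"
            else:
                kind = "other"
            summary.append((number, kind))
    return summary


# Demo with plain dictionaries standing in for PDF page objects.
demo_pages = [
    {
        "/Annots": [
            {"/Subtype": "/Link", "/Dest": "section-1"},
            {"/Subtype": "/Link", "/A": {"/S": "/URI", "/URI": "https://example.org"}},
            {"/Subtype": "/Text"},
        ]
    },
    {},  # a page without annotations
]
print(summarize_links(demo_pages))
```

With a PDF library whose page objects are dictionary-like, the same function could presumably be called on the pages of both `book.pdf` and `book_out.pdf` and the two summaries compared; that usage is an assumption and was not verified against the files in this thread.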
Thank you @SimplyOm 🤗
Hi @MartinThoma, is this solved? Facing the same issue.
In progress, should come back soon
I think the issue is found:
On a related issue, when using merge, the internal links of a PDF seem to be broken. I am referring to links to, for example, a reference at the end of a research paper or a section of the paper. Any ideas on how to keep those links active when merging?
The method `.clone(pdf_dest,[force_duplicate])` clones the object and all referenced objects. If an object has already been cloned, the existing clone is returned (unless `force_duplicate` is set). It is mainly for internal use but can be used on a page. For PageObject/DictionaryObject/[Encoded/Decoded/Content]Stream, an extra parameter `ignore_fields` provides the list of fields that should not be cloned. When available, the pointer to an object is available in the `indirect_obj` attribute.

New API for add_page/insert_page that:
* returns the cloned page object
* accepts `ignore_fields` as a parameter

## Others
* file is closed at the end of PdfWriter.write when a filename is provided
* Breaking Change: `add_outline_item` now has a parameter `before` which is not the last parameter

## Update
* The public API of PdfMerger has been added to PdfWriter (ready to make PdfMerger an alias of it)
* Process properly Outline merging
* Process properly named destinations

Deals with #1194, #1322, #471, #1337
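The "return the existing clone unless forced" behaviour described above can be illustrated with a toy model. This is a sketch in plain Python, not the library's implementation: `ToyObject`, the `dest` dictionary standing in for the destination PDF, and the choice to apply `force_duplicate` only to the top-level object are all assumptions made for illustration.

```python
class ToyObject:
    """Stand-in for a PDF dictionary object; fields may reference other ToyObjects."""

    def __init__(self, fields):
        self.fields = dict(fields)


def clone(obj, dest, force_duplicate=False, ignore_fields=()):
    """Clone obj and everything it references into dest.

    dest maps id(source object) -> its clone, so a second request for the
    same source returns the clone made earlier. The sources must stay alive
    while dest is in use, because id() values can be reused after garbage
    collection.
    """
    key = id(obj)
    if not force_duplicate and key in dest:
        return dest[key]
    copy = ToyObject({})
    dest[key] = copy  # register before recursing so reference cycles terminate
    for name, value in obj.fields.items():
        if name in ignore_fields:
            continue  # skipped fields are simply absent from the clone
        if isinstance(value, ToyObject):
            # Toy choice: only the top-level object is force-duplicated.
            value = clone(value, dest, False, ignore_fields)
        copy.fields[name] = value
    return copy


# A page and its parent reference each other, as in a PDF page tree.
page = ToyObject({"/Type": "/Page", "/Annots": "link data"})
parent = ToyObject({"/Kids": page})
page.fields["/Parent"] = parent

dest = {}
first = clone(page, dest)
second = clone(page, dest)                       # same clone returned
forced = clone(page, dest, force_duplicate=True)  # new clone created
trimmed = clone(page, {}, ignore_fields=("/Annots",))
print(first is second, forced is first, "/Annots" in trimmed.fields)
```

Registering the clone before following references is what keeps cyclic structures such as `/Parent` and `/Kids` from recursing forever in this toy version.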
@manathan1984,
writer = PdfWriter()
for pdf in ["cover_page.pdf", "main_report.pdf", "back_cover.pdf"]:
    writer.append(pdf)
with open("result.pdf", "wb") as f:
    writer.write(f)

Getting the below error when using PdfWriter and append().
@DX9807
@pubpub-zz While trying to merge the above PDFs using PdfWriter and its append method, I am getting this error: `AttributeError: 'NumberObject' object has no attribute 'indirect_reference'`. But when I use the PdfMerger class and the corresponding append method, the PDFs get merged but the internal hyperlinks are not
Closes py-pdf#471. The issue was with named destinations using numbers instead of indirect objects to point to pages. This is normally not expected.
Hello, is it included in 3.16.0?
If you have a look at the last commit referenced here (b1fa953), you will see that this fix is included since version 3.11.1.
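For anyone unsure whether their environment already contains the fix, a quick check is to compare the installed version against 3.11.1. The sketch below is an illustration only: `parse_version`, `has_fix`, and `installed_has_fix` are hypothetical helper names, the distribution name `pypdf` is an assumption about what is installed, and the simple parser only looks at the first three numeric components.

```python
from importlib import metadata

FIX_VERSION = (3, 11, 1)  # version quoted in the comment above


def parse_version(text):
    """Turn '3.16.0' into (3, 16, 0); trailing suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '3.12'
    return tuple(parts)


def has_fix(version_text):
    """True if the given version string is at least FIX_VERSION."""
    return parse_version(version_text) >= FIX_VERSION


def installed_has_fix(distribution="pypdf"):
    """Check the installed distribution; returns None if it is not installed."""
    try:
        return has_fix(metadata.version(distribution))
    except metadata.PackageNotFoundError:
        return None


print(has_fix("3.16.0"), has_fix("3.11.0"))
```

By this comparison, 3.16.0 is newer than 3.11.1, which matches the answer given above.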
If you have links in a PDF file (HTML anchor tags with an element id as href), they do not work after merging.