Corrupted PDF in 1.26 when older versions throw "ValueError: I/O operation on closed file" #263

TheFrostyboss · 2016-05-19T16:22:56Z

I was using v1.26 to splice and merge pages from different PDFs, and most of the output pages came out corrupted: the reader would display a blank page and say "There is an error displaying this page", or a page would keep the structure of the original PDF but have portions of it filled with random text.

When I rolled back to 1.24-1.25, the same code raised "ValueError: I/O operation on closed file", presumably because it was trying to write out pages from PDFs I had already closed after reading them in.

Debugging this kind of failure in the current version is extremely difficult, especially for someone not familiar with PyPDF2. I would always rather get an error that points me in the right direction than a silently corrupted output.

The original code:

from PyPDF2 import PdfFileReader, PdfFileWriter

full_path = r"path_to_pdf.pdf"
full_path2 = full_path.replace(".pdf", "2.pdf")
output_pdf = PdfFileWriter()
with open(full_path, "rb") as f:
    open_pdf = PdfFileReader(f)
    last_page = len(open_pdf.pages) - 1
    output_pdf.addPage(open_pdf.getPage(last_page))
# f is closed here, before the page has been written out

outputStream = open(full_path2, "wb")
output_pdf.write(outputStream)  # corrupted output in 1.26, ValueError in 1.24-1.25
outputStream.close()

To resolve the error, I replaced the with/open statement with a plain open(), so the input file stays open while the output is written:

f = open(full_path, "rb")
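For anyone who wants to keep the context manager, here is a minimal sketch of the same fix. It assumes the PyPDF2 1.x API used in this issue; the function name copy_last_page and its parameters are made up for illustration. The only point is that write() runs while the input file is still open.

```python
def copy_last_page(src_path, dst_path):
    """Copy the last page of src_path into a new PDF at dst_path."""
    # Assumption: PyPDF2 1.x (PdfFileReader / PdfFileWriter), as in this issue.
    # Imported inside the function so that defining it does not require PyPDF2.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    writer = PdfFileWriter()
    # Both files live in the same with-block, so the input is still open
    # when write() pulls the page data out of it.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        reader = PdfFileReader(src)
        writer.addPage(reader.getPage(reader.getNumPages() - 1))
        writer.write(dst)
```

Usage would be something like copy_last_page("path_to_pdf.pdf", "path_to_pdf2.pdf"). When merging several inputs, the same rule applies: every input has to stay open until the single write() call has returned.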

mstamy2 · 2016-05-19T22:23:10Z

26e5077 should restore the graceful failure (an error instead of corrupted output) seen prior to v1.26.0.

I suppose it's a little counterintuitive that input files must remain open during the write process...
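A rough illustration of why the inputs have to stay open, using only the standard library. LazyPage is a hypothetical stand-in, not a PyPDF2 class: it assumes a page object that remembers its source stream and only reads the bytes when asked, which is the behaviour this thread implies.

```python
import io


class LazyPage:
    """Hypothetical page that reads its bytes from the source on demand."""

    def __init__(self, stream, offset, length):
        self.stream = stream
        self.offset = offset
        self.length = length

    def read_content(self):
        # Nothing is cached: every call goes back to the source stream.
        self.stream.seek(self.offset)
        return self.stream.read(self.length)


source = io.BytesIO(b"header|page-one|page-two")

with source:
    page = LazyPage(source, offset=7, length=8)
    inside = page.read_content()  # fine, the stream is still open

try:
    page.read_content()  # the with-block has closed the stream
    error = None
except ValueError as exc:
    error = str(exc)  # the "I/O operation on closed file" error from the title
```

Creating the page inside the with-block succeeds because no data is read at that point; the failure only shows up later, when the content is actually needed.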

* commit '036789a4664e3f572292bc7dceec10f08b7dbf62':
    Write binary data comment
    Python 3 type fixes in LZWDecode
    Appropriate error message for closed file, warn when returning null object, resolves py-pdf#263
    Read Indirect Objects with a sign, fixes py-pdf#248
    Version 1.26.0 update
    Fix a bug in _readInlineImage. We were looking for the operation EI and Q, but were not checking to ensure that there was whitespace between EI and Q. Accordingly, any image that had EIQ in its ascii encoded data would trigger the end of the image, and cause errors.
    Remove extraneous zeros from the standard formatting.
    Remove extraneous zeros from the standard formatting.
    Ignore xref table zero index error if self.strict = False
    Working around unresolved objects and returning NullObject instead of raising a ValueError.
    Python 3 compatibility with inline images
    Python2/3 compatibility on merging pages with eps img into single page
    Adding unit tests for addJS.
    Parameterized JavaScript.
    Added convenience method for retrieving form text fields

mstamy2 closed this as completed in 26e5077 May 19, 2016

RussellLuo mentioned this issue Apr 11, 2019

new PDF file has proper bookmarks but blank content RussellLuo/pdfbookmarker#6

Closed
