BUG: Use 1MB as offset for readNextEndLine (#321)
Try to find “%%EOF” in the last 1 MB of the file.

This fixes the issue with reading Selenium-generated PDF files.

Closes #177
Closes #442
Closes #480
akolpakov authored Apr 21, 2022
1 parent b36a564 commit db1e458
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions PyPDF2/pdf.py
@@ -1820,12 +1820,12 @@ def read(self, stream):
         stream.seek(-1, 2)
         if not stream.tell():
             raise PdfReadError('Cannot read an empty file')
-        last1K = stream.tell() - 1024 + 1 # offset of last 1024 bytes of stream
+        last1M = stream.tell() - 1024 * 1024 + 1 # offset of last MB of stream
         line = b_('')
         while line[:5] != b_("%%EOF"):
-            if stream.tell() < last1K:
+            if stream.tell() < last1M:
                 raise PdfReadError("EOF marker not found")
-            line = self.readNextEndLine(stream, last1K)
+            line = self.readNextEndLine(stream)
             if debug: print(" line:",line)

         # find startxref entry - the location of the xref table
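
The effect of widening the search window can be illustrated with a minimal standalone sketch. It is not the PyPDF2 implementation: `find_eof_marker` and its `window` parameter are hypothetical names, and it reads the whole tail at once and searches it with `bytes.rfind` instead of walking backwards line by line as `readNextEndLine` does. It also accepts the marker anywhere in the window, whereas the loop above only checks the start of each line.

```python
import io

EOF_MARKER = b"%%EOF"
WINDOW = 1024 * 1024  # search the last 1 MB, the size chosen in this commit


def find_eof_marker(stream, window=WINDOW):
    """Return the absolute offset of the last %%EOF in the final `window` bytes.

    Hypothetical helper for illustration only; not part of PyPDF2.
    """
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    if size == 0:
        raise ValueError("Cannot read an empty file")

    # Clamp to 0 so files smaller than the window are searched in full.
    start = max(0, size - window)
    stream.seek(start)
    tail = stream.read(size - start)

    pos = tail.rfind(EOF_MARKER)
    if pos == -1:
        raise ValueError("EOF marker not found")
    return start + pos
```

For a file whose `%%EOF` is followed by several kilobytes of trailing bytes, calling this with `window=1024` raises `ValueError`, while the default 1 MB window finds the marker. That is the kind of failure the commit message describes for the affected PDF files, assuming their trailing data exceeds the old 1024-byte limit.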
