PyPDF2.utils.PdfReadError: Creating EncodedStreamObject is not currently supported #656

Closed
ghost opened this issue Mar 13, 2022 · 3 comments · Fixed by #1854
Labels
is-feature A feature request

Comments

ghost commented Mar 13, 2022

My task is to find and replace text in a PDF. I used the PyPDF2 package to do the replacement, but when I try to write the modified content back I get the following error:

Traceback (most recent call last):
  File "c:\practice_python\sample.py", line 41, in <module>
    page.getContents().setData(replaced_text)
  File "C:\Users\Win\AppData\Local\Programs\Python\Python310\lib\site-packages\PyPDF2\generic.py", line 852, in setData
    raise utils.PdfReadError("Creating EncodedStreamObject is not currently supported")
PyPDF2.utils.PdfReadError: Creating EncodedStreamObject is not currently supported

and my code is

from PyPDF2 import PdfFileReader, PdfFileWriter

replacements = [
    ("HARIHARAN S", "<your name>")
]

pdf = PdfFileReader(open("samplenda.pdf", "rb"))
writer = PdfFileWriter()

for page in pdf.pages:
    contents = page.getContents().getData()
    print(type(contents))
    # Chain the replacements so each one builds on the previous result
    replaced_text = contents
    for (a, b) in replacements:
        replaced_text = replaced_text.replace(bytes(a, 'utf-8'), bytes(b, 'utf-8'))
    print(type(replaced_text))
    # This is the call that raises the PdfReadError shown in the traceback above
    page.getContents().setData(replaced_text)
    writer.addPage(page)

with open("modified.pdf", "wb") as f:
    writer.write(f)

I have tried many different approaches without success. Please help me solve this error.

@MartinThoma MartinThoma added the is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF label Apr 6, 2022
@MartinThoma MartinThoma added the is-feature A feature request label Apr 16, 2022
@MartinThoma MartinThoma removed the is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF label Mar 15, 2023
pubpub-zz added a commit to pubpub-zz/pypdf that referenced this issue May 22, 2023
add set_data() for encoded streams
also, complete FlateEncode to get all required attributes
Ease data manipulation without going through ContentStream (slow)
closes py-pdf#656
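
To illustrate what setting data on a Flate-encoded stream involves, here is a minimal standard-library sketch: the payload is decompressed, modified, and compressed again. This is not pypdf's implementation; the function name and the sample content stream are hypothetical, and a real PDF stream would also need its /Length entry updated to match the new payload.

```python
import zlib


def replace_in_flate_stream(encoded: bytes, old: bytes, new: bytes) -> bytes:
    """Decode a Flate-compressed payload, replace bytes, and re-encode it."""
    decoded = zlib.decompress(encoded)    # undo the FlateDecode filter
    modified = decoded.replace(old, new)  # edit the decoded bytes
    return zlib.compress(modified)        # re-apply the filter


# Hypothetical content stream, for illustration only.
original = zlib.compress(b"BT /F1 12 Tf (HARIHARAN S) Tj ET")
updated = replace_in_flate_stream(original, b"HARIHARAN S", b"<your name>")
print(zlib.decompress(updated))
```

The encoded bytes cannot be edited in place, because the search text does not appear literally in the compressed payload. This is why an encoded stream needs a dedicated way to set its data.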
@pubpub-zz
Collaborator

pubpub-zz commented May 22, 2023

I've opened a PR that adds set_data() to EncodedStreamObject. Note, however, that get_contents() returns a ContentStream object, whose data is processed through operations. If you want the content as an EncodedStreamObject, you have to access the ["/Contents"] entry directly and handle the case where it is split into an array of streams.
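
The approach described above could look roughly like the following sketch. It assumes a pypdf release that includes the set_data() change from the referenced PR; the helper names (content_streams, replace_in_page, rewrite_pdf) are hypothetical and not part of pypdf. It also assumes the text appears as literal bytes in the content stream, which is often not the case when a PDF uses custom font encodings, hex strings, or splits text across several operators.

```python
from typing import Iterable, List, Tuple


def content_streams(page) -> List:
    """Return the page's raw content stream objects as a list.

    "/Contents" may be a single stream or an array of streams, and
    array entries may be indirect references, hence get_object().
    """
    if "/Contents" not in page:
        return []
    contents = page["/Contents"].get_object()
    if isinstance(contents, list):  # pypdf's ArrayObject subclasses list
        return [item.get_object() for item in contents]
    return [contents]


def replace_in_page(page, replacements: Iterable[Tuple[bytes, bytes]]) -> None:
    """Apply byte-level replacements to every content stream of a page."""
    replacements = list(replacements)
    for stream in content_streams(page):
        data = stream.get_data()  # decoded (uncompressed) bytes
        for old, new in replacements:
            data = data.replace(old, new)
        stream.set_data(data)  # assumption: available on encoded streams


def rewrite_pdf(src: str, dst: str, replacements) -> None:
    """Copy src to dst, applying the replacements to each page."""
    # Third-party dependency, imported here so the helpers above
    # can be used and tested without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src)
    writer = PdfWriter()
    for page in reader.pages:
        replace_in_page(page, replacements)
        writer.add_page(page)
    with open(dst, "wb") as f:
        writer.write(f)
```

With the file names from the original report, a call would be rewrite_pdf("samplenda.pdf", "modified.pdf", [(b"HARIHARAN S", b"<your name>")]). Streams that use a filter other than FlateDecode may still be rejected by set_data().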

@SpastBanana

Hi,

Is there any update on this? I got the same error as @ghost

@stefan6419846
Collaborator

Please open a new issue, filling all the necessary details.
