AttributeError: 'DictionaryObject' object has no attribute 'indirect_reference' #1614

Closed
MonsterDruide1 opened this issue Feb 6, 2023 · 9 comments · Fixed by #1616
Labels
is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF

Comments

@MonsterDruide1

I'm trying to merge a few PDF files into a single file with this library by appending them one by one to a new document. However, some of the input files raise the error in the title when I call append(fileobj=open(...)).

I suspect it has something to do with the image shown in the example PDF. It might not be "imported" properly, as it also disappears from the slide when I use the "Black" feature of Adobe Acrobat.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-5.15.79.1-microsoft-standard-WSL2-x86_64-with-glibc2.35

$ python -c "import pypdf;print(pypdf.__version__)"
3.4.0

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf import PdfWriter

output_pdf_stream = PdfWriter()
output_pdf_stream.append(fileobj=open("broke.pdf", "rb"))
Full example file
from pypdf import PdfWriter

def main():
    output_pdf_stream = PdfWriter()
    output_pdf_stream.append(fileobj=open("broke.pdf", "rb"))  # raises AttributeError

    # does not reach here
    # create output pdf file
    output_pdf_name = "test.pdf"
    output_pdf_file = open(output_pdf_name, "wb")
    try:
        output_pdf_stream.write(output_pdf_file)
    finally:
        output_pdf_file.close()

    output_pdf_stream.close()

    print("%s successfully created." % output_pdf_name)


if __name__ == "__main__":
    main()

I don't own the rights to any of the contents on this slide, but I'm sure some of your magicians can craft a similar file for the testing environments.
broke.pdf

Traceback

This is the complete Traceback I see:

Traceback (most recent call last):
  File "/mnt/d/Eigene_Dateien/Downloads/Netzsicherheit 1 (212012-WiSe2223)/minimal.py", line 22, in <module>
    main()
  File "/mnt/d/Eigene_Dateien/Downloads/Netzsicherheit 1 (212012-WiSe2223)/minimal.py", line 5, in main
    output_pdf_stream.append(fileobj=open("broke.pdf", "rb"))
  File "/usr/local/lib/python3.10/dist-packages/pypdf/_writer.py", line 2476, in append
    self.merge(
  File "/usr/local/lib/python3.10/dist-packages/pypdf/_utils.py", line 441, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pypdf/_writer.py", line 2599, in merge
    lst = self._insert_filtered_annotations(
  File "/usr/local/lib/python3.10/dist-packages/pypdf/_writer.py", line 2768, in _insert_filtered_annotations
    outlist.append(ano.clone(self).indirect_reference)
AttributeError: 'DictionaryObject' object has no attribute 'indirect_reference'
@MonsterDruide1
Author

Note that very few files show this weird behaviour. Out of the test set of 60 similar files by the same author, only 2 of them had the issue described above.
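
To find out which inputs in a larger batch trigger the error, a loop along these lines may help. It is only a sketch: `append_all` is a hypothetical helper, not part of pypdf, and it assumes the writer's `append` accepts a file path, as `PdfWriter.append` does. A writer that raised partway through an append may be left partially updated, so treat the result as a diagnostic rather than a finished merge.

```python
def append_all(writer, paths):
    """Append each path to `writer` and collect the ones that fail.

    `writer` is any object with an `append(path)` method, for example a
    pypdf `PdfWriter`. Returns a list of (path, error message) tuples.
    """
    failed = []
    for path in paths:
        try:
            writer.append(path)
        except AttributeError as exc:
            # The error reported in this issue is an AttributeError,
            # so only that exception type is caught here.
            failed.append((path, str(exc)))
    return failed


# Possible usage (assumes pypdf is installed and the files exist):
#   from pypdf import PdfWriter
#   for path, message in append_all(PdfWriter(), ["a.pdf", "broke.pdf"]):
#       print(path, message)
```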

@pubpub-zz
Collaborator

The issue is caused by the /Annots array containing direct dictionary objects rather than indirect objects. I've improved robustness. Can you confirm that the PR fixes all your cases?
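
The distinction between direct and indirect entries can be illustrated with a toy model in plain Python. Nothing here is pypdf code: `Ref` and `split_annots` are hypothetical names, and an annotation array is represented as an ordinary list whose items are either references to numbered objects or inline dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Toy stand-in for an indirect reference such as '12 0 R'."""
    obj_num: int
    generation: int = 0


def split_annots(annots):
    """Separate indirect references from inline (direct) dictionaries."""
    indirect = [a for a in annots if isinstance(a, Ref)]
    direct = [a for a in annots if isinstance(a, dict)]
    return indirect, direct


# Common layout: every annotation is a reference to a numbered object.
typical = [Ref(12), Ref(13)]

# Layout like the one described above: one annotation is written inline.
unusual = [Ref(12), {"/Subtype": "/Link"}]

indirect, direct = split_annots(unusual)
print(len(indirect), "indirect,", len(direct), "direct")
```

Code that assumes every entry is a reference works on the first layout and breaks on the second, which would be consistent with only a few files out of a larger set failing.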

@MonsterDruide1
Author

Sorry, I'm not too familiar with Python environments. How can I install that custom version for testing?

@pubpub-zz
Collaborator

The easiest solution I can propose is to edit _writer.py in pypdf at line 2768 and replace:
outlist.append(ano.clone(self).indirect_reference)
with
outlist.append(self._add_object(ano.clone(self)))
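
Why the replacement helps can be sketched with a toy model. This is not pypdf's implementation: `ToyRef`, `ToyDict` and `ToyWriter` are hypothetical stand-ins, and the model covers only the direct-dictionary case, under the assumption that a cloned direct dictionary has no `indirect_reference` until the writer registers it.

```python
class ToyRef:
    """Toy stand-in for an indirect reference issued by the writer."""

    def __init__(self, num):
        self.num = num


class ToyDict(dict):
    """Toy direct dictionary: it has no `indirect_reference` attribute."""

    def clone(self, writer):
        # The copy is still a direct object; nothing registers it.
        return ToyDict(self)


class ToyWriter:
    def __init__(self):
        self.objects = []

    def _add_object(self, obj):
        # Register the object and hand back a reference to it.
        self.objects.append(obj)
        ref = ToyRef(len(self.objects))
        obj.indirect_reference = ref
        return ref


writer = ToyWriter()
ano = ToyDict({"/Subtype": "/Link"})

# Original pattern: attribute lookup on an unregistered direct object fails.
try:
    ano.clone(writer).indirect_reference
except AttributeError as exc:
    error_message = str(exc)

# Replacement pattern: register the clone and use the returned reference.
ref = writer._add_object(ano.clone(writer))
```

In this model the second pattern works whether or not the clone already carries a reference, because the reference comes from the registration call rather than from an attribute that may be missing.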

@MonsterDruide1
Author

Okay, I got it to work by first uninstalling pypdf manually and then installing it again with this command:
sudo pip install git+https://github.com/pubpub-zz/PyPDF2@iss1614

Seems to have fully solved the problem, thanks!

pubpub-zz added the is-bug and soon ("PRs that are almost ready to be merged, issues that get solved pretty soon") labels on Feb 7, 2023
@mchesterkadwell

mchesterkadwell commented Feb 9, 2023

I have come across the same issue, but failing on a different line of code in _writer.py. I have installed the fixed version, as above, but this has not fixed it for me.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
macOS-10.15.7-x86_64-i386-64bit

$ python -c "import pypdf;print(pypdf.__version__)"
3.4.0

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf import PdfWriter

output_pdf_stream = PdfWriter()
output_pdf_stream.append(fileobj=open("broken.pdf", "rb"))

Full example file

from pypdf import PdfWriter

def main():
    output_pdf_name = "test.pdf"
    output_pdf_file = open(output_pdf_name, "wb")

    output_pdf_stream = PdfWriter()
    output_pdf_stream.append(fileobj=open("broken.pdf", "rb"))

    # does not reach here
    try:
        output_pdf_stream.write(output_pdf_file)
    finally:
        output_pdf_file.close()
        output_pdf_stream.close()

    print("%s successfully created." % output_pdf_name)

if __name__ == '__main__':
    main()

broken.pdf

Traceback

Traceback (most recent call last):
  File "/Users/mary/project/main.py", line 25, in <module>
    main()
  File "/Users/mary/project/main.py", line 11, in main
    output_pdf_stream.append(fileobj=open("broken.pdf", "rb"))
  File "/Users/mary/project/venv/lib/python3.10/site-packages/pypdf/_writer.py", line 2476, in append
    self.merge(
  File "/Users/mary/project/venv/lib/python3.10/site-packages/pypdf/_utils.py", line 441, in wrapper
    return func(*args, **kwargs)
  File "/Users/mary/project/venv/lib/python3.10/site-packages/pypdf/_writer.py", line 2614, in merge
    .indirect_reference
AttributeError: 'DictionaryObject' object has no attribute 'indirect_reference'

@pubpub-zz
Collaborator

pubpub-zz commented Feb 9, 2023

@mchesterkadwell
Thanks for your report. I will complete the PR.

edit: PR is available if you want to try it

pubpub-zz added a commit to pubpub-zz/pypdf that referenced this issue Feb 9, 2023
MartinThoma removed the soon label on Feb 26, 2023
@jainharsh97

jainharsh97 commented Mar 18, 2023

Hey @MartinThoma,
I am currently facing the same issue in _writer.py at https://github.com/py-pdf/pypdf/blob/main/pypdf/_writer.py#L2876

@pubpub-zz
Collaborator

@jainharsh97,
Can you raise a new issue with all the details so it can be tracked? This issue and the associated PR are closed.
