AttributeError: 'pageObject' object has no attribute 'has_key' #640

Closed
gilramanoj opened this issue Oct 4, 2021 · 5 comments
Labels
is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF


@gilramanoj

PyPDF2 is throwing the attached error:

[image: pyPDF2]

What I actually need is to list the broken links in a PDF file. Please suggest an approach.

Regards,
Manoj

@Joshua-IRT

Which version of Python are you using? The .has_key method was removed in Python 3.0: https://docs.python.org/3.1/whatsnew/3.0.html#builtins
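
For illustration, here is a minimal sketch of the usual Python 3 replacements for `.has_key`. A plain `dict` stands in for any dictionary-like object; the variable names and sample keys are made up for the example and are not taken from the reporter's code.

```python
# A plain dict stands in for any dictionary-like object (assumption:
# the reporter's object supports the same membership protocol).
page = {"/Type": "/Page", "/MediaBox": [0, 0, 612, 792]}

# Python 2 style, which raises AttributeError on Python 3:
#     page.has_key("/Type")

# Python 3 style: use the `in` operator for membership tests.
has_type = "/Type" in page
has_annots = "/Annots" in page

# When a default is wanted for a missing key, .get() avoids a separate test.
annots = page.get("/Annots", [])

print(has_type, has_annots, annots)
```

The `in` form also works on Python 2, so code written this way runs on both.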

@gilramanoj
Author

gilramanoj commented Oct 5, 2021 via email

@MartinThoma MartinThoma added the is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF label Apr 6, 2022
@johns1c

johns1c commented Apr 7, 2022

Dear Manoj,

Can you attach or link to sample PDFs, ideally containing both good and broken links? I will have a go at testing this.

Chris Johnson

@pubpub-zz
Collaborator

@gilramanoj,
Can you please provide the PDF and the code so we can investigate? Thanks.

@MasterOdin
Member

MasterOdin commented May 22, 2022

Closing this as it's an issue in the user's code, not in PyPDF2 itself.

As @Joshua-IRT indicates, the .has_key method on dictionary-like objects (which PageObject is) was removed in Python 3; the recommended replacement is the in operator.

An example:

>>> from PyPDF2 import PdfReader
>>> pdf = PdfReader('PDF_Samples/GeoBase_NHNC1_Data_Model_UML_EN.pdf')
>>> pageObject = pdf.pages[0]
>>> pageObject
{'/Type': '/Page', '/Parent': IndirectObject(2, 0), '/Resources': {'/Font': {'/F1': IndirectObject(5, 0), '/F2': IndirectObject(8, 0), '/F3': IndirectObject(10, 0), '/F4': IndirectObject(12, 0), '/F5': IndirectObject(17, 0)}, '/XObject': {'/Image7': IndirectObject(7, 0), '/Image21': IndirectObject(21, 0)}, '/ProcSet': ['/PDF', '/Text', '/ImageB', '/ImageC', '/ImageI']}, '/Annots': [IndirectObject(19, 0), IndirectObject(20, 0)], '/MediaBox': [0, 0, 612, 792], '/Contents': IndirectObject(4, 0), '/Group': {'/Type': '/Group', '/S': '/Transparency', '/CS': '/DeviceRGB'}, '/Tabs': '/S', '/StructParents': 0}
>>> 'foo' in pageObject
False
>>> '/Type' in pageObject
True
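
On the original goal of listing links: the same `in` test can guard access to `/Annots` before collecting link targets. The sketch below is a rough illustration, not PyPDF2's own API. `resolve` and `link_uris` are hypothetical helpers, and the code assumes the usual PDF layout in which a link annotation has `/Subtype` `/Link` and an action dictionary `/A` holding a `/URI` entry. It is exercised on plain dicts here; it only gathers URIs and does not check whether they are reachable.

```python
def resolve(obj):
    """Follow an indirect reference if the object offers get_object().

    Hypothetical helper: plain dicts and lists are returned unchanged.
    """
    getter = getattr(obj, "get_object", None)
    return getter() if callable(getter) else obj


def link_uris(page):
    """Collect the /URI targets of link annotations on one page-like mapping."""
    uris = []
    if "/Annots" not in page:  # the Python 3 replacement for has_key
        return uris
    for ref in resolve(page["/Annots"]):
        annot = resolve(ref)
        if annot.get("/Subtype") != "/Link":
            continue  # skip non-link annotations such as text notes
        action = resolve(annot.get("/A", {}))
        if "/URI" in action:
            uris.append(str(action["/URI"]))
    return uris


# Made-up page data for illustration only.
sample_page = {
    "/Type": "/Page",
    "/Annots": [
        {"/Subtype": "/Link", "/A": {"/S": "/URI", "/URI": "https://example.com/"}},
        {"/Subtype": "/Text", "/Contents": "a note"},
        {"/Subtype": "/Link", "/Dest": "chapter1"},  # internal link, no URI
    ],
}

print(link_uris(sample_page))
```

With a real file, each element of `pdf.pages` from the session above could presumably be passed to `link_uris`; deciding whether a collected URI is broken would then need a separate HTTP check.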
