MAINT: Handle XML error when reading XmpInformation #1030

Merged: 4 commits, Jun 30, 2022
Changes from 1 commit
11 changes: 10 additions & 1 deletion PyPDF2/xmp.py
@@ -14,6 +14,9 @@
 from xml.dom.minidom import Document
 from xml.dom.minidom import Element as XmlElement
 from xml.dom.minidom import parseString
+from xml.parsers.expat import ExpatError
+
+from PyPDF2.errors import PdfReadError

 from ._utils import StreamType, deprecate_with_replacement
 from .generic import ContentStream, PdfObject
@@ -199,11 +202,17 @@ class XmpInformation(PdfObject):
     """
     An object that represents Adobe XMP metadata.
     Usually accessed by :py:attr:`xmp_metadata()<PyPDF2.PdfReader.xmp_metadata>`
+
+    :raises: PdfReadError if XML is invalid
     """

     def __init__(self, stream: ContentStream) -> None:
         self.stream = stream
-        doc_root: Document = parseString(self.stream.get_data())
+        try:
+            data = self.stream.get_data()
+            doc_root: Document = parseString(data)
+        except ExpatError:
+            raise PdfReadError("XML in XmpInformation was invalid")

Consider including the diagnostic information contained in the ExpatError object (line, offset, kind of XML issue).

Example:

except ExpatError as e:
    print("Got some XML error. " + str(e))
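
One way the suggestion could be folded into the raised exception rather than printed is sketched below. This is only an illustration, not the code that was merged: `parse_xmp` is a hypothetical helper name, and `PdfReadError` is redefined locally as a stand-in for `PyPDF2.errors.PdfReadError` so the snippet runs without PyPDF2 installed. It relies on the `code`, `lineno`, and `offset` attributes of `ExpatError` and on `xml.parsers.expat.ErrorString`, all from the standard library.

```python
from xml.dom.minidom import Document, parseString
from xml.parsers.expat import ErrorString, ExpatError


class PdfReadError(Exception):
    """Local stand-in for PyPDF2.errors.PdfReadError (assumption for this sketch)."""


def parse_xmp(data: bytes) -> Document:
    """Hypothetical helper: parse XMP bytes, reporting where the XML is broken."""
    try:
        return parseString(data)
    except ExpatError as e:
        # ExpatError exposes the expat error code and the position of the failure;
        # ErrorString turns the numeric code into a readable description.
        raise PdfReadError(
            f"XML in XmpInformation was invalid: {ErrorString(e.code)} "
            f"(line {e.lineno}, offset {e.offset})"
        ) from e
```

Chaining with `from e` keeps the original `ExpatError` available as `__cause__`, so callers that only catch `PdfReadError` still see the full traceback.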

         self.rdf_root: XmlElement = doc_root.getElementsByTagNameNS(
             RDF_NAMESPACE, "RDF"
         )[0]