'utf-16-be' codec can't decode byte 0xXY in position Z: truncated data #988
Labels
Has MCVE
A minimal, complete and verifiable example helps a lot to debug / understand feature requests
is-robustness-issue
From a user's perspective, this is about robustness
workflow-text-extraction
From a user's perspective, text extraction is the affected feature/workflow
Comments
MartinThoma added the is-bug (a violation of the expected behavior with a compliant PDF), workflow-text-extraction, and Has MCVE labels on Jun 14, 2022
MartinThoma added the is-robustness-issue label and removed the is-bug label on Jun 14, 2022
The data does not respect the expected encoding. A robustness improvement is proposed in the referenced PR.
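For illustration, the error in the title can be reproduced without any PDF: Python's `utf-16-be` codec needs two bytes per code unit, so a byte string of odd length fails with "truncated data". The sketch below shows that failure and one possible lenient fallback. The helper name `decode_utf16_be_lenient` is hypothetical, and this is not the change made in the referenced PR, only a minimal example of tolerating such input.

```python
def decode_utf16_be_lenient(data: bytes) -> str:
    """Hypothetical helper: decode UTF-16-BE, tolerating malformed input."""
    try:
        return data.decode("utf-16-be")
    except UnicodeDecodeError:
        # Assumption: losing the malformed tail is acceptable for text
        # extraction, so undecodable bytes are replaced instead of raising.
        return data.decode("utf-16-be", errors="replace")


# Five bytes: two complete code units ("A", "B") plus one dangling byte.
truncated = b"\x00A\x00B\x00"

try:
    truncated.decode("utf-16-be")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = str(exc)

print(strict_error)
print(repr(decode_utf16_be_lenient(truncated)))
```

Strict decoding raises, while the lenient helper returns the text that could be decoded.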
pubpub-zz added a commit to pubpub-zz/pypdf that referenced this issue on Jun 14, 2022
the data bytes are not matching encoding expectation
When trying to extract the text from a PDF, I get an exception.
Environment
$ python -m platform
Linux-5.4.0-113-generic-x86_64-with-glibc2.31
$ python -c "import PyPDF2;print(PyPDF2.__version__)"
2.2.0
MCVE
This is a minimal, complete example that shows the issue with the pdf 971703.pdf: