DOC: Robustness #785

Merged
merged 1 commit into from
Apr 19, 2022
2 changes: 1 addition & 1 deletion README.md
@@ -1,7 +1,7 @@
[![PyPI version](https://badge.fury.io/py/PyPDF2.svg)](https://badge.fury.io/py/PyPDF2)
[![Python Support](https://img.shields.io/pypi/pyversions/PyPDF2.svg)](https://pypi.org/project/PyPDF2/)
[![](https://img.shields.io/badge/-documentation-green)](https://pypdf2.readthedocs.io/en/latest/)
-![GitHub last commit](https://img.shields.io/github/last-commit/py-pdf/PyPDF2)
+[![GitHub last commit](https://img.shields.io/github/last-commit/py-pdf/PyPDF2)](https://github.com/py-pdf/PyPDF2)
[![codecov](https://codecov.io/gh/py-pdf/PyPDF2/branch/main/graph/badge.svg?token=id42cGNZ5Z)](https://codecov.io/gh/py-pdf/PyPDF2)

# PyPDF2
3 changes: 2 additions & 1 deletion docs/index.rst
@@ -19,6 +19,7 @@ You can contribute to `PyPDF2 on Github <https://github.com/py-pdf/PyPDF2>`_.
:maxdepth: 1

user/installation
user/robustness
user/metadata
user/extract-text
user/encryption-decryption
@@ -36,9 +37,9 @@ You can contribute to `PyPDF2 on Github <https://github.com/py-pdf/PyPDF2>`_.
:maxdepth: 1

modules/PdfFileReader
modules/PdfFileWriter
modules/PdfFileMerger
modules/PageObject
modules/PdfFileWriter
modules/DocumentInformation
modules/XmpInformation
modules/Destination
40 changes: 40 additions & 0 deletions docs/user/robustness.md
@@ -0,0 +1,40 @@
# Robustness and strict=False

PDF is [specified in various versions](https://www.pdfa.org/resource/pdf-specification-index/).
The specification of PDF 1.7 has 978 pages. This length makes it hard to get
everything right. As a consequence, many PDF files do not strictly follow the
specification.

If a PDF file does not follow the specification, it is not always possible to
be certain what the intended effect would be. Think of the following broken
Python code as an example:

```python
# Broken
function (foo, bar):

# Potentially intended:
def function(foo, bar):
    ...

# Also possible:
function = (foo, bar)
```

When writing a parser, you can take one of two paths: either you are forgiving
and try to figure out what the user intended, or you are strict and tell the
user to fix the input.
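
The two paths can be sketched with a toy parser. This is only an illustration
of the idea, not PyPDF2 code: the `parse_pairs` function and the `key=value`
input format are hypothetical and chosen for brevity.

```python
import logging

logger = logging.getLogger("toy_parser")


def parse_pairs(text, strict=True):
    """Parse 'key=value' lines into a dict (hypothetical toy format)."""
    result = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are always allowed
        key, sep, value = line.partition("=")
        if not sep:
            message = f"line {lineno}: expected 'key=value', got {line!r}"
            if strict:
                # Strict path: refuse the input and let the user fix it.
                raise ValueError(message)
            # Forgiving path: log a warning, skip the line, keep going.
            logger.warning(message)
            continue
        result[key.strip()] = value.strip()
    return result
```

With `strict=False`, the input `"a=1\noops\nb=2"` yields both valid pairs and
one logged warning. With `strict=True`, the same input raises `ValueError`.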

PyPDF2 gives you the option to be strict or not.

PyPDF2 has three core objects and all of them have a `strict` parameter:

* [`PdfFileReader`](https://pypdf2.readthedocs.io/en/latest/modules/PdfFileReader.html)
* [`PdfFileWriter`](https://pypdf2.readthedocs.io/en/latest/modules/PdfFileWriter.html)
* [`PdfFileMerger`](https://pypdf2.readthedocs.io/en/latest/modules/PdfFileMerger.html)

Choosing `strict=True` means that PyPDF2 will raise an exception if a PDF does
not follow the specification.

Choosing `strict=False` means that PyPDF2 will try to be forgiving and do
something reasonable, but it will log a warning message.
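
As a minimal sketch of how the parameter might be passed, assuming a PyPDF2
1.x release where `PdfFileReader` accepts `strict` and provides
`getNumPages()`: the helper name `count_pages` and the file path in the usage
note are hypothetical.

```python
def count_pages(path, strict=False):
    """Return the page count of the PDF at `path` (hypothetical helper)."""
    # Imported lazily so this module loads even without PyPDF2 installed.
    # Assumption: a PyPDF2 1.x release that still ships PdfFileReader.
    from PyPDF2 import PdfFileReader

    with open(path, "rb") as fh:
        # strict=False asks PyPDF2 to tolerate spec violations where it can;
        # strict=True asks it to raise an exception instead.
        reader = PdfFileReader(fh, strict=strict)
        return reader.getNumPages()
```

Calling `count_pages("example.pdf", strict=True)` is a quick way to check
whether a file that reads fine in forgiving mode has specification problems.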