DOC: Robustness (py-pdf#785)

VictorCarlquist · Apr 29, 2022 · 21b5294
1 parent 6870b65
commit 21b5294
Showing 3 changed files with 43 additions and 2 deletions.
diff --git a/README.md b/README.md
@@ -1,7 +1,7 @@
 [![PyPI version](https://badge.fury.io/py/PyPDF2.svg)](https://badge.fury.io/py/PyPDF2)
 [![Python Support](https://img.shields.io/pypi/pyversions/PyPDF2.svg)](https://pypi.org/project/PyPDF2/)
 [![](https://img.shields.io/badge/-documentation-green)](https://pypdf2.readthedocs.io/en/latest/)
-![GitHub last commit](https://img.shields.io/github/last-commit/py-pdf/PyPDF2)
+[![GitHub last commit](https://img.shields.io/github/last-commit/py-pdf/PyPDF2)](https://github.com/py-pdf/PyPDF2)
 [![codecov](https://codecov.io/gh/py-pdf/PyPDF2/branch/main/graph/badge.svg?token=id42cGNZ5Z)](https://codecov.io/gh/py-pdf/PyPDF2)
 
 # PyPDF2

diff --git a/docs/index.rst b/docs/index.rst
@@ -19,6 +19,7 @@ You can contribute to `PyPDF2 on Github <https://github.com/py-pdf/PyPDF2>`_.
    :maxdepth: 1
 
    user/installation
+   user/robustness
    user/metadata
    user/extract-text
    user/encryption-decryption
@@ -36,9 +37,9 @@ You can contribute to `PyPDF2 on Github <https://github.com/py-pdf/PyPDF2>`_.
    :maxdepth: 1
 
    modules/PdfFileReader
+   modules/PdfFileWriter
    modules/PdfFileMerger
    modules/PageObject
-   modules/PdfFileWriter
    modules/DocumentInformation
    modules/XmpInformation
    modules/Destination

diff --git a/docs/user/robustness.md b/docs/user/robustness.md
@@ -0,0 +1,40 @@
+# Robustness and strict=False
+
+PDF is [specified in various versions](https://www.pdfa.org/resource/pdf-specification-index/).
+The PDF 1.7 specification alone has 978 pages. This length makes it hard to
+get everything right. As a consequence, many PDF files do not strictly follow
+the specification.
+
+If a PDF file does not follow the specification, it is not always possible to
+be certain what the intended effect would be. Think of the following broken
+Python code as an example:
+
+```python
+# Broken
+function (foo, bar):
+
+# Potentially intended:
+def function(foo, bar):
+    ...
+
+# Also possible:
+function = (foo, bar)
+```
+
+When writing a parser, you can take one of two paths: either you are forgiving
+and try to figure out what the user intended, or you are strict and tell the
+user that they should fix their input.
+
+PyPDF2 gives you the option to be strict or not.
+
+PyPDF2 has three core objects and all of them have a `strict` parameter:
+
+* [`PdfFileReader`](https://pypdf2.readthedocs.io/en/latest/modules/PdfFileReader.html)
+* [`PdfFileWriter`](https://pypdf2.readthedocs.io/en/latest/modules/PdfFileWriter.html)
+* [`PdfFileMerger`](https://pypdf2.readthedocs.io/en/latest/modules/PdfFileMerger.html)
+
+Choosing `strict=True` means that PyPDF2 will raise an exception if a PDF does
+not follow the specification.
+
+Choosing `strict=False` means that PyPDF2 will try to be forgiving and do
+something reasonable, but it will log a warning message.
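
Note on the behaviour described in the new `robustness.md` page: the strict versus forgiving trade-off can be sketched with a toy parser. This is only an illustration under stated assumptions, not PyPDF2's implementation. The names `ParseError` and `parse_version_header` are hypothetical, and the "format" is reduced to a single `%PDF-x.y` header line. The warning is emitted through Python's standard `warnings` module.

```python
import warnings


class ParseError(ValueError):
    """Raised by the toy parser when the input cannot be interpreted."""


def parse_version_header(line: str, strict: bool = True) -> str:
    """Return the version from a header line such as '%PDF-1.7'.

    Hypothetical helper for illustration only; not part of PyPDF2.
    """
    prefix = "%PDF-"

    # Well-formed input: the marker is the very first thing on the line.
    if line.startswith(prefix):
        return line[len(prefix):].strip()

    # Malformed input, strict mode: refuse and tell the user to fix it.
    if strict:
        raise ParseError(f"header must start with {prefix!r}, got {line!r}")

    # Malformed input, forgiving mode: warn, then guess what was intended
    # by looking for the marker anywhere in the line.
    warnings.warn(f"malformed header {line!r}; searching for {prefix!r}")
    index = line.find(prefix)
    if index == -1:
        # Nothing reasonable can be recovered, so even forgiving mode fails.
        raise ParseError(f"no {prefix!r} marker found in {line!r}")
    return line[index + len(prefix):].strip()
```

With `strict=False`, a header with leading junk such as `"junk%PDF-1.4"` is still parsed, at the cost of a warning; with the default `strict=True` the same input raises `ParseError`. In PyPDF2 itself the flag is passed when constructing the object, for example `PdfFileReader(stream, strict=False)`.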