forked from py-pdf/pypdf
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent 6870b65, commit 21b5294
Showing 3 changed files with 43 additions and 2 deletions.
@@ -0,0 +1,40 @@
# Robustness and strict=False

PDF is [specified in various versions](https://www.pdfa.org/resource/pdf-specification-index/).
The specification of PDF 1.7 has 978 pages. This length makes it hard to get
everything right. As a consequence, many PDF files do not strictly follow the
specification.

If a PDF file does not follow the specification, it is not always possible to
be certain what the intended effect would be. Think of the following broken
Python code as an example:

```python
# Broken
function (foo, bar):

# Potentially intended:
def function(foo, bar):
    ...

# Also possible:
function = (foo, bar)
```

When writing a parser, you can take one of two paths: either you try to be
forgiving and figure out what the user intended, or you are strict and tell
the user to fix the input.

PyPDF2 gives you the option to be strict or not.

PyPDF2 has three core objects, and all of them have a `strict` parameter:

* [`PdfFileReader`](https://pypdf2.readthedocs.io/en/latest/modules/PdfFileReader.html)
* [`PdfFileWriter`](https://pypdf2.readthedocs.io/en/latest/modules/PdfFileWriter.html)
* [`PdfFileMerger`](https://pypdf2.readthedocs.io/en/latest/modules/PdfFileMerger.html)

Choosing `strict=True` means that PyPDF2 will raise an exception if a PDF does
not follow the specification.

Choosing `strict=False` means that PyPDF2 will try to be forgiving and do
something reasonable, but it will log a warning message.
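
The difference between the two modes can be illustrated with a toy parser.
This is only a sketch of the general pattern, not PyPDF2's actual
implementation: the names `parse_version_header`, `ParseError` and the
`toy_parser` logger are hypothetical and exist only for this example. It
assumes a header line that should start with `%PDF-`, followed by a version
number.

```python
import logging

# Hypothetical logger name, used only in this illustration.
logger = logging.getLogger("toy_parser")


class ParseError(ValueError):
    """Raised in strict mode when the input violates the expected format."""


def parse_version_header(line: str, strict: bool = True) -> str:
    """Return the version from a header line such as '%PDF-1.7'."""
    prefix = "%PDF-"
    if line.startswith(prefix):
        # Well-formed input: both modes behave the same.
        return line[len(prefix):].strip()

    message = f"header {line!r} does not start with {prefix!r}"
    if strict:
        # Strict path: refuse to guess and tell the caller what is wrong.
        raise ParseError(message)

    # Forgiving path: log a warning, then make a best-effort guess by
    # looking for the marker anywhere in the line.
    logger.warning("%s; trying to recover", message)
    index = line.find(prefix)
    if index != -1:
        return line[index + len(prefix):].strip()
    return "unknown"
```

With `strict=False`, a line with leading junk such as `"junk%PDF-1.4"` is
still parsed, at the cost of a warning and a guess that may be wrong. With
`strict=True`, the same input raises `ParseError`.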