Identify hybrid-reference PDFs #19

Open
petervwyatt opened this issue Sep 7, 2022 · 4 comments

@petervwyatt
Member

TestGrammar (C++ PoC) currently reports only on traditional cross-reference tables or cross-reference streams. It should be expanded to also identify hybrid-reference PDFs, even though they are relatively rare:

7.5.8.4 Compatibility with applications that do not support compressed reference streams

A hybrid-reference PDF file is readable by PDF processors designed only to support versions of PDF before PDF 1.5. Such a PDF file contains objects referenced by standard cross-reference tables in addition to objects in object streams that are referenced by cross-reference streams.

petervwyatt self-assigned this Sep 7, 2022
@petervwyatt
Member Author

Thanks @tballison for the question that prompted this!

@petervwyatt
Member Author

Having trouble working out how to do this with all PDF SDKs...

@petervwyatt
Member Author

A hybrid-reference PDF can be identified by the presence of the XRefStm key in the file trailer dictionary, as per Table 19 (and the Note below Table 15).
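
As a rough illustration of the XRefStm check, here is a minimal Python sketch that scans raw file bytes instead of going through a PDF SDK. It is not how TestGrammar, pdfium or PDFix work; the function names (`trailer_dicts`, `xrefstm_offsets`, `is_hybrid_reference`) are hypothetical, and the scan is only a heuristic: it assumes the `trailer` keyword and its dictionary are uncompressed, and it can be fooled by the word `trailer` appearing inside stream data.

```python
import re

# "/XRefStm <integer>" as it would appear inside a classic trailer dictionary.
_XREFSTM = re.compile(rb"/XRefStm\s+(\d+)")


def trailer_dicts(data: bytes):
    """Yield the raw bytes of each dictionary that follows a 'trailer' keyword.

    Heuristic only: nesting is tracked by counting '<<' and '>>' pairs,
    without really tokenising strings or comments.
    """
    pos = 0
    while True:
        pos = data.find(b"trailer", pos)
        if pos < 0:
            return
        start = data.find(b"<<", pos)
        if start < 0:
            return
        depth, i = 0, start
        while i < len(data) - 1:
            pair = data[i:i + 2]
            if pair == b"<<":
                depth += 1
                i += 2
            elif pair == b">>":
                depth -= 1
                i += 2
                if depth == 0:
                    break
            else:
                i += 1
        yield data[start:i]
        pos = i


def xrefstm_offsets(data: bytes) -> list:
    """Return the integer value of every /XRefStm entry found in a trailer."""
    offsets = []
    for trailer in trailer_dicts(data):
        match = _XREFSTM.search(trailer)
        if match:
            offsets.append(int(match.group(1)))
    return offsets


def is_hybrid_reference(data: bytes) -> bool:
    """True if at least one classic trailer dictionary carries /XRefStm."""
    return bool(xrefstm_offsets(data))
```

Usage would be along the lines of `is_hybrid_reference(open(path, "rb").read())`. A real implementation would want to follow the `startxref` / `Prev` chain properly instead of scanning the whole file.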

Fix for Issue #39 means that pdfium will now report:

...
       1:   Trailer (as XRefStream)
Info: unknown key 'XRefStm' is not defined in Arlington for XRefStream in PDF 1.7

PDFix does not report anything currently.

@petervwyatt
Member Author

Example hybrid PDF: https://www.ema.europa.eu/documents/product-information/rapamune-epar-product-information_en.pdf
(has other issues also, just to add to the fun 😁)

petervwyatt added this to the TestGrammar C++ PoC milestone Aug 13, 2023
petervwyatt added the enhancement (New feature or request) label Aug 7, 2024