Attachments fail PDF/A validation #2052

Closed
kesara opened this issue Jan 31, 2024 · 3 comments
kesara commented Jan 31, 2024

This bug is in WeasyPrint's experimental PDF/A-3B support.

Attachments declared in the HTML fail PDF/A-3B validation, for multiple reasons:

  • Missing MIME type.
  • Missing relationship information.
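The two reasons above correspond to the three clause 6.8 checks that appear in the veraPDF report further down (`AFRelationship != null`, a `Subtype` that looks like a MIME type, and `isAssociatedFile == true`). As a rough illustration only, the sketch below models a file specification as a plain Python dict and applies those three checks. The dict layout, the `check_filespec` helper, and the `"Source"` relationship value are assumptions made for this example; they are not WeasyPrint's internal structures or veraPDF's implementation.

```python
import re

# Pattern adapted from the veraPDF test expression for the Subtype key.
# re.ASCII keeps \w limited to ASCII, as in the JavaScript original.
MIME_RE = re.compile(r"[-\w+.]+/[-\w+.]+", re.ASCII)


def check_filespec(filespec, associated_files):
    """Return the problems found in one (hypothetical) file specification dict.

    `associated_files` stands in for the AF array of the document or of one
    of its parts: the list of file specifications it refers to.
    """
    problems = []

    # Clause 6.8, test 3: the AFRelationship key must be present.
    if filespec.get("AFRelationship") is None:
        problems.append("missing AFRelationship")

    # Clause 6.8, test 1: the embedded file needs a MIME type as Subtype.
    subtype = filespec.get("EF", {}).get("Subtype")
    if subtype is None or MIME_RE.fullmatch(subtype) is None:
        problems.append("missing or invalid MIME type")

    # Clause 6.8, test 4: the file must be associated with the document
    # or one of its parts, modeled here as membership in the AF array.
    if not any(entry is filespec for entry in associated_files):
        problems.append("not associated with the document")

    return problems


# A file specification shaped like the failing one in this issue.
broken = {"F": "foobar.xml", "EF": {}}

# The same attachment with the missing information filled in.
fixed = {
    "F": "foobar.xml",
    "AFRelationship": "Source",  # illustrative value
    "EF": {"Subtype": "application/xml"},
}

print(check_filespec(broken, []))
print(check_filespec(fixed, [fixed]))
```

The first call reports all three problems and the second reports none, which mirrors the difference between the failing output in this issue and a compliant file.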

Command:

weasyprint --pdf-identifier foobar --pdf-variant "pdf/a-3b" foobar.html foobar.pdf

Test HTML:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Foobar</title>
    <link href="foobar.xml" rel="attachment" type="application/xml" title="source" />
</head>
<body>
    <h1>Hello, World!</h1>
</body>
</html>

foobar.xml:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <foobar />
</root>

veraPDF output:

<?xml version="1.0" encoding="utf-8"?>
<report>
  <buildInformation>
    <releaseDetails id="core" version="1.24.1" buildDate="2023-06-22T10:38:00Z"></releaseDetails>
    <releaseDetails id="validation-model" version="1.24.1" buildDate="2023-06-22T11:37:00Z"></releaseDetails>
    <releaseDetails id="gui" version="1.24.1" buildDate="2023-06-22T14:19:00Z"></releaseDetails>
  </buildInformation>
  <jobs>
    <job>
      <item size="6402">
        <name>/docs/tmp/foobar.pdf</name>
      </item>
      <validationReport jobEndStatus="normal" profileName="PDF/A-3B validation profile" statement="PDF file is not compliant with Validation Profile requirements." isCompliant="false">
        <details passedRules="141" failedRules="3" passedChecks="529" failedChecks="6">
          <rule specification="ISO 19005-3:2012" clause="6.8" testNumber="3" status="failed" passedChecks="0" failedChecks="2">
            <description>In order to enable identification of the relationship between the file specification dictionary and the content that is referring to it, a new (required) key has been defined and its presence (in the dictionary) is required.</description>
            <object>CosFileSpecification</object>
            <test>AFRelationship != null</test>
            <check status="failed">
              <context>root/EmbeddedFiles[0]</context>
              <errorMessage>The file specification dictionary for an embedded file does not contain the AFRelationship key</errorMessage>
            </check>
            <check status="failed">
              <context>root/indirectObjects[11](10 0)/directObject[0]</context>
              <errorMessage>The file specification dictionary for an embedded file does not contain the AFRelationship key</errorMessage>
            </check>
          </rule>
          <rule specification="ISO 19005-3:2012" clause="6.8" testNumber="1" status="failed" passedChecks="0" failedChecks="2">
            <description>The MIME type of an embedded file, or a subset of a file, shall be specified using the Subtype key of the file specification dictionary. If the MIME type is not known, the "application/octet-stream" shall be used.</description>
            <object>EmbeddedFile</object>
            <test>Subtype != null &amp;&amp; /^[-\w+\.]+\/[-\w+\.]+$/.test(Subtype)</test>
            <check status="failed">
              <context>root/EmbeddedFiles[0]/EF[0]</context>
              <errorMessage>MIME type null of an embedded file is missing or invalid</errorMessage>
            </check>
            <check status="failed">
              <context>root/indirectObjects[11](10 0)/directObject[0]/EF[0]</context>
              <errorMessage>MIME type null of an embedded file is missing or invalid</errorMessage>
            </check>
          </rule>
          <rule specification="ISO 19005-3:2012" clause="6.8" testNumber="4" status="failed" passedChecks="0" failedChecks="2">
            <description>The additional information provided for associated files as well as the usage requirements for associated files indicate the relationship between the embedded file and the PDF document or the part of the PDF document with which it is associated.</description>
            <object>CosFileSpecification</object>
            <test>isAssociatedFile == true</test>
            <check status="failed">
              <context>root/EmbeddedFiles[0]</context>
              <errorMessage>The file specification dictionary for an embedded file is not associated with the PDF document or any of its parts</errorMessage>
            </check>
            <check status="failed">
              <context>root/indirectObjects[11](10 0)/directObject[0]</context>
              <errorMessage>The file specification dictionary for an embedded file is not associated with the PDF document or any of its parts</errorMessage>
            </check>
          </rule>
        </details>
      </validationReport>
      <duration start="1706690170003" finish="1706690170736">00:00:00.733</duration>
    </job>
  </jobs>
  <batchSummary totalJobs="1" failedToParse="0" encrypted="0" outOfMemory="0" veraExceptions="0">
    <validationReports compliant="0" nonCompliant="1" failedJobs="0">1</validationReports>
    <featureReports failedJobs="0">0</featureReports>
    <repairReports failedJobs="0">0</repairReports>
    <duration start="1706690169883" finish="1706690170819">00:00:00.936</duration>
  </batchSummary>
</report>
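
A report like this is easier to compare between runs when it is reduced to the failed rules. The following is a minimal sketch using only Python's standard library; the `failed_rules` and `is_compliant` helpers and the trimmed `SAMPLE_REPORT` are made up for this example, and the element and attribute names are taken from the report above rather than from any veraPDF schema documentation.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the report above (two of the three failed rules).
SAMPLE_REPORT = """
<report>
  <jobs>
    <job>
      <validationReport profileName="PDF/A-3B validation profile" isCompliant="false">
        <details passedRules="141" failedRules="3" passedChecks="529" failedChecks="6">
          <rule specification="ISO 19005-3:2012" clause="6.8" testNumber="3" status="failed">
            <test>AFRelationship != null</test>
            <check status="failed">
              <context>root/EmbeddedFiles[0]</context>
            </check>
          </rule>
          <rule specification="ISO 19005-3:2012" clause="6.8" testNumber="4" status="failed">
            <test>isAssociatedFile == true</test>
            <check status="failed">
              <context>root/EmbeddedFiles[0]</context>
            </check>
          </rule>
        </details>
      </validationReport>
    </job>
  </jobs>
</report>
"""


def is_compliant(report_xml):
    """True only if every validationReport in the report says isCompliant="true"."""
    root = ET.fromstring(report_xml)
    reports = root.findall(".//validationReport")
    return bool(reports) and all(r.get("isCompliant") == "true" for r in reports)


def failed_rules(report_xml):
    """Return (clause, test number, test expression, failing contexts) per failed rule."""
    root = ET.fromstring(report_xml)
    results = []
    for rule in root.iter("rule"):
        if rule.get("status") != "failed":
            continue
        contexts = [
            check.findtext("context")
            for check in rule.findall("check")
            if check.get("status") == "failed"
        ]
        results.append(
            (rule.get("clause"), int(rule.get("testNumber")), rule.findtext("test"), contexts)
        )
    return results


print("compliant:", is_compliant(SAMPLE_REPORT))
for clause, number, test, contexts in failed_rules(SAMPLE_REPORT):
    print(f"clause {clause}, test {number}: {test} ({len(contexts)} failed check(s))")
```

To use it on a real report, read the XML file that veraPDF produced and pass its contents to the same two functions.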

liZe commented Jan 31, 2024

Hi!

There’s an open pull request that fixes this (#1869), and we’d like real-life testers to try it before we merge it. Would you be interested in testing the fix (and the overall PDF/A quality)?


kesara commented Jan 31, 2024

@liZe I've tested #1869 with the test case above and with a PDF of an RFC. It generates PDF/A-3B compliant documents in both cases.

@liZe liZe added this to the 61.0 milestone Feb 2, 2024
@liZe liZe added the bug Existing features not working as expected label Feb 2, 2024

liZe commented Feb 8, 2024

Fixed by #1869.

@liZe liZe closed this as completed Feb 8, 2024