-
Notifications
You must be signed in to change notification settings - Fork 10.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't store complex data in PDFDocument.formInfo
, and replace the fields
object with a hasFields
boolean instead
#12483
Don't store complex data in PDFDocument.formInfo
, and replace the fields
object with a hasFields
boolean instead
#12483
Conversation
…fields` object with a `hasFields` boolean instead *This patch is based on a couple of smaller things that I noticed when working on PR 12479.* - Don't store the /Fields on the `formInfo` getter, since that feels like overloading it with unintended (and too complex) data, and utilize a `hasFields` boolean instead. This functionality was originally added in PR 12271, to help determine what kind of form data a PDF document contains, and I think that we should ensure that the return value of `formInfo` only consists of "simple" data. With these changes the `fieldObjects` getter instead has to look-up the /Fields manually, however that shouldn't be a problem since the access is guarded by a `formInfo.hasFields` check which ensures that the data both exists and is valid. Furthermore, most documents doesn't even have any /AcroForm data anyway. - Determine the `hasFields` property *first*, to ensure that it's always correct even if there's errors when checking e.g. the /XFA or /SigFlags entires, since the `fieldObjects` getter depends on it. - Simplify a loop in `fieldObjects`, since the object being accessed is a `Map` and those have built-in iteration support. - Use a higher logging level for errors in the `formInfo` getter, and include the actual error message, since that'd have helped with fixing PR 12479 a lot quicker. - Update the JSDoc comment in `src/display/api.js` to list the return values correctly, and also slightly extend/improve the description.
…ethod If this method is ever passed invalid/unexpected data, or if during the course of parsing (since it's used recursively) such data is found, it will fail in a non-graceful way. Hence this patch, which ensures that we don't attempt to access non-existent properties and also that errors such as the one fixed in PR 12479 wouldn't have occured.
5c8a07b
to
29af15f
Compare
From: Bot.io (Linux m4)ReceivedCommand cmd_unittest from @Snuffleupagus received. Current queue size: 0 Live output at: http://54.67.70.0:8877/7e67a54fdbe408f/output.txt |
From: Bot.io (Windows)ReceivedCommand cmd_unittest from @Snuffleupagus received. Current queue size: 0 Live output at: http://54.215.176.217:8877/c33de2a33d093d1/output.txt |
From: Bot.io (Linux m4)FailedFull output at http://54.67.70.0:8877/7e67a54fdbe408f/output.txt Total script time: 4.20 mins
|
From: Bot.io (Windows)SuccessFull output at http://54.215.176.217:8877/c33de2a33d093d1/output.txt Total script time: 5.68 mins
|
From: Bot.io (Linux m4)ReceivedCommand cmd_unittest from @Snuffleupagus received. Current queue size: 0 Live output at: http://54.67.70.0:8877/324e5d2b0c0ac40/output.txt |
From: Bot.io (Windows)ReceivedCommand cmd_unittest from @Snuffleupagus received. Current queue size: 0 Live output at: http://54.215.176.217:8877/218e81a0d1c189a/output.txt |
From: Bot.io (Linux m4)FailedFull output at http://54.67.70.0:8877/324e5d2b0c0ac40/output.txt Total script time: 4.26 mins
|
From: Bot.io (Windows)SuccessFull output at http://54.215.176.217:8877/218e81a0d1c189a/output.txt Total script time: 5.25 mins
|
Thank you for finetuning this! |
This patch is based on a couple of smaller things that I noticed when working on PR #12479.
Don't store the /Fields on the
formInfo
getter, since that feels like overloading it with unintended (and too complex) data, and utilize ahasFields
boolean instead.This functionality was originally added in PR Improve AcroForm/XFA form type detection #12271, to help determine what kind of form data a PDF document contains, and I think that we should ensure that the return value of
formInfo
only consists of "simple" data.With these changes the
fieldObjects
getter instead has to look-up the /Fields manually, however that shouldn't be a problem since the access is guarded by aformInfo.hasFields
check which ensures that the data both exists and is valid. Furthermore, most documents doesn't even have any /AcroForm data anyway.Determine the
hasFields
property first, to ensure that it's always correct even if there's errors when checking e.g. the /XFA or /SigFlags entires, since thefieldObjects
getter depends on it.Simplify a loop in
fieldObjects
, since the object being accessed is aMap
and those have built-in iteration support.Use a higher logging level for errors in the
formInfo
getter, and include the actual error message, since that'd have helped with fixing PR Fix the "should get form info when AcroForm is present" unit-test #12479 a lot quicker.Update the JSDoc comment in
src/display/api.js
to list the return values correctly, and also slightly extend/improve the description.