Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support, in Dict.merge, for merging of "sub"-dictionaries #12296

Merged
merged 1 commit into from
Aug 31, 2020

Conversation

Snuffleupagus
Copy link
Collaborator

@Snuffleupagus Snuffleupagus commented Aug 27, 2020

This allows for merging of dictionaries one level deeper than previously. This could be useful e.g. for /Resources dictionaries, where you want to e.g. merge their respective /Font dictionaries (and other) together rather than picking just the first one.

@Snuffleupagus Snuffleupagus force-pushed the mergeSubDicts branch 2 times, most recently from df5a131 to 66e7e70 Compare August 28, 2020 07:31
params.acroForm.get("DR") ||
Dict.empty;

this.fieldResources = Dict.merge({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I'm fine with this change but when saving a form, we copy the resources dictionary:
https://github.com/mozilla/pdf.js/blob/master/src/core/annotation.js#L1040
So at the end it'll increase the file size uselessly, especially if the pdf is well formated.
That lets me think that we don't need to add fieldResources when this one is coming from Acroform...
To avoid that, my idea, when fixing #12294, was to have a kind of lazy merge:
just put in the dict what we really need.

Copy link
Collaborator Author

@Snuffleupagus Snuffleupagus Aug 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So at the end it'll increase the file size uselessly, especially if the pdf is well formated.

Given that the resources are usually Refs, that should hopefully reduce the size-impact in general; however I do agree that avoiding this if at all possible would be preferable.

That lets me think that we don't need to add fieldResources when this one is coming from Acroform...

Assuming that you can detect this case easily, and implement it without mutating any existing Dicts, that could perhaps work (although I'm not entirely sure I understand exactly what you have in mind). Please explore this idea further!

(If you do need to merge Dicts, with sub-resources, this patch provides a correct way of doing so.)

To avoid that, my idea, when fixing #12294, was to have a kind of lazy merge:
just put in the dict what we really need.

The problem was with how it was implemented, since you unfortunately added special-cases (and dare I say hacks) to the general Dict abstraction used throughout the code-base to satisfy just one very specific use-case.
Hence my objections, since that wasn't really maintainable and also would have meant mutating existing Dicts in a way that you absolutely shouldn't do.


I'll remove the src/core/annotation.js changes from this patch, but I think that the better Dict.merge functionality is probably generally useful to have regardless.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So at the end it'll increase the file size uselessly, especially if the pdf is well formated.

Given that the resources are usually Refs, that should hopefully reduce the size-impact in general; however I do agree that avoiding this if at all possible would be preferable.

That lets me think that we don't need to add fieldResources when this one is coming from Acroform...

Assuming that you can detect this case easily, and implement it without mutating any existing Dicts, that could perhaps work (although I'm not entire sure I understand exactly what you have in mind). Please explore this idea further!
I'll file a bug for that: my idea is to reduce the file size when saving.

(If you do need to merge Dicts, with sub-resources, this patch provides a correct way of doing so.)

To avoid that, my idea, when fixing #12294, was to have a kind of lazy merge:
just put in the dict what we really need.

The problem was with how it was implemented, since you unfortunately added special-cases (and dare I say hacks) to the general Dict abstraction used throughout the code-base to satisfy just one very specific use-case.
Hence my objections, since that wasn't really maintainable and also would have meant mutating existing Dicts in a way that you absolutely shouldn't do.

Agreed !
Maybe we could create an object LazyMergedDict (extends Dict) which will keep a list of dicts. When we make a get, we have a look in each dict until we find it and then we add the key+value in the dict.
Wdyt ? do you have other ideas ?

I'll remove the src/core/annotation.js changes from this patch, but I think that the better Dict.merge functionality is probably generally useful to have regardless.

@Snuffleupagus Snuffleupagus force-pushed the mergeSubDicts branch 3 times, most recently from 03e193e to acfb420 Compare August 28, 2020 08:43
@Snuffleupagus
Copy link
Collaborator Author

/botio unittest

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.215.176.217:8877/cc7a46f67a5fa05/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/a24a0da02b7b5e0/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/a24a0da02b7b5e0/output.txt

Total script time: 3.83 mins

  • Unit Tests: FAILED

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://54.215.176.217:8877/cc7a46f67a5fa05/output.txt

Total script time: 4.93 mins

  • Unit Tests: Passed

@Snuffleupagus Snuffleupagus marked this pull request as ready for review August 28, 2020 08:52
@Snuffleupagus
Copy link
Collaborator Author

/botio test

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/9bffb5c01374a63/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.215.176.217:8877/1c25c3098e559f2/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/9bffb5c01374a63/output.txt

Total script time: 26.97 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://54.67.70.0:8877/9bffb5c01374a63/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

From: Bot.io (Windows)


Failed

Full output at http://54.215.176.217:8877/1c25c3098e559f2/output.txt

Total script time: 29.24 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://54.215.176.217:8877/1c25c3098e559f2/reftest-analyzer.html#web=eq.log

src/shared/compatibility.js Outdated Show resolved Hide resolved
src/shared/compatibility.js Outdated Show resolved Hide resolved
@mozilla mozilla deleted a comment Aug 30, 2020
This allows for merging of dictionaries one level deeper than previously. This could be useful e.g. for /Resources dictionaries, where you want to e.g. merge their respective /Font dictionaries (and other) together rather than picking just the first one.
@timvandermeij timvandermeij merged commit 60ffac0 into mozilla:master Aug 31, 2020
@timvandermeij
Copy link
Contributor

Thank you for implementing support for this!

@Snuffleupagus Snuffleupagus deleted the mergeSubDicts branch September 1, 2020 06:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants