[api-major] Remove the SINGLE_FILE build target and the PDFJS.disableWorker option #9385
This patch looks really good! Aside from the two minor comments below, I think that we should indeed not advertise disabling workers in the examples.
web/app.js
Outdated
};
(document.getElementsByTagName('head')[0] || document.body).
appendChild(script);
}
Nit: this bracket is misindented.
Whoops, too much copy-pasting of existing code here (and above as well); thank you!
web/app.js
Outdated
script.onload = function() {
  resolve();
};
script.onerror = function () {
Nit: remove the space after function to be consistent with the one above.
Just a question, how am I supposed to test PDF.js without workers in the Chrome extension build target?
web/app.js
Outdated
script.onerror = function() {
  reject(new Error(`Cannot load fake worker at: ${script.src}`));
};
(document.getElementsByTagName('head')[0] || document.body).
This can be simplified to document.head || document.body. document.head is only undefined in IE8 and earlier, and is supported by all other browsers: https://developer.mozilla.org/en-US/docs/Web/API/Document/head
I usually do document.head || document.documentElement (since <head> should exist; if it does not, then document.body may also be empty, but the root element should always exist, so that's a safe fallback).
I was just copying the existing pattern used in https://github.com/mozilla/pdf.js/blob/master/web/app.js#L1515-L1534; should we simply change both of them?
You can do that in a separate commit (to keep the refactoring separate from the actual feature).
I usually do
document.head || document.documentElement
This suggestion has now been implemented (only in the newly added code); thank you!
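For readers following this thread, the pattern under discussion can be sketched roughly as below. This is an illustrative sketch only, not the code from the patch: the `loadScript` name and the `doc` parameter (added so the helper can be exercised outside a browser) are assumptions.

```javascript
// Hypothetical helper, not taken from the PDF.js source: injects a <script>
// element and resolves or rejects once the browser reports the outcome.
// `doc` defaults to the page's document; it is a parameter only so that a
// stand-in object can be supplied in non-browser environments.
function loadScript(src, doc = globalThis.document) {
  return new Promise((resolve, reject) => {
    const script = doc.createElement('script');
    script.src = src;
    script.onload = function() {
      resolve();
    };
    script.onerror = function() {
      reject(new Error(`Cannot load script at: ${script.src}`));
    };
    // <head> should normally exist; the root element is the safe fallback
    // suggested in the review comment above.
    (doc.head || doc.documentElement).appendChild(script);
  });
}
```

In a page this would be used as `loadScript('build/pdf.worker.js').then(...)`, with the path adjusted as needed.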
web/chromecom.js
Outdated
@@ -65,7 +64,9 @@ let ChromeCom = {
file = file.replace(/^drive:/i,
'filesystem:' + location.origin + '/external/');

if (/^filesystem:/.test(file) && !PDFJS.disableWorker) {
const disableWorker = (window.pdfjsDistBuildPdfWorker &&
Under which circumstances is window.pdfjsDistBuildPdfWorker unset? When the worker script is loaded in the main thread, right? Is it possible for the code to reach this place before the worker script has had a chance to initiate?
In either case, please add a brief comment here to explain what this line is doing.
Under which circumstances is window.pdfjsDistBuildPdfWorker unset? When the worker script is loaded in the main thread, right?
It's unset by default, i.e. when workers are being used. Hence it should only ever be defined when the pdf.worker.js file is loaded on the main-thread.
Is it possible for the code to reach this place before the worker script has had a chance to initiate?
That shouldn't happen, since the resolvePDFFile function is being called from
Lines 353 to 355 in fe5102a
ChromeExternalServices.initPassiveLoading = function(callbacks) {
  let { appConfig, overlayManager, } = PDFViewerApplication;
  ChromeCom.resolvePDFFile(appConfig.defaultUrl, overlayManager,
which in turn is called from
Lines 591 to 596 in fe5102a
initPassiveLoading() {
  if (typeof PDFJSDev === 'undefined' ||
      !PDFJSDev.test('FIREFOX || MOZCENTRAL || CHROME')) {
    throw new Error('Not implemented: initPassiveLoading');
  }
  this.externalServices.initPassiveLoading({
which is called from
Lines 1652 to 1654 in fe5102a
webViewerOpenFileViaURL = function webViewerOpenFileViaURL(file) {
  PDFViewerApplication.setTitleUsingUrl(file);
  PDFViewerApplication.initPassiveLoading();
webViewerOpenFileViaURL is then called from
Line 1536 in fe5102a
function webViewerInitialized() {
which is only after the viewer initialization code has run, see
Lines 502 to 504 in fe5102a
run(config) {
  this.initialize(config).then(webViewerInitialized);
},
Finally, note that we always wait on the hash parameters (among other things) to be initialized in PDFViewerApplication.initialize
Lines 166 to 172 in fe5102a
return this._readPreferences().then(() => {
  return this._parseHashParameters();
}).then(() => {
  return this._initializeL10n();
}).then(() => {
  return this._initializeViewerComponents();
}).then(() => {
In either case, please add a brief comment here to explain what this line is doing..
Sure, will do.
Edit: Actually, do we still need this code-path?
This code seems to have been originally added in PR #4598, but the referenced upstream bug http://crbug.com/362061 seems to have been fixed close to 4 years ago.
I'd thus assume that the affected versions of Chrome have not been supported for some time, so can we (in a separate PR) remove this code-path and thus not have to worry about it here?
I try to not break backcompat unless necessary. I should take a look at my telemetry (https://github.com/Rob--W/pdfjs-telemetry) to see whether it is viable to rip out code that supports old Chrome versions (there is lots of other code besides this line).
A comment has now been added here.
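As a rough illustration of the check discussed in this thread: the global is described above as unset while real workers are in use, and defined only once pdf.worker.js has been loaded on the main thread. The helper name and the `scope` parameter below are assumptions made for this sketch; the actual condition in the patch is only partially visible in the diff above and may test more than the presence of the global.

```javascript
// Hypothetical helper, not part of PDF.js. Per the discussion above,
// `pdfjsDistBuildPdfWorker` is only defined on the global object once the
// worker file has been loaded on the main thread (i.e. fake-worker mode).
// `scope` stands in for `window` so the check can run outside a browser.
function isWorkerLoadedOnMainThread(scope = globalThis) {
  return Boolean(scope.pdfjsDistBuildPdfWorker);
}
```

A caller could use the result to skip logic that only makes sense when a real worker is running.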
I don't know exactly how you're currently testing this configuration, so the following might be too simple, but wouldn't it be possible to do something like this: |
I used to use the |
@yurydelendik Since this PR attempts to implement the idea you proposed at https://mozilla.logbot.info/pdfjs/20180102#c14068811-c14068833, do you have time to review (or provide feedback) here? |
I'm an author of React-PDF and I'm concerned about this change. How does it impact users using Webpack, who don't want to use worker? Previously, I just set up disableWorker=true option and pdf.js requested pdf.worker.js via require() and created a fake worker. How that procedure will change? Will I need to require the worker file myself to disable worker? |
@wojtekmaj Your current use case of I guess that it's technically possible to serialize the worker script and then use You should keep the PDF.js worker in a separate file, and assign the location of it to Lines 140 to 148 in b6c57d9
The logic of worker script resolution is here: Lines 41 to 45 in b6c57d9
Lines 1205 to 1218 in b6c57d9
|
Thanks for the information @Rob--W. I actually realize the benefits of service worker here and I discourage my users from using a version without worker, but I do provide them with such ability, as for many beginners setting it up is a huge problem. I struggled a lot with this myself. Also, sometimes bundle size matters. Currently how PDF.js plays with Webpack is a mess. Sadly, unless provided with additional configuration (which I failed to figure out), Webpack includes the pdf.worker.js code twice because two separate loaders are used on it: it's required via worker-loader in the webpack.js entry file, and then again without worker-loader in pdf.js itself. Because separate loaders are used, Webpack is not smart enough to dedupe them. |
The code changes LGTM. The code path for fake workers seems to not be covered by unit tests. Wouldn't it make sense to add a single test that customized |
Given that workers are always disabled in Node.js, we should have implicit fake worker test-coverage thanks to the (subset of) API unit-tests running on Travis: Lines 54 to 55 in fd242ad
Slightly off topic for this PR, but we should see if some of the currently pending API tests could be enabled on Node.js/Travis since we now have node_stream.js available.
While we might be able to add a Node.js/Travis-only explicit test for that, I'm not sure that we'd be able to easily test this when the unit-tests run normally (i.e. in browser) without risk of negatively impacting subsequent tests. Lines 1436 to 1439 in fd242ad
and Lines 1323 to 1325 in fd242ad
Given the above, I'm slightly on the fence about trying to add a fake worker test here! Opinions? |
I am referring to the injected script tag in your new
How about a custom parameter to disable workers in the unit test? It doesn't even need to be a PDF.js API, it could also be something that does |
Sorry, but I really don't understand how this would help. As mentioned above, as soon as |
When I wrote that, I had in mind that it would not be run by default until someone manually edits the URL and adds something like One last thing: The |
Please note that this build target, and the resulting `build/pdf.combined.js` file, is equivalent to setting the `PDFJS.disableWorker` option to `true` which is a performance footgun.
That's a very good catch, thanks for spotting this! |
/botio-linux preview |
From: Bot.io (Linux m4)
Received
Command cmd_preview from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.67.70.0:8877/d51f666d03385e3/output.txt |
From: Bot.io (Linux m4)
Success
Full output at http://54.67.70.0:8877/d51f666d03385e3/output.txt
Total script time: 8.15 mins
Published |
src/display/api.js
Outdated
@@ -1491,6 +1502,14 @@ var PDFWorker = (function PDFWorkerClosure() {
return new PDFWorker(null, port);
};

PDFWorker.getWorkerSrc = function() {
try {
Why try-catch? getWorkerSrc is not expected to throw (unless there is no way to determine the worker source).
While you're correct that it shouldn't normally throw, that could still happen given this code.
Hence it seemed nice to try and avoid surprises for an API consumer that uses this function, not expecting an Error to be thrown. Are you saying that you think this is a completely unwarranted concern, and that we should just return the internal getWorkerSrc value as-is?
Yes, I suggest to not catch the error. workerSrc doesn't make sense in Node.js, and when it does, we have to provide a suitable implementation instead of silently returning nothing.
I've made the requested changes; thanks for the review comments!
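The behaviour agreed on in this thread (return the resolved value, and let the error propagate when no worker source can be determined) might look roughly like the sketch below. The function signature, parameter names, and error message are assumptions for illustration, not the actual implementation in src/display/api.js.

```javascript
// Hypothetical stand-in for the worker-source resolution; names and the
// error text are illustrative only.
function getWorkerSrc(configuredSrc, fallbackSrc) {
  if (configuredSrc) {
    return configuredSrc; // Explicitly set by the user.
  }
  if (fallbackSrc) {
    return fallbackSrc; // Derived fallback, e.g. when the option wasn't set.
  }
  // No try-catch around this: the caller sees the failure instead of
  // silently receiving nothing.
  throw new Error('No workerSrc specified.');
}
```

With this shape, an API consumer that calls the function in an environment where no source can be resolved gets an explicit error rather than an empty value.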
test/unit/api_spec.js
Outdated
let workerSrc = PDFWorker.getWorkerSrc();
expect(typeof workerSrc).toEqual('string');
expect(workerSrc).toEqual(PDFJS.workerSrc);
done();
Since the test is synchronous, remove done(); and done in the function(done){. done is only needed for async tests.
Done :-)
…ns the current `workerSrc` This method returns the currently used `workerSrc`, which thus allows obtaining the fallback `workerSrc` value (e.g. when the option wasn't set by the user).
Despite this patch removing the `disableWorker` option itself, please note that we'll still fall back to loading the worker file(s) on the main-thread when running in environments without proper Web Worker support.

Furthermore it's still possible, even with this patch, to force the use of fake workers by manually loading the necessary file using a `<script>` tag on the main-thread.[1] That way, the functionality of the now removed `SINGLE_FILE` build target and the resulting `build/pdf.combined.js` file can still be achieved simply by adding e.g. `<script src="build/pdf.worker.js"></script>` to the HTML (obviously with the path adjusted as needed).

Finally note that the `disableWorker` option is a performance footgun, and unfortunately many existing third-party examples actually use it without providing any sort of warning/justification.

---
[1] This approach is used in the default viewer, since certain kinds of debugging may be easier if the code is running directly on the main-thread.
…tupFakeWorkerGlobal()` function in the `src/display/api.js` file
@yurydelendik Ping; since this PR implements your idea from https://mozilla.logbot.info/pdfjs/20180102#c14068811-c14068833, did you want to look at it before we land this? |
@Snuffleupagus This PR was released on |
Regarding #9385 (comment): We're currently working towards the release of PDF.js version Since we wanted to be able to refactor/cleanup old code, the major version number was thus increased to signal that there are changes that are, by design, not backwards compatible. In closing, the problem seems to be twofold: Finally, the actual breaking change in |
@Snuffleupagus I don't understand. The breaking change ( |
I think the misunderstanding is that usually on GitHub if something is pre-release it's marked as so, e.g. 2.0.305-beta or something like that. Mozilla just throws pdfjs-dist on npm "as is" with every merge, without e.g. releasing using a tag "next" or similar. So, if you take pdfjs-dist@latest you are actually getting the beta without knowing about it. |
That can probably be attributed to pure luck, given how you're currently using/integrating PDF.js in your application, rather than anything else. |
@wojtekmaj Your analysis is correct, we should improve the documentation. I filed #9440 for that. |
[api-major] Remove the `SINGLE_FILE` build target and the `PDFJS.disableWorker` option
Please refer to the commit messages for additional details.
This PR is an initial attempt at implementing https://mozilla.logbot.info/pdfjs/20180102#c14068811-c14068833; feedback welcome!