How to cancel or destroy a getPage request with disableAutoFetch set #11453

arelaxend · 2019-12-27T19:11:56Z

Dear pdf.js contributors,

With disableAutoFetch set, is there a way to cancel the fetching triggered by getPage(), the same way one can call destroy() on the loading task returned by getDocument()?

It looks like it is possible but I found only internal functions.

Best, A.

Snuffleupagus · 2019-12-27T19:22:01Z

With disableAutoFetch set, is there a way to cancel fetching on getPage() ?

Huh, calling getPage is what causes data to be requested (there's no cancelling involved); it's quite frankly difficult to understand what you're trying to ask here.

arelaxend · 2019-12-27T19:39:43Z

it's quite frankly difficult to understand what you're trying to ask here.

Oops. With disableAutoFetch and disableStreaming both off, calling getDocument starts fetching the entire file. One can cancel that fetch by calling destroy() on the loading task that getDocument returns, which stops the GET request.

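const task = pdfjs.getDocument(...);
...
if (task !== undefined) {
  // destroy() aborts the pending network request and closes the document.
  // (No `delete task` here: `delete` does not apply to a `const` binding.)
  await task.destroy();
}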

With disableAutoFetch set, fetching occurs just after getPage, but there is no destroy() to cancel the GET 206 range request if one wants to. For example, one may need to cancel pages that are still being fetched because the user has moved to other pages before the fetch completed.

Still, it looks like it is possible to cancel the _transport, but if one does that it is going to cancel all future requests.

pdf.js/src/display/api.js

Lines 423 to 424 in c3a1c67

const transportDestroyed = !this._transport ? Promise.resolve() :
  this._transport.destroy();

calling getPage is what causes data to be requested (there's no cancelling involved)

Absolutely, the best way is not to call getPage if one should not. Still, this is not the point here 💯 and it is also better not to call getDocument if one should not.

Is the following a workaround ?

pdf.js/src/display/transport_stream.js

Lines 131 to 140 in c3a1c67

cancelAllRequests(reason) {
  if (this._fullRequestReader) {
    this._fullRequestReader.cancel(reason);
  }
  const readers = this._rangeReaders.slice(0);
  readers.forEach(function(rangeReader) {
    rangeReader.cancel(reason);
  });
  this._pdfDataRangeTransport.abort();
}

Snuffleupagus · 2019-12-27T19:58:31Z

[...] but there is no destroy() to cancel the GET 206 request in case one wants to.

There's no way of doing what you're asking, short of destroying the loadingTask itself (and thus closing the entire document).

For example, one requires to cancel some pages currently being fetched because the user moves to other pages before the previous pages were fetched.

First of all, note that there's a couple of different ways that data could be loaded (using Fetch, XMLHttpRequest, or a PDFDataRangeTransport implementation). Secondly, there's generally speaking nothing that says that different pages wouldn't need data from the same byte range (and aborting a request could thus break other getPage calls).

Hence what you're asking for isn't possible, nor will it be supported either unfortunately (as outlined above, and the use-case seems fairly specialized anyway).

arelaxend · 2019-12-27T20:04:01Z

Secondly, there's generally speaking nothing that says that different pages wouldn't need data from the same byte range (and aborting a request could thus break other getPage calls).

OK. In my use case, I fetch, say, pages [current - 1, current, current + 1] whenever the user moves to a page. If the user moves quickly to another page, I am going to call cancelAllRequests().

pdf.js/src/display/transport_stream.js

Lines 131 to 139 in c3a1c67

cancelAllRequests(reason) {
  if (this._fullRequestReader) {
    this._fullRequestReader.cancel(reason);
  }
  const readers = this._rangeReaders.slice(0);
  readers.forEach(function(rangeReader) {
    rangeReader.cancel(reason);
  });
  this._pdfDataRangeTransport.abort();

Then I wait until all the requests are cancelled, and fetch the pages around the new current page.
My question is: is cancelAllRequests() the best option for such a scenario?

First of all, note that there's a couple of different ways that data could be loaded (using Fetch, XMLHttpRequest, or a PDFDataRangeTransport implementation).

Snuffleupagus · 2019-12-27T20:15:06Z

I am going to cancelAllRequests().

As explained in #11453 (comment) that will easily lead to all kinds of breakage, and isn't something that you should be calling manually (it's being used from WorkerTransport.destroy).

I am currently using PDFDataRangeTransport implementation for range requests.

Please note that the default range request functionality in PDF.js isn't in any way connected with PDFDataRangeTransport, so unless you're using the API along the lines below then you're not actually using PDFDataRangeTransport.

const loadingTask = getDocument({
  range: /* custom PDFDataRangeTransport here */,
  //  more parameters here
});

arelaxend · 2019-12-27T20:23:28Z

OK. I am going to delay each call with setTimeout(() => getPage(), 200) and clearTimeout() any pending timers when the user moves on, in reference to your first comment:

calling getPage is what causes data to be requested (there's no cancelling involved)
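The timer idea above can be sketched roughly as follows. This is a minimal sketch, not something from the PDF.js API: `createDebouncedPageLoader`, `onPage`, and `delayMs` are hypothetical names, and `pdfDocument` is assumed to be the object that the loading task's promise resolves to (anything with a `getPage(pageNumber)` method returning a promise). It only avoids starting a request; once the timer has fired and getPage has been called, the request still cannot be cancelled.

```javascript
// Hypothetical helper: calls getPage() only once the user has stayed on a
// page for `delayMs` milliseconds. It does NOT abort a request that has
// already started.
function createDebouncedPageLoader(pdfDocument, onPage, delayMs = 200) {
  let timer = null;
  return {
    // Schedule a page load, replacing any load that is still waiting.
    request(pageNumber) {
      clearTimeout(timer);
      timer = setTimeout(() => {
        timer = null;
        pdfDocument
          .getPage(pageNumber)
          .then(onPage, (err) => console.error("getPage failed:", err));
      }, delayMs);
    },
    // Drop the pending load, if any (e.g. when the viewer is closed).
    cancel() {
      clearTimeout(timer);
      timer = null;
    },
  };
}
```

With this shape, a viewer would call `loader.request(n)` on every page change, so that quick navigation through several pages results in a single getPage call for the page the user finally lands on.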

What is the purpose of PDFDataRangeTransport? Is it meant for extending the range capabilities? I found no examples or use cases out there.

Thank you for all your tips @Snuffleupagus 👍

Snuffleupagus · 2019-12-27T21:16:56Z

What is the purpose of PDFDataRangeTransport ?

It allows completely custom data delivery, which you can implement in whatever way you want/need in your case (it's being used in the PDF Viewer that's built into the Firefox browser).

While it does allow a great deal of flexibility, it's consequently a fair bit more complex than just providing a URL when calling getDocument :-)
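To give a rough idea of what "custom data delivery" can look like, here is a minimal sketch of a transport that serves range requests over HTTP. It assumes that a subclass overrides `requestDataRange(begin, end)` (with `end` exclusive) and hands the bytes back through `onDataRange(begin, chunk)`. The names `createHttpRangeTransport`, `url`, `length`, and `fetchFn` are hypothetical, and the base class is passed in as a parameter so the sketch does not depend on how PDF.js is loaded. Error handling and `abort()` are left out.

```javascript
// Sketch only. `PDFDataRangeTransport` is expected to be the class exported
// by PDF.js; `length` is the total size of the PDF file in bytes, which the
// caller must already know.
function createHttpRangeTransport(PDFDataRangeTransport, url, length, fetchFn = fetch) {
  class HttpRangeTransport extends PDFDataRangeTransport {
    // Called by PDF.js whenever it needs the bytes in [begin, end).
    requestDataRange(begin, end) {
      // HTTP Range headers are inclusive on both ends, hence `end - 1`.
      fetchFn(url, { headers: { Range: `bytes=${begin}-${end - 1}` } })
        .then((response) => response.arrayBuffer())
        .then((buffer) => this.onDataRange(begin, new Uint8Array(buffer)));
    }
  }
  // No initial data is supplied, so every byte is requested on demand.
  return new HttpRangeTransport(length, null);
}
```

The returned object would then be passed as the `range` parameter of getDocument, as in the snippet earlier in this comment.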

I found no examples or use cases out there

There's the API unit-tests and also the default viewer usages here, here, here and finally here and here.

timvandermeij · 2019-12-27T23:04:52Z

Closing as answered by the comments above.

legistek · 2023-01-17T19:25:04Z

I had a similar issue with very large non-linearized PDFs where getPage was taking a very long time for later pages because most if not the whole PDF had to be downloaded. Specifically if the user closed the document but not the app it would nonetheless keep downloading. For this PDFDocumentLoadingTask.destroy works well, thank you for the advice @Snuffleupagus!

I would quickly point out that it would be nice to be able to cancel an individual page load too though because, of course, page loading can sometimes take WAY longer than rendering. Hunting around the code it seems like during the while loop here it'd be pretty straightforward to add a simple cancellation check on each iteration, using something akin to an optional CancellationToken that could be passed into getPage and getPageDict. Would that not do the trick and would you entertain a PR that did that?
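Until something like that exists inside the library, a similar effect can be approximated at the application level. The sketch below is not the proposed change to getPage/getPageDict: it is a hypothetical wrapper (`getPageWithSignal`) around any object with a `getPage(pageNumber)` method, using a standard AbortSignal. Aborting makes the caller stop waiting, but the underlying download and parsing continue in the background.

```javascript
// Hypothetical wrapper: rejects as soon as `signal` is aborted. It does NOT
// stop PDF.js from fetching or parsing; it only lets the caller move on.
function getPageWithSignal(pdfDocument, pageNumber, signal) {
  return new Promise((resolve, reject) => {
    if (signal.aborted) {
      // Already cancelled: do not even start the page load.
      reject(new Error("getPage aborted"));
      return;
    }
    const onAbort = () => reject(new Error("getPage aborted"));
    signal.addEventListener("abort", onAbort, { once: true });
    pdfDocument.getPage(pageNumber).then(
      (page) => {
        signal.removeEventListener("abort", onAbort);
        resolve(page); // no effect if the promise was already rejected
      },
      (err) => {
        signal.removeEventListener("abort", onAbort);
        reject(err);
      }
    );
  });
}
```

A caller would create one AbortController per navigation, pass `controller.signal` to each page load, and call `controller.abort()` when the user moves elsewhere.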

timvandermeij closed this as completed Dec 27, 2019

Snuffleupagus mentioned this issue Feb 24, 2020

Browser caching does not work for the first range request on Chrome and Safari #11624

Closed

Snuffleupagus mentioned this issue Mar 9, 2020

Using Range Requests - Do the first call as a a HEAD call #11669

Closed

Snuffleupagus mentioned this issue Aug 22, 2023

BUG | fixing cache "not modified" response #16859

Closed
