Don't write an empty PDF file when downloading of a linked test file fails #7947

Snuffleupagus · 2017-01-11T13:30:43Z

Currently test/downloadutils.js will write out an empty file when downloading of a linked file fails. The consequence of this is that since the PDF file now "exists", no further attempts to download it will be made unless the empty PDF is removed.
This is especially annoying when it happens on the bots, since it requires that someone logs in and manually removes the empty PDF file.
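To illustrate the idea, here is a minimal sketch of writing the file only when the download succeeded, so that an existence check will retry on the next run. The helper names (`saveDownloadResult`, `needsDownload`) and the shape of `result` are assumptions made for this example; this is not the actual code in test/downloadutils.js.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper, not the real test/downloadutils.js implementation.
// `result` stands in for the outcome of one download attempt:
//   { error: Error | null, data: Buffer | null }
function saveDownloadResult(destPath, result) {
  if (result.error || !result.data || result.data.length === 0) {
    // Leave no file behind, so the next run will try the download again.
    return false;
  }
  fs.writeFileSync(destPath, result.data);
  return true;
}

// Assumed existence check: a file is only downloaded if it is not present.
function needsDownload(destPath) {
  return !fs.existsSync(destPath);
}

// Usage: a failed attempt leaves the file missing, a later success writes it.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "linked-pdf-"));
const dest = path.join(dir, "example.pdf");

const failed = saveDownloadResult(dest, { error: new Error("404"), data: null });
const stillMissing = needsDownload(dest);

const saved = saveDownloadResult(dest, {
  error: null,
  data: Buffer.from("%PDF-1.4"),
});
const nowPresent = !needsDownload(dest);
```

With this shape, a failed download never makes the file "exist", so nobody has to remove an empty PDF by hand before a retry can happen.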

I'm really not sure why it was implemented this way to begin with. However, given that PDFJS.getDocument fails with MissingPDFException if a file is not present, I cannot see a problem with this change.
Furthermore, the patch also changes the checkRefTestResults in test/test.js such that we treat the "downloading of linked test file failed" case as a failure, so that e.g. the bots won't report "success" when a test file failed to download.
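As a rough sketch of the second part, counting a failed download as a test failure could look like the following. The `countFailures` function and the `downloadFailed`/`passed` fields are hypothetical and only illustrate the idea; the real `checkRefTestResults` in test/test.js has a different shape.

```javascript
// Hypothetical result records; the real structures in test/test.js differ.
// A task whose linked file could not be downloaded is counted as a failure
// instead of being silently ignored.
function countFailures(results) {
  let failures = 0;
  for (const result of results) {
    if (result.downloadFailed || !result.passed) {
      failures++;
    }
  }
  return failures;
}

// Usage: one passing test, one rendering failure, one failed download.
const sampleResults = [
  { id: "ok", passed: true, downloadFailed: false },
  { id: "render-diff", passed: false, downloadFailed: false },
  { id: "missing-link", passed: false, downloadFailed: true },
];
const failureCount = countFailures(sampleResults);
const runSucceeded = failureCount === 0;
```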


yurydelendik · 2017-01-11T14:01:22Z

Given the nature of linked files, saving an empty or error file was done to allow a beginner contributor to at least continue with some subset of the tests. I would like to preserve this behavior in some form, even a relaxed one. We don't want to discourage people from using the tests. For the server we can enforce failures for non-downloaded files.

yurydelendik · 2017-01-11T14:05:39Z

> This is especially annoying when it happens on the bots, since it requires that someone logs in and manually removes the empty PDF file.

We need to disable these lines for 'test'/'unittest' and keep them only for 'makeref'.

https://github.com/mozilla/botio-files-pdfjs/blob/master/on_cmd_test.js#L80-L86

Snuffleupagus · 2017-01-11T14:23:06Z

> Given the nature of linked files, saving an empty or error file was done to allow a beginner contributor to at least continue with some subset of the tests. I would like to preserve this behavior in some form, even a relaxed one. We don't want to discourage people from using the tests. For the server we can enforce failures for non-downloaded files.

I can certainly revert the changes to test/test.js, but I still don't see the point of writing an empty file.
Am I missing something obvious here?

Consider the way that things currently work: when one of these empty files is loaded, it will trigger this assert: https://github.com/mozilla/pdf.js/blob/master/src/core/document.js#L385 and the loadingTask will thus be rejected with UnknownErrorException.
With this patch, the loadingTask will instead be rejected with MissingPDFException.

So, without the change to test/test.js, we'd still get similar behavior here but with the added bonus of not being perpetually stuck with an empty test file (this ought to be helpful not just on the bots, but for people running tests locally as well)!

yurydelendik · 2017-01-11T14:34:12Z

> I can certainly revert the changes to test/test.js, but I still don't see the point of writing an empty file.
> Am I missing something obvious here?

In the past (test.py) it was done to suppress further download attempts on subsequent test.py runs, e.g. if multiple PDF files had failed to download, you could get several minutes of retries before testing would start.

yurydelendik · 2017-01-11T14:38:55Z

Can we split the download of the PDF corpus into a separate task? If PDFs were not downloaded, testing will just skip them with a warning. The download task will never create an error/empty file; every time it runs, it will try to download the missing PDFs and will error if at least one fails.
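A minimal sketch of that split, under stated assumptions: `runDownloadTask`, `selectRunnableTests`, the manifest entry fields, and the injected `exists`/`download`/`warn` callbacks are all hypothetical names chosen for illustration, not part of the existing test harness. The file system and network are replaced by in-memory stand-ins so the logic can be read on its own.

```javascript
// Download task: try every missing file, never create a placeholder file,
// and report an error if at least one download fails.
function runDownloadTask(manifest, exists, download) {
  const failed = [];
  for (const entry of manifest) {
    if (exists(entry.file)) {
      continue; // Already downloaded, nothing to do.
    }
    if (!download(entry)) {
      failed.push(entry.file); // Nothing is written for a failed download.
    }
  }
  return { ok: failed.length === 0, failed };
}

// Test task: skip entries whose file is missing, with a warning.
function selectRunnableTests(manifest, exists, warn) {
  return manifest.filter((entry) => {
    if (exists(entry.file)) {
      return true;
    }
    warn(`Skipping ${entry.id}: ${entry.file} was not downloaded`);
    return false;
  });
}

// Usage with an in-memory stand-in for the file system.
const present = new Set(["a.pdf"]);
const manifest = [
  { id: "a", file: "a.pdf", link: "https://example.invalid/a.pdf" },
  { id: "b", file: "b.pdf", link: "https://example.invalid/b.pdf" },
  { id: "c", file: "c.pdf", link: "https://example.invalid/c.pdf" },
];
const exists = (file) => present.has(file);
// Pretend that only b.pdf can currently be fetched.
const download = (entry) => {
  if (entry.id !== "b") {
    return false;
  }
  present.add(entry.file);
  return true;
};

const downloadResult = runDownloadTask(manifest, exists, download);
const warnings = [];
const runnable = selectRunnableTests(manifest, exists, (msg) => warnings.push(msg));
```

Because the download task leaves no file behind on failure, running it again simply retries whatever is still missing, while the test task can proceed with the subset that is available.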

Snuffleupagus · 2017-01-11T15:34:26Z

> Can we split the download of the PDF corpus into a separate task? If PDFs were not downloaded, testing will just skip them with a warning. The download task will never create an error/empty file; every time it runs, it will try to download the missing PDFs and will error if at least one fails.

Yes, that definitely sounds like the best solution here!
Since this is something that I'm not really affected by myself, I probably won't work on it though (at least not for the foreseeable future, since my time is a bit limited).

Snuffleupagus added the test label Jan 11, 2017

Snuffleupagus closed this Jan 11, 2017

Snuffleupagus deleted the test-file-download-fail branch January 11, 2017 15:34

Snuffleupagus mentioned this pull request Feb 8, 2017

Strict mode for testing on bots #8042

Closed

Snuffleupagus mentioned this pull request Sep 17, 2017

All Internet Archive links, used in reference testing, are broken #8920

Closed
