Make Firefox extension support PDF navigation via included PDF viewer #13

cipri-tom · 2022-12-29T16:16:24Z

Hi,

As discussed in #4 , we cannot navigate in the PDF.
This is due to the embedded PDF not allowing history:
https://github.com/mozilla/pdf.js/blob/c791e01bfc280fb6abc476dece21c6a88d2340df/web/app.js#L525-L527

They do this on purpose: mozilla/pdf.js#8121

So the only solution is to have our own PDF viewer, and hope that some day Firefox will enable either content_scripts on PDF, or a WebExtension PDF that serves as API, which we could extend.

In this PR I add the original pdf.js repo, and link it to the extension code. Here's how it looks:

arxiv-utils-mod-10fps50.mp4

(this video does not play on firefox 😕 the irony... )

Maintainance

We will need to keep an eye out for new releases of PDF.js and merge them into the repo. I believe this shouldn't be too difficult: pull the newer tag from mozilla, and merge it into arxiv-utils-customisation branch ; then run gulp generic. We can do it by hand the first couple of times, and if it works well, we can add a GitHub action to do it automatically.

Next steps

A part I'm not very happy about is the linking from the dist all the way up to the root project. Maybe we can separate them better?

Finally, one part that I am totally missing is how to build the extension. I think we need to zip certain files, right ? If you have a doc about this, I can add the building scripts (zipping, etc).

What do you think ?

j3soon

Thanks for working on this feature.

If I understand correctly, the default pdf.js deliberately disables history, therefore we cannot use its CDN, and must include our modified pdf.js in arxiv-utils to re-enable history.

I think an additional fork of pdf.js is unnecessary and may be difficult to maintain. We can simplify this by including the official pdf.js repo as a submodule, and store our modifications in a git patch (i.e., diff file). When building the extension, we can run git apply before running gulp generic.

If you are using Linux, the extension is build by simply zipping the entire directory:

zip -r arxiv-utils-firefox.zip ./firefox

The official method for automated build seems to be using https://github.com/mozilla/web-ext, but I haven't look into it yet.

A few additional questions:

I cannot view the video you provided (even on Chrome). Maybe the video failed to upload?

I'm not quite sure what you mean by:

linking from the dist all the way up to the root project.

could you elaborate more on this?

Since we may not update pdf.js timely, I'm slightly concerned about the potential security threat of this approach. Maybe a better way to include this feature is to add an option in the settings page, and require users to explicitly enable the enhanced PDF viewer?

cipri-tom · 2022-12-29T22:58:45Z

Thank you for the timely and in-depth review !

Here are my answers.

the default pdf.js deliberately disables history

when embedded in an iframe, yes. If not embedded, history is there.

therefore we cannot use its CDN

The reason we cannot use the CDN is different: mozilla's "vanilla" implementation does not allow PDFs from foreign origin. Their reasoning is here: mozilla/pdf.js#6916

So we have to modify their app.js: cipri-tom/pdf.js@83fa62f

If you load official dist, there is no easy option to overcome this limitation.

store our modifications in a git patch. When building the extension, we can run git apply

I like this approach ! I also wasn't too happy with modifying their code directly. Nor with having two repositories to maintain. I'll try your suggestion to see how this works.

OK if I add a Makefile to apply the patch, build the dist, and zip the files ? I'll try to not zip the whole pdf.js submodule, just the dist build, even minified.

cannot view the video

Probably due to the encoding, screen recorded on Mac. Just wanted to show what it looks like. Here it is

linking from the dist all the way up to the root project.

Of course you didn't see what I meant, since I forgot to push that change 🤦‍♂️🙈. Sorry.
I meant this: cipri-tom/pdf.js@4ab0f28#diff-4b300cea2df7f3f32d97ec5bd2e71eee41932f0fb52f31516f91dbff0c683409R481 (last modif)

But in addition, also forward linking to the viewer, this page is very deep, and the whole path will show in the address bar.
98c4fae#diff-ec159a73555541cc6f9e5ae23aa3c77adad2a8c0859efcecafc8c945492f4412

A possible solution would be to move the dist folder above pdf.js. Probably a setting or modif in their gulp building process. Or, to interfere as little as possible with their code, to flatten the files a bit in the proposed Makefile that patches, builds (gulp), then moves files, then zip.

P.S. I will attempt these changes tomorrow, or next year ! :) ✨ Happy New Year ! ✨

j3soon · 2022-12-30T13:15:33Z

Thank you for the detailed reply. I can see the GIF on Windows now.

I think writing a Makefile or shell script, and moving the dist folder after build sounds good. This method should work directly on Linux and Mac, and may work on Windows after intalling Cygwin/bash.

Happy New Year to you too. :)

Since we flatten the structure a bit during zipping, we update the corresponding files to point to the correct scripts

cipri-tom · 2023-01-04T13:46:34Z

Hi @j3soon

I've made the updating script. I have tried to follow best practices, and left comments and detailed status messages.

What do you think ?

It seems to work rather well. I was afraid that the patches would not apply well from one release to the next, but initially pdf.js was on 3.1.81, and now it's on v3.2.146, and it works well with both. So I think this method, with the patches, is quite future-proof, since our changes are to stable parts of the code.

However, just in case we need to make new changes, I've added the instructions on how to generate the patch again. Everything resides in update.sh.

The only thing that I'm not content about is that we can no longer load the extension based on the files directly, instead we have to build it after each modification and use the zip file. This is because we now extract the dist folder, as discussed. So the path to the title-managing code is not accurate when the files are not zipped.

I don't expect this to be a big nuissance, since we will probably have few updates to do, besides upgrading pdf.js. However, if the need arises, there are 2 solutions:

use the src="../../../../pdfviewer.js" path. Surprisingly, it works, even though the ../../../ is way above the location of pdfviewer.js. This is probably because the browser normalizes the path and doesn't let you go above the root of the extension, and pdfviwer.js resides in the root
copy the dist contents to the root directory before zipping. This way we can load the extension from disk, but we need to remember to run ./update.sh after the changes

Anyhow, we can decide between 1/2 later on, in case we really need to do serious development for the extension. For now, it is stable

cipri-tom · 2023-01-16T09:09:33Z

@j3soon if you have some time to have a look this week, it would be great to advance / close this :). Then I'll be away for ~ 1 month.

Thank you !

j3soon · 2023-01-16T15:26:31Z

I have been busy lately, I'll review this when I'm available. Thanks.

j3soon · 2023-02-02T13:41:36Z

@cipri-tom This newer version looks much better. However, it seems like we still need to update the pdf.js version once in a while...

In another related repo, you proposed to re-use the official pdf.js page to achieve the same result (j3soon/firefox-enable-pdf-addons#1), which simply redirect to https://mozilla.github.io/pdf.js/web/viewer.html and utilize the ?file=... parameter.

This approach allows us to always use the latest pdf.js.

I wonder if this approach has any drawbacks?

cipri-tom · 2023-02-15T10:51:34Z

Hi @j3soon !

I'm not sure Mozilla would be very happy if all PDFs are using their website. Generates traffic for which they pay, I assume. I like Mozilla and wouldn't want to do this without checking with them it's OK.

But maybe this way they'd take a decision sooner to fix these limitations around PDFs.

I do not think the updating of the PDF is so difficult. Until now it worked butter smooth. Maybe we can even set up a GitHub action so it is done automatically ?

j3soon · 2023-02-15T13:10:56Z

@cipri-tom The viewer provided by Mozilla is hosted on GitHub pages, so the traffic is covered by GitHub, not by Mozilla. The potential concern is hitting the bandwidth limit of GitHub pages, but I doubt that the current ~250 active Firefox users can read that much paper per month.

I prefer to keep things simple for now. We can add a experimental checkbox option in the settings page of arxiv-utils, if the checkbox is checked, all arxiv PDFs will be redirected to https://mozilla.github.io/pdf.js/web/viewer.html.

If this method works well, we may set this behavior as default, and keep an eye on the availability of the PDF viewer on GitHub pages. We may revisit your current implementation (1ffca94) in the future, in case the PDF viewer has other issues or being removed.

j3soon · 2023-02-15T13:22:00Z

@cipri-tom Regarding GitHub Actions: If you have experiences in using GitHub actions to automatically build/publish the extension, it would be very helpful if you can provide some guidance in #14.

Maybe we can work on these feature incrementally:

Add an experimental feature that redirects to https://mozilla.github.io/pdf.js/web/viewer.html.
Set up automated build & publish in Automated build and publish #14
Reconsider whether to switch to our own pdf.js viewer (1ffca94)

and thanks again for the in-depth investigation of this issue!

cipri-tom · 2023-02-15T14:14:31Z

Hi @j3soon !
This seems like a good plan ! Thanks for the thoughtful direction. I will start working on the redirection in the next days, when I get a bit of time.

I will close this for now, and as you say in point 3., we can reconsider later on. Just for reference, the correct commit would be: 0bda6bc because it uses the mozilla's repo, not a duplicate in my account.

Thank you for following through the investigation, it was fun and good learning for me ! :)

Fixes: #4 Related: #13 Co-authored-by: Ciprian Tomoiaga <[email protected]>

cipri-tom added 3 commits December 29, 2022 15:56

added pdf.js submodule

1ffca94

make extension use our own js viewer

98c4fae

update our title script to work with the new viewer, linking both

55f477c

j3soon self-requested a review December 29, 2022 18:01

j3soon requested changes Dec 29, 2022

View reviewed changes

cipri-tom added 6 commits December 30, 2022 18:19

added script for updating code and building from patches

fb3a525

Refactor update.sh so that each function controls its working directory

8de6f98

add zipping and cleaning to the update script

90bf5e8

Since we flatten the structure a bit during zipping, we update the corresponding files to point to the correct scripts

Added instructions for future development of viewer

abb1b85

Updated pdf.js to v3.2.146

b39deb2

switch submodule back to mozilla's repo

0bda6bc

cipri-tom closed this Feb 15, 2023

j3soon added a commit that referenced this pull request Apr 21, 2023

Add experimental config for external PDF viewer

5a624f6

Fixes: #4 Related: #13 Co-authored-by: Ciprian Tomoiaga <[email protected]>

j3soon mentioned this pull request Apr 25, 2023

Issues in Firefox PDF container #4

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make Firefox extension support PDF navigation via included PDF viewer #13

Make Firefox extension support PDF navigation via included PDF viewer #13

cipri-tom commented Dec 29, 2022

j3soon left a comment

cipri-tom commented Dec 29, 2022 •

edited

Loading

j3soon commented Dec 30, 2022

cipri-tom commented Jan 4, 2023

cipri-tom commented Jan 16, 2023

j3soon commented Jan 16, 2023

j3soon commented Feb 2, 2023

cipri-tom commented Feb 15, 2023

j3soon commented Feb 15, 2023

j3soon commented Feb 15, 2023

cipri-tom commented Feb 15, 2023

Make Firefox extension support PDF navigation via included PDF viewer #13

Make Firefox extension support PDF navigation via included PDF viewer #13

Conversation

cipri-tom commented Dec 29, 2022

Maintainance

Next steps

j3soon left a comment

Choose a reason for hiding this comment

cipri-tom commented Dec 29, 2022 • edited Loading

j3soon commented Dec 30, 2022

cipri-tom commented Jan 4, 2023

cipri-tom commented Jan 16, 2023

j3soon commented Jan 16, 2023

j3soon commented Feb 2, 2023

cipri-tom commented Feb 15, 2023

j3soon commented Feb 15, 2023

j3soon commented Feb 15, 2023

cipri-tom commented Feb 15, 2023

cipri-tom commented Dec 29, 2022 •

edited

Loading