TypeError: DOMException is not a constructor after updating to 3.5.141 #16255

Closed
lsickert opened this issue Apr 6, 2023 · 8 comments · Fixed by #16279

@lsickert
lsickert commented Apr 6, 2023

Attach (recommended) or Link to PDF file here:
https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf
(Happens with any pdf-file)

Configuration:

  • Web browser and its version: Not a web browser; Node.js 16.20.0
  • Operating system and its version: macOS Ventura 13.3 (but also happens in Alpine Linux Node-16 container)
  • PDF.js version: 3.5.141 (legacy build because of Node)
  • Is a browser extension: No

Steps to reproduce the problem:

  1. Run the following code snippet (the function is taken from my code base; the input parameter should be the URL):
const pdfjslib = require("pdfjs-dist/legacy/build/pdf.js");

module.exports = {
    getPdfText
};

async function getPdfText(input) {
    const loader = await pdfjslib.getDocument(input);
    const doc = await loader.promise;
    const numPages = doc.numPages;

    const docContent = [];

    for (let i = 1; i <= numPages; i++) {
        const page = await doc.getPage(i);
        const content = await page.getTextContent();
        const strings = content.items.map(item => item.str);
        docContent.push(strings.join(" "));
        page.cleanup();
    }
    return docContent;
}
  2. The code fails on the line const doc = await loader.promise; with the following error:
{
        "message": "DOMException is not a constructor",
        "name": "UnknownErrorException",
        "details": "TypeError: DOMException is not a constructor"
}

What is the expected behavior? (add screenshot)
The pdf should get loaded and the text extracted (this worked as expected in the previous version 3.4.120)

What went wrong?
The pdf cannot be loaded, see above for the error message

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension): N/A

@Snuffleupagus
Collaborator

WFM when running your code locally, using Node.js 18.14.2 on Windows 11, against the master branch of the PDF.js project (after using gulp dist-install to build the necessary files).

Does it perhaps work if you update to Node.js version 18 instead? Have you installed all the necessary dependencies?

Please see https://github.com/mozilla/pdf.js/blob/master/.github/CONTRIBUTING.md (emphasis mine):

If you are developing a custom solution, first check the examples at https://github.com/mozilla/pdf.js#learning and search existing issues. If this does not help, please prepare a short well-documented example that demonstrates the problem and make it accessible online on your website, JS Bin, GitHub, etc. before opening a new issue or contacting us in the Matrix room -- keep in mind that just code snippets won't help us troubleshoot the problem.

@lsickert
Author

lsickert commented Apr 9, 2023

I did install all necessary requirements as far as I am aware since the code runs without issues in the previous version of pdfjs. I installed it through the pdfjs-dist package and manually had to install canvas + the dependencies specified by it for macOS and Alpine Linux.

The code is adapted from this node example, mainly swapping out the promises for async/await syntax. Sharing more of my code than this snippet is unfortunately not really possible, since it is a commercial project. But this part of the code is relatively self-contained: it will be a new module that accepts either PDF files or (as in the example above) a URL as input and returns post-processed text content.

I have not tested updating to Node 18 yet but will try it out. Currently I am on vacation, so I will only get to it in 2 weeks unfortunately.

@Snuffleupagus
Collaborator

Please note that DOMException is not thrown (or used) anywhere within the PDF.js code-base, see https://github.com/search?q=repo%3Amozilla%2Fpdf.js+DOMException&type=code, hence it really seems that it must originate in a polyfill.
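
One way to check the polyfill hypothesis in an affected environment is a small diagnostic like the one below. It is only a sketch: describeDOMException is a hypothetical helper, not part of PDF.js, and it reports what the DOMException global looks like at the moment it runs. As far as I know, DOMException is a built-in global only from Node.js 17 onward, so on Node 16 whatever shows up here would have to come from a polyfill or a dependency.

```javascript
// Hypothetical diagnostic helper (not part of PDF.js).
// Reports whether `scope.DOMException` exists and whether it can be used with `new`.
function describeDOMException(scope = globalThis) {
  const value = scope.DOMException;
  let constructible = false;
  if (typeof value === "function") {
    try {
      // Reflect.construct throws a TypeError when its third argument is not a
      // constructor. `value` itself is never invoked, so there are no side effects.
      Reflect.construct(Object, [], value);
      constructible = true;
    } catch {
      constructible = false;
    }
  }
  return { type: typeof value, constructible };
}

// Print the Node.js version next to the result for the current process.
console.log(process.version, describeDOMException());
```

Running it once before and once after requiring pdfjs-dist/legacy/build/pdf.js would show whether loading the library changes the global.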

Without being able to test your exact code/setup, in the form of a reduced and directly runnable test-case, it's probably going to be very difficult for anyone to help unfortunately.

> I have not tested updating to Node 18 yet but will try it out. Currently I am on vacation, so I will only get to it in 2 weeks unfortunately.

Let's close this for the time being, since we try to avoid leaving non-actionable issues open.

Snuffleupagus closed this as not planned on Apr 9, 2023

@kriddile

I am experiencing the same error after updating to 3.5.141. Downgrading to 3.4.120 removed the error. I am using pdfjs-dist via this library on the official Node.js 16.20.0 Alpine 3.17 Docker image.

@simonhaenisch
simonhaenisch commented Apr 10, 2023

On the Dependabot PR simonhaenisch/md-to-pdf#188 with 3.5.141 the tests fail consistently on Node 16 on Windows, Linux and macOS, whereas they pass on all platforms with Node 18.
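
Until a fixed release is available, a project could fail fast on the combination reported in this thread instead of hitting the error at load time. The following is a minimal sketch under that assumption; nodeMajor and isReportedFailingCombo are hypothetical helper names, and the only version pair encoded is the one reported here (pdfjs-dist 3.5.141 on Node 16).

```javascript
// Hypothetical helper: extracts the major version from strings such as
// "v16.20.0" (process.version) or "16.20.0" (process.versions.node).
function nodeMajor(version) {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) {
    throw new TypeError(`Unrecognized Node.js version string: ${version}`);
  }
  return Number(match[1]);
}

// True only for the combination reported as failing in this thread.
function isReportedFailingCombo(pdfjsVersion, nodeVersion) {
  return pdfjsVersion === "3.5.141" && nodeMajor(nodeVersion) === 16;
}

// Example check against the running Node.js version.
if (isReportedFailingCombo("3.5.141", process.version)) {
  console.warn(
    "pdfjs-dist 3.5.141 is reported to fail on Node 16; " +
      "consider pinning 3.4.120 or running on Node 18."
  );
}
```

In a real project, the first argument would presumably come from the installed package (for example its package.json version field) rather than a hard-coded string.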

@Snuffleupagus I checked the changelog, and there's a note about setting Node 16 as the minimum version, which came from your PR #16123, but clearly this release is not fully compatible with Node 16.

Your note that the PDF.js code base doesn't use DOMException doesn't mean this is not actionable... it could be one of your direct dependencies, or an update to one of them, that's causing this.

Please feel free to use the relatively simple test cases for reproduction with Node 16: https://github.com/simonhaenisch/md-to-pdf/blob/master/src/test/api.spec.ts#L82-L112 (implementation of getPdfTextContent is just a few lines in L8-17).

Please reopen the issue 🙏

@Zirafnik

I experienced the same problem.

I just wanted to do this: https://github.com/mozilla/pdf.js/blob/master/examples/node/getinfo.js

Node: v16.10.0
OS: WSL2

Downgrading to 3.4.120 worked.

@simonhaenisch

@Zirafnik not sure you saw but it already got fixed so you just need to wait for the next version.

@Zirafnik

@simonhaenisch Ah, my bad, I definitely did not notice that.
