Import pdfjs-dist not working correctly #58313
Comments
Note that importing https://github.com/Luluno01/pdfjs-dist-import-reproducer/blob/main/expected.js |
Looks like it has something to do with webpack. I found a temporary workaround, which is using dynamic |
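The comment above is cut off in this copy, so the following is only a sketch, assuming the workaround refers to a dynamic `import()` call. The helper name `lazyModule` is hypothetical and is not part of pdfjs-dist or Next.js. It defers loading until first use and reuses the same promise afterwards.

```typescript
// Hypothetical helper: wraps a loader so the dynamic import runs at most once,
// on first use, instead of when the enclosing module is evaluated.
function lazyModule<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      // Note: a rejected promise is cached too; callers that want retries
      // would need to reset the cache on failure.
      cached = loader();
    }
    return cached;
  };
}

// Usage sketch (commented out so the snippet has no external dependency):
// const loadPdfjs = lazyModule(() => import('pdfjs-dist'));
// const { getDocument } = await loadPdfjs();
```

Whether this avoids the undefined exports depends on the bundler configuration; the thread only reports it as a temporary workaround.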
@Luluno01 I ran into this issue just now, and yeah, the imports show up as undefined for me as well (I'm on version 14) |
@Luluno01 I tried using My code: export async function POST(req: Request, res: Response) { |
@Luluno01 Even though typeof shows that it's an object, an empty array will show up if you try to print |
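To make the symptom described above easier to confirm, here is a small diagnostic sketch. The helper `missingExports` is hypothetical (it is not an API of pdfjs-dist or Next.js); it only reports which expected named exports are absent or undefined on an imported module object.

```typescript
// Hypothetical diagnostic helper: lists the expected export names that are
// missing or undefined on a module namespace object.
function missingExports(
  mod: object | null | undefined,
  expected: readonly string[],
): string[] {
  if (mod == null) return [...expected];
  const record = mod as Record<string, unknown>;
  return expected.filter((name) => record[name] === undefined);
}

// Usage sketch (import commented out so the snippet stays self-contained):
// const pdfjs = await import('pdfjs-dist');
// console.log(Object.keys(pdfjs), missingExports(pdfjs, ['getDocument']));
```

If the bundled module is the empty object reported in this thread, every expected name comes back as missing.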
I even added https://github.com/mozilla/pdfjs-dist manually into my project, same error, something to do with the imports for sure |
I did the same as what you did and got the same result. Then I added a magic comment |
Found this. Not a big fan of webpack but I tried to follow the settings in the example provided. Still no luck. |
I created a brand new node project and everything works, so this is an issue with how next.js/webpack bundle the different modules. |
Interesting thing is I think everything works in the pages router |
You mean so far it ONLY works in pages router? |
Try importing like this: import * as PDFJS from 'pdfjs-dist/build/pdf.min.mjs' |
Interesting, it does make some difference, but results in another error. The result is the same as installing and importing the CommonJS version directly from the repo. While it no longer imports nothing, the library complains:
According to the official example, we should add |
I got the worker error as well. I think import {getDocument} from 'pdfjs-dist' is the officially recommended way? Re webpack splitting, I haven't the slightest clue lol, never really messed around with it before. Really hate to split this PDF processing into its own microservice lol |
Interesting. I'm pretty sure it has everything to do with webpack. But I'm not familiar with webpack stuff... Still struggling to figure out how to configure webpack to make it work with app router. |
Yeah, me too. I ended up reimplementing the PDF processing API endpoint with Cloud Functions, which doesn't use a bundler but directly runs the compiled TypeScript output (or your JS code as-is). Really ugly workaround. |
If I still remember my experiments correctly, |
also tried
Going to have to do the same thing. I think the team at Vercel should also look at other libraries with
Also could you link the doc where |
I think it's also important to clarify that |
I just found that Later I inspected the source code: https://github.com/mozilla/pdfjs-dist/blob/master/build/pdf.js#L2031. it is |
Yes, you are right. And my use case is server-side PDF file processing. |
Okay, I managed to get it to work by adding an ugly hint for webpack: |
Interesting, I can't get ' |
I'll try your workaround when you add the new branch, in the meantime I'm going to see if it works in |
Just add |
Here you are: Luluno01/pdfjs-dist-import-reproducer@82c4439 |
OMG you are a genius!! I added an API endpoint and it also works:
On my end it does give me a warning about a font issue, not sure if it's an import-related issue, but I'm getting my results! Thanks a lot! |
Also for future folks who may stumble on this error message when using another package that depends on |
Same with my reproducer. That's why I moved it to |
I installed the exact versions of |
Not yet. But I might have to use it soon LOL |
yeah I'll have to as well, or at least host a nodejs backend on Vercel, there was a change that |
Also keep in mind that cloud functions has a 100MB limit as well, sadly. why can't there be a semi-decent pdf parsing library out there... So frustrating |
I guess it might have something to do with the transitive dependency |
Cloud Functions has much more relaxed restrictions, as claimed here:
|
@Luluno01 I downgraded next to 13.5.6 and at least langchain's PDFLoader is working? I'm guessing they bundle the PDFLoader in a specific way that the other libraries don't? |
I was testing with |
@Luluno01 So deployed my Node/Express backend on Vercel and got this as well: Unhandled Promise Rejection {"errorType":"Runtime.UnhandledPromiseRejection","errorMessage":"Error: Setting up fake worker failed: "Cannot find module '/var/task/node_modules/pdfjs-dist/build/pdf.worker.mjs' imported from /var/task/node_modules/pdfjs-dist/build/pdf.mjs".","reason":{"errorType":"Error","errorMessage":"Setting up fake worker failed: "Cannot find module '/var/task/node_modules/pdfjs-dist/build/pdf.worker.mjs' imported from /var/task/node_modules/pdfjs-dist/build/pdf.mjs".","stack":["Error: Setting up fake worker failed: "Cannot find module '/var/task/node_modules/pdfjs-dist/build/pdf.worker.mjs' imported from /var/task/node_modules/pdfjs-dist/build/pdf.mjs"."," at file:///var/task/node_modules/pdfjs-dist/build/pdf.mjs:3720:36"," at processTicksAndRejections (node:internal/process/task_queues:95:5)"]},"promise":{},"stack":["Runtime.UnhandledPromiseRejection: Error: Setting up fake worker failed: "Cannot find module '/var/task/node_modules/pdfjs-dist/build/pdf.worker.mjs' imported from /var/task/node_modules/pdfjs-dist/build/pdf.mjs"."," at process. (file:///var/runtime/index.mjs:1276:17)"," at process.emit (node:events:526:35)"," at process.emit (/var/task/___vc/__launcher/__sourcemap_support.js:602:21)"," at emit (node:internal/process/promises:150:20)"," at processPromiseRejections (node:internal/process/promises:284:27)"," at processTicksAndRejections (node:internal/process/task_queues:96:32)"]} Works fine and dandy on localhost, think this one is related to ESM though |
I think you might have to import the minified version as |
Yep you are right, that worked for me! Seems like Vercel also has issues finding .wasm files: This might be webpack not bundling the |
Very likely. If you have to use tesseract.js on Vercel, another workaround is to bypass Next.js and register a separate folder as your function implementation (you will need to do your own vendoring/bundling/tree-shaking). See vercel.json for more details. |
I decided to follow a simple path, I downloaded the stable version from the official website. I put all the files in the <script src="/pdfjs/pdf.mjs" type="module" /> then adding code in useEffect: const pdfjs = window.pdfjsLib as typeof import('pdfjs-dist/types/src/pdf')
const pdfjsWorker = await import('pdfjs-dist/build/pdf.worker.min.mjs');
pdfjs.GlobalWorkerOptions.workerSrc = pdfjsWorker;
const pdfDocument = pdfjs.getDocument('http://localhost:3000/pdf-files/myFile.pdf')
console.log('pdfDocument', pdfDocument); |
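One detail worth noting about the snippet above: in pdf.js, getDocument returns a loading task rather than the document itself, and the document is obtained by awaiting the task's promise property. The sketch below illustrates this with minimal structural interfaces (`PdfjsLike`, `LoadingTaskLike`, `DocumentLike`, and `countPages` are hypothetical names introduced here, not pdfjs-dist types) so that it compiles without the library installed.

```typescript
// Minimal structural stand-ins for the parts of pdf.js used here.
// These interface names are illustrative, not real pdfjs-dist types.
interface DocumentLike {
  numPages: number;
}
interface LoadingTaskLike {
  promise: Promise<DocumentLike>;
}
interface PdfjsLike {
  getDocument(src: string): LoadingTaskLike;
}

// Awaits the loading task before touching the document.
async function countPages(pdfjs: PdfjsLike, url: string): Promise<number> {
  const loadingTask = pdfjs.getDocument(url);
  const pdfDocument = await loadingTask.promise;
  return pdfDocument.numPages;
}

// Usage sketch inside useEffect, assuming window.pdfjsLib is already loaded:
// countPages(window.pdfjsLib, '/pdf-files/myFile.pdf').then(console.log);
```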
Hi, some bundling fixes have landed on canary (14.0.5-canary.45). I tested against the latest canary and it works well now. |
Good to hear that! Could you please elaborate a bit on what the fix is and how it fixes the issue? Will that fix land on 13.x, or how can we cherry-pick that fix to 13.x? Thanks a lot. |
There are a few module-resolution-related bundling fixes applied after 14.0.4, now on canary. Unfortunately we're not going to backport them to 13.x. |
Okayyyy... Thank you for your reply. Sounds like I have to upgrade to 14.0.5+ later to be able to use pdfjs with fewer workarounds. |
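For anyone wanting to check an installed Next.js version against the canary mentioned above, here is a rough sketch. The cutoff 14.0.5-canary.45 is simply the version the maintainer reported testing; the function names are hypothetical, the parser only understands plain x.y.z and x.y.z-canary.N strings, and passing the check does not guarantee that a particular pdfjs-dist setup works.

```typescript
// Hypothetical helper: compares a Next.js version string with
// 14.0.5-canary.45, the canary reported as working in this thread.
interface ParsedVersion {
  major: number;
  minor: number;
  patch: number;
  canary: number | null; // null means a stable release
}

function parseVersion(v: string): ParsedVersion {
  const m = /^(\d+)\.(\d+)\.(\d+)(?:-canary\.(\d+))?$/.exec(v.trim());
  if (!m) throw new Error(`Unrecognized version: ${v}`);
  return {
    major: Number(m[1]),
    minor: Number(m[2]),
    patch: Number(m[3]),
    canary: m[4] === undefined ? null : Number(m[4]),
  };
}

function compareVersions(a: ParsedVersion, b: ParsedVersion): number {
  if (a.major !== b.major) return a.major - b.major;
  if (a.minor !== b.minor) return a.minor - b.minor;
  if (a.patch !== b.patch) return a.patch - b.patch;
  // Same x.y.z: a stable release sorts after any canary of that version.
  if (a.canary === b.canary) return 0;
  if (a.canary === null) return 1;
  if (b.canary === null) return -1;
  return a.canary - b.canary;
}

function hasReportedBundlingFixes(nextVersion: string): boolean {
  const cutoff = parseVersion("14.0.5-canary.45");
  return compareVersions(parseVersion(nextVersion), cutoff) >= 0;
}
```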
is there an updated solution for this? facing the same issues: import trace for request module/Release/canvas.node next version 14.0.5 |
No, I haven't found a new solution to this. But you can post the full context and error message here or in a new issue, since you are using 14.0.5, which they claimed has the issue fixed. |
my use case is for a file image generator in a hook ` export default function useFileImageGenerator() { function getThumbnail(file: File) {
} const dataURLtoBlob = (dataURL: string) => {
}; return getThumbnail; next 14.0.5 |
@huozhi Any thoughts? |
This closed issue has been automatically locked because it had no new activity for 2 weeks. If you are running into a similar issue, please create a new issue with the steps to reproduce. Thank you. |
Link to the code that reproduces this issue
https://github.com/Luluno01/pdfjs-dist-import-reproducer
To Reproduce
next dev
)/
)getDocument being undefined.
Current vs. Expected behavior
The ESM package pdfjs-dist should be imported correctly. The actual outcome, however, is that nothing is imported: all exported objects are undefined, including the default export.
Verify canary release
Provide environment information
Which area(s) are affected? (Select all that apply)
App Router, TypeScript (plugin, built-in types)
Additional context
Same problem with version "13.5.4" and version "13.0.0".