Webpack production bundling react-pdf is super slow #93

Closed
mvirbicianskas opened this issue Oct 20, 2017 · 40 comments · Fixed by #756
Labels
help wanted Extra attention is needed question Further information is requested

Comments

@mvirbicianskas

Hello,

Am I the only one experiencing slow build times in production mode? It takes Webpack about 180 seconds to build/minify the pdf + worker JS files, which are probably a peer dependency?

Has anyone found a working solution to ship it to production without such long bundling times?

Cheers

@wojtekmaj wojtekmaj added help wanted Extra attention is needed question Further information is requested labels Oct 22, 2017
@wojtekmaj
Owner

Hey!
Nope, you're not the only one. PDF.js is a really big library (around 2 MB before, 800 KB after minification) and it simply takes a long time to parse all of that.
It might help a little to use flags like cacheDirectory: true, but I wouldn't expect a big improvement.

If anyone has any ideas on how to improve this situation though, I'd be happy to hear!
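
For reference, cacheDirectory is a babel-loader option. A minimal sketch of where it would go, assuming the project transpiles with babel-loader (the rule below is illustrative and not taken from anyone's config in this thread):

```javascript
// webpack.config.js (sketch; assumes babel-loader is installed)
const config = {
  module: {
    rules: [
      {
        test: /\.jsx?$/,
        use: {
          loader: 'babel-loader',
          options: {
            // Cache transpiled output on disk so unchanged files are skipped
            // on the next build. It cannot help on a clean checkout, because
            // the cache starts out empty.
            cacheDirectory: true,
          },
        },
      },
    ],
  },
};

module.exports = config;
```

This only speeds up the Babel step; it does not touch minification.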

@mvirbicianskas
Author

Well, this is sad, actually :( I've been experimenting with solutions, but so far no luck... I've been trying to bundle this lib into a separate file and bundle it only once. Of course it's not a very maintainable approach, but cutting down 180 seconds would be worth it; as this library doesn't update too often, it would suffice to check the version diff now and then. Will get back to you if I come up with a better solution.
And cacheDirectory is not an option in my case: every deployment we do is a clean project initialisation from scratch.

@nnnikolay

@mvirbicianskas try to use parcel-bundler, it's not ideal, but it's blazingly fast!

@mvirbicianskas
Author

Hey, thanks for the recommendation, will take a look, but I don't think it's Webpack's problem; it's the uglifyjs plugin's speed problem.

@wojtekmaj
Owner

Either way, Parcel is using uglify-es, a much newer solution than Webpack's. It may improve build speeds.

Please be aware though that I don't officially support Parcel just yet, but I'm definitely on it; super excited about Parcel as much as the rest of the community is!

@obahareth

In our case it's taking more than 15 minutes after adding react-pdf. We're using Rails with Webpacker, so I don't think using Parcel is an option for us. Does anybody have any other suggestions? I went into configuration hell trying to get Happypack and AutoDllPlugin working, without success. Is there any other pure Webpack-based solution that would work for this? This one issue is forcing us to reconfigure different parts of our stack and our deployment flow.

@wojtekmaj
Owner

wojtekmaj commented Jan 24, 2018

Here are my suggestions (not only for you, @obahareth, but you may give some a go):

  • In UglifyJsPlugin, which you are likely using during the production build, you could exclude the pdf.js and pdf.worker.js files from minification using the exclude option.
  • pdfjs-dist comes with pre-minified versions of its libs, but its entry files are not using them by default. You can use NormalModuleReplacementPlugin to load pdf.min.js instead of pdf.js and pdf.worker.min.js instead of pdf.worker.js. Make sure to exclude the minified files from minification using the option mentioned in the bullet point above.
  • Try to make use of two other UglifyJsPlugin options: cache, to ensure you're not processing the same unchanged file over and over, and parallel, to use multiple processes.

Let me know if anything had an especially good impact on your build performance!
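
A rough sketch of how the three suggestions above might fit together. It assumes uglifyjs-webpack-plugin 1.x option names and the webpack 3/4 shape of the object passed to a NormalModuleReplacementPlugin callback; the helper name useMinifiedPdfJs and the PDFJS_BUILD pattern are made up for illustration and were not tested against any project in this thread:

```javascript
// Options for uglifyjs-webpack-plugin (assumed 1.x option names).
const uglifyOptions = {
  // Skip the PDF.js files during minification.
  // Assumption: this pattern is checked against emitted asset names, so it
  // only helps if PDF.js ends up in its own output file.
  exclude: /pdf(\.worker)?(\.min)?\.js$/,
  cache: true, // reuse results for unchanged files between builds
  parallel: true, // minify in multiple processes
};

// Matches the unminified builds inside pdfjs-dist, but not the .min.js ones.
const PDFJS_BUILD = /pdfjs-dist[\\/]build[\\/]pdf(\.worker)?\.js$/;

// Callback for NormalModuleReplacementPlugin: point pdf.js / pdf.worker.js
// at the pre-minified builds shipped next to them.
function useMinifiedPdfJs(resource) {
  const toMin = (value) => value.replace(/pdf(\.worker)?\.js$/, 'pdf$1.min.js');
  resource.request = toMin(resource.request);
  // After resolution the absolute file path is exposed as well.
  if (typeof resource.resource === 'string') {
    resource.resource = toMin(resource.resource);
  }
}

module.exports = { uglifyOptions, PDFJS_BUILD, useMinifiedPdfJs };

// In webpack.config.js (not executed here):
//   plugins: [
//     new webpack.NormalModuleReplacementPlugin(PDFJS_BUILD, useMinifiedPdfJs),
//     new UglifyJsPlugin(uglifyOptions),
//   ]
```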

@HillLiu

HillLiu commented Feb 28, 2018

A webpack solution

HTML

<script src="//cdn.jsdelivr.net/npm/[email protected]/build/pdf.min.js"></script>
<script src="//cdn.jsdelivr.net/npm/[email protected]/web/pdf_viewer.min.js"></script>
<script>
PDFJS.workerSrc = "//cdn.jsdelivr.net/npm/[email protected]/build/pdf.worker.min.js";
</script>

Webpack externals configuration

{
 "externals": {
    "pdfjs-dist": "pdfjsDistWebPdfViewer",
    "pdfjs-dist/lib/web/pdf_link_service": "pdfjsDistWebPdfViewer.PDFJS"
 }
}

Import

import {Document, Page} from 'react-pdf/dist/entry';

The result improved: from 6 min to 25 sec.

real 0m25.909s
user 0m24.120s
sys 0m0.448s

@DanielRuf
Contributor

Parcel is not a solution but an alternative which is quite new and has more issues.

You can easily switch to uglify-es in your webpack config; webpack 4.8 and its docs should provide the needed information.

@DanielRuf
Contributor

Personally I would not use CDNs but create a copy task and set the externals. Also, Google Closure Compiler can produce smaller files from pdf.js.
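
A minimal sketch of such a copy task in plain Node, assuming the pre-minified files live in pdfjs-dist's build/ directory as mentioned earlier in the thread. The function name copyPdfJs and the target directory are hypothetical:

```javascript
const fs = require('fs');
const path = require('path');

// Files to ship as-is, without passing them through the bundler.
const PDFJS_FILES = ['pdf.min.js', 'pdf.worker.min.js'];

// Copy the pre-minified PDF.js builds into a directory served as static assets.
// Returns the list of files written.
function copyPdfJs(sourceDir, targetDir) {
  fs.mkdirSync(targetDir, { recursive: true });
  return PDFJS_FILES.map((name) => {
    const target = path.join(targetDir, name);
    fs.copyFileSync(path.join(sourceDir, name), target);
    return target;
  });
}

module.exports = { copyPdfJs };

// Example call (hypothetical paths), e.g. from an npm "prebuild" script:
//   copyPdfJs('node_modules/pdfjs-dist/build', 'public/lib');
```

The copied files would then be loaded with script tags and declared under externals so webpack never parses them.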

@DanielRuf
Contributor

The externals solution just circumvents the issue by referencing an "external" script which is not part of the bundling.

I strongly advise against bundling such big libraries; use the CommonsChunkPlugin, AggressiveSplittingPlugin + externals.
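
Note that CommonsChunkPlugin only exists up to webpack 3; in webpack 4 its role is taken by optimization.splitChunks. A minimal sketch of the webpack 4 variant, assuming the goal is simply to keep pdfjs-dist in its own output file (the chunk name pdfjs is made up for illustration):

```javascript
// webpack.config.js (webpack 4 sketch; swaps CommonsChunkPlugin for splitChunks)
const config = {
  optimization: {
    splitChunks: {
      cacheGroups: {
        pdfjs: {
          // Anything resolved from node_modules/pdfjs-dist goes into this chunk.
          test: /[\\/]node_modules[\\/]pdfjs-dist[\\/]/,
          name: 'pdfjs', // hypothetical chunk name
          chunks: 'all',
        },
      },
    },
  },
};

module.exports = config;
```

Splitting alone does not remove the library from the build; externals are what keep webpack from processing it at all.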

@HillLiu

HillLiu commented May 12, 2018

@DanielRuf Would you mind sharing more about your experience using webpack 4 with react-pdf, such as how fast the compile time is?

It would be useful to know whether webpack 4 is good to go.

BTW, if anyone is interested in enabling uglify-es with webpack 3, you can just install a newer version of uglifyjs-webpack-plugin and set it in the config, as @DanielRuf mentioned.
Here is a sample: https://github.com/react-atomic/reshow/blob/master/packages/reshow-app/src/webpack.client.js

@DanielRuf
Contributor

webpack 4 in general is great and subsequent builds will be faster thanks to the new cache.

Also see https://github.com/webpack-contrib/uglifyjs-webpack-plugin#uglifyjs-webpack-plugin

Regarding pdf.js: I highly suggest using the externals option for this; webpack will then exclude it from the bundle during bundle generation.

Why not use the dist min file of pdf.js and load it via an extra script tag, as was already recommended? This provides the best performance during development.

@HillLiu

HillLiu commented May 12, 2018

@DanielRuf Yes, I use the externals solution and shared my use case above; it gave a large improvement (from 6 min to 25 sec).

I'm still evaluating webpack 4, so I'd be happy to see an expert like you share real numbers, so that more developers can align with webpack 4.

For me, webpack 4 still has some minor problems with web workers, but that's not related to react-pdf.
react-atomic/reshow#4

@DanielRuf
Contributor

I can do some benchmarking (also with different babel and browserslist settings) in the coming weeks if it would be helpful.

@yanivkalfa

"externals": { "pdfjs-dist": "pdfjsDistWebPdfViewer", "pdfjs-dist/lib/web/pdf_link_service": "pdfjsDistWebPdfViewer.PDFJS" }

Your trick, and basically everything that was mentioned here, doesn't work. I've been playing with this for the last 12 hours. Nothing.

@DanielRuf
Contributor

I think this is not the global variable which is exported from pdf.js.

See https://webpack.js.org/configuration/externals/

Did you try pdfjsDistBuildPdf?
And the viewer should be pdfjsDistWebPdfViewer.

@DanielRuf
Contributor

Related: mozilla/pdf.js#9545

@DanielRuf
Contributor

In newer releases: mozilla/pdf.js#9565

@DanielRuf
Contributor

All in all, we can only help if we have the code and config files of your project.

@yanivkalfa

I've tried HillLiu's exact example. It doesn't break the code, but I don't get any build boost, especially when I use react-pdf/build/entry.webpack (needed for the worker) to get the page/doc. I've tried other versions as well; they didn't work.

I've also tried pdfjsDistBuildPdf and several others.

I played a lot with externals and managed to make externals of all the needed deps, and still no build boost.

I use NormalModuleReplacementPlugin as well to use the native minified files, and excluded them as well... no build boost.

@yanivkalfa

And you are right about the code example. I will do that when I wake up; it's a bit late.

@DanielRuf
Contributor

webpack --progress --profile probably shows some more stats.

If it does not work in your example and there is no difference, then something seems to be wrong here.

@DanielRuf
Contributor

A simple benchmark with a quick sample (without externals).

Hash: 67461963d3c7a380ebe2
Version: webpack 4.28.3
Time: 13478ms
Built at: 01/02/2019 7:55:14 AM
                 Asset     Size  Chunks                    Chunk Names
               main.js  435 KiB       0  [emitted]  [big]  main
vendors~pdfjsWorker.js  725 KiB       1  [emitted]  [big]  vendors~pdfjsWorker
Entrypoint main [big] = main.js
[20] (webpack)/buildin/global.js 472 bytes {0} [built]
[34] ./src/index.js 3.76 KiB {0} [built]
[40] zlib (ignored) 15 bytes {0} [optional] [built]
[41] fs (ignored) 15 bytes {0} [built]
[42] http (ignored) 15 bytes {0} [built]
[43] https (ignored) 15 bytes {0} [built]
[46] (webpack)/buildin/module.js 497 bytes {0} [built]
    + 77 hidden modules

With the externals:

externals: {
    "pdfjs-dist": "pdfjsLib",
    "pdfjs-dist/build/pdf.worker.js": "pdfjsWorker"
}
yarn run v1.12.3
$ webpack
Hash: 5efe836747d41e38755a
Version: webpack 4.28.3
Time: 2987ms
Built at: 01/02/2019 8:12:40 AM
  Asset      Size  Chunks             Chunk Names
main.js  93.8 KiB       0  [emitted]  main
Entrypoint main = main.js
[18] external "pdfjsLib" 42 bytes {0} [built]
[31] ./src/index.js 3.76 KiB {0} [built]
    + 64 hidden modules

With the profile flag:

yarn run v1.12.3
$ webpack --progress --profile
1156ms building                                                                 
2ms finish module graph                             
0ms sealing                                
0ms basic dependencies optimization 
2ms dependencies optimization                           
1ms advanced dependencies optimization 
0ms after dependencies optimization 
4ms chunk graph 
0ms after chunk graph                          
0ms optimizing 
0ms basic module optimization 
0ms module optimization 
0ms advanced module optimization 
1ms after module optimization 
1ms basic chunk optimization                             
0ms chunk optimization 
4ms advanced chunk optimization                         
0ms after chunk optimization 
1ms module and chunk tree optimization 
0ms after module and chunk tree optimization 
0ms basic chunk modules optimization 
1ms chunk modules optimization                           
0ms advanced chunk modules optimization 
1ms after chunk modules optimization 
0ms module reviving                 
1ms module order optimization                                
0ms advanced module order optimization 
0ms before module ids 
0ms module ids 
2ms module id optimization 
0ms chunk reviving                 
1ms chunk order optimization                               
0ms before chunk ids 
1ms chunk id optimization                          
0ms after chunk id optimization 
1ms record modules                 
0ms record chunks                 
9ms hashing 
1ms content hashing                         
0ms after hashing 
0ms record hash 
0ms module assets processing 
8ms chunk assets processing 
1ms additional chunk assets processing 
0ms recording 
0ms additional asset processing 
19ms chunk asset optimization             
1ms after chunk asset optimization 
0ms asset optimization 
0ms after asset optimization 
0ms after seal 
3ms emitting 
1ms after emitting                  
Hash: 5efe836747d41e38755a
Version: webpack 4.28.3
Time: 1239ms
Built at: 01/02/2019 8:17:21 AM
  Asset      Size  Chunks             Chunk Names
main.js  93.8 KiB       0  [emitted]  main
Entrypoint main = main.js
[18] external "pdfjsLib" 42 bytes {0} [built]
     [31] 771ms -> [24] 34ms -> factory:38ms building:93ms dependencies:171ms = 1107ms
[31] ./src/index.js 3.76 KiB {0} [built]
     factory:56ms building:715ms = 771ms
    + 64 hidden modules

Some info about my system:

osquery> SELECT cpu_brand, cpu_physical_cores, cpu_logical_cores, physical_memory, hardware_model FROM system_info;
+-------------------------------------------+--------------------+-------------------+-----------------+-----------------+
| cpu_brand                                 | cpu_physical_cores | cpu_logical_cores | physical_memory | hardware_model  |
+-------------------------------------------+--------------------+-------------------+-----------------+-----------------+
| Intel(R) Core(TM) i7-3740QM CPU @ 2.70GHz | 4                  | 8                 | 17179869184     | MacBookPro10,1  |
+-------------------------------------------+--------------------+-------------------+-----------------+-----------------+

You can find the code at https://github.com/DanielRuf/webpack-react-pdf

@DanielRuf
Contributor

The viewer generally includes more: it is the viewer app used in Firefox.
Let me know if your setup is different and which version of react-pdf you use. You can see the imports at https://unpkg.com/[email protected]/dist/pdf.worker.entry.js for the big worker.

@HillLiu

HillLiu commented Jan 3, 2019

@yanivkalfa I think it depends on how you use react-pdf.

You could take a look at my import sample:
#93 (comment)

and check the JS:
https://cdn.jsdelivr.net/npm/[email protected]/dist/entry.js

In webpack, this should replace pdfjs-dist with an empty object,
so you could also inspect the webpack bundle size. If the size doesn't change,
it means your use case is not the same as mine.

And as @DanielRuf mentioned, if you use a newer version, you probably need to change the externals value to the new one.

@yanivkalfa

yanivkalfa commented Jan 10, 2019

@DanielRuf

A simple benchmark with a quick sample (without externals).
You can find the code at https://github.com/DanielRuf/webpack-react-pdf

Which files do you link in the HTML? Which variable do you use?
And lastly, will that also work for the worker?

@DanielRuf
Contributor

Which files do you link in the HTML?

The bundled ones.
I did no further setup as I did not get any example project to reproduce the issue. In general webpack is fast and in my case processed the big files. See the profiling information #93 (comment)

@yanivkalfa

@DanielRuf so you didn't link the files in the HTML like this:

<script src="//cdn.jsdelivr.net/npm/[email protected]/build/pdf.min.js"></script>
<script src="//cdn.jsdelivr.net/npm/[email protected]/web/pdf_viewer.min.js"></script>
<script>
PDFJS.workerSrc = "//cdn.jsdelivr.net/npm/[email protected]/build/pdf.worker.min.js";
</script>

and then use the externals?

@DanielRuf
Contributor

This doesn't make much difference for the bundling process.

@pedro-lb

pedro-lb commented Nov 7, 2019

Hey guys,

In our case the problem is the bundle size of pdf.js and pdf.worker.js. The latter is bigger than the whole index.js of the application, making page loads pretty slow:

image

Is there anything we can do in this case?

@DanielRuf
Contributor

In our case the problem is the bundle size of pdf.js and pdf.worker.js. The latter is bigger than the whole index.js of the application, making page loads pretty slow:

Please see the previous comments =) This was already mentioned, along with a few solutions.

#93 (comment)

@masbaehr

masbaehr commented May 4, 2020

I was also struggling with the huge bundle size as soon as I included react-pdf. I'm sorry to say, but none of the solutions provided were satisfying: they were either too hard to configure, half-baked, or they just "moved" the problem to a different spot. Ultimately I refactored my app to use vanilla pdf.js in a local <iframe>. This makes sure that pdf.js and pdf.worker.js are completely sandboxed: they won't ever block the UI on the main frame, and the application bundle is kept clean. So if you have performance issues, I really suggest you try this as well. To communicate with the app from the iframe you can use the message API.

@DanielRuf
Contributor

I also was struggling with the huge bundle size as soon as i included react-pdf. I'm sorry to say, but all the solutions provided were not satisfying. Either too hard to configure, half-baked or they just "moved" the problem to a different spot.

Well, a lot of code is a lot of code, unless you delete many parts of it.
Every transpiler/compiler takes longer when you feed it more code. As you already wrote, this can only be optimized, not fully solved.

The only viable solution is to define it as an external and ship the dist file directly to the client. Reprocessing the big files doesn't make much sense.

@DanielRuf
Contributor

To communicate with the app from the iframe you can use the message api.

...with proper protection (hostname check, do not use *, and so on).

It would be great if you could provide your suggestion as PR for the docs. This would help many developers.
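
As a rough illustration of the origin check mentioned above, here is a small sketch of a message listener that only accepts events from one known origin. The function name, the origin, and the message shape are all hypothetical:

```javascript
// Build a "message" listener that ignores anything not sent from allowedOrigin.
function createPdfFrameListener(allowedOrigin, onMessage) {
  if (allowedOrigin === '*') {
    throw new Error('Use an explicit origin instead of "*"');
  }
  return function handleMessage(event) {
    if (event.origin !== allowedOrigin) {
      return false; // drop messages from unexpected origins
    }
    onMessage(event.data);
    return true;
  };
}

// In the parent page (hypothetical origin):
//   window.addEventListener('message',
//     createPdfFrameListener('https://app.example.com', (data) => console.log(data)));
//
// Inside the iframe, pass the same explicit origin rather than "*":
//   window.parent.postMessage({ type: 'pageChange', page: 2 }, 'https://app.example.com');
```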

@masbaehr

masbaehr commented May 5, 2020

Actually, if the iframe is within your domain you don't need any "special" protection measures. But sure, if you deploy the PDF.js frame somewhere else, then you should do that.

Here's a fully working example I've put on CodeSandbox. It shows how to embed pdf.js completely sandboxed from the React app. Hope it helps someone :)
https://codesandbox.io/s/stoic-silence-j9ye7?file=/src/index.js

pdf.js related files are under /public/lib

@Krzyku

Krzyku commented Dec 18, 2020

Shot in the dark: maybe we can mark this package as side-effect free in package.json (by the way, is it? 😅)?
And the second shot: is it possible to try pdfjs instead of pdfjs-dist and perform tree shaking in this library?

@ghost

ghost commented Jan 4, 2021

Does anyone have an updated way of accomplishing this? I tried doing the method with the externals, but I was receiving an error:

Uncaught TypeError: Cannot read property 'GlobalWorkerOptions' of undefined

from this line in react-pdf/dist/entry.webpack:

if (typeof window !== 'undefined' && 'Worker' in window) { _pdfjsDist["default"].GlobalWorkerOptions.workerPort = new _pdfWorkerEntry["default"](); }

I'm running react-pdf v 4.1.0
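
The thread does not confirm the cause of this error. One way this message can arise is that the module webpack substitutes for the external has no usable default export, for example when the value named in externals evaluates to undefined at runtime. A small sketch of that mechanism, using a copy of Babel's interop helper (the variable names are illustrative):

```javascript
// Babel's CommonJS interop helper, reproduced here for illustration.
function interopRequireDefault(obj) {
  return obj && obj.__esModule ? obj : { default: obj };
}

// Stand-in for what the external module evaluates to when the configured
// global expression yields undefined.
const externalValue = undefined;
const wrapped = interopRequireDefault(externalValue);

let caught = null;
try {
  // Same access pattern as the line quoted from react-pdf/dist/entry.webpack.
  wrapped.default.GlobalWorkerOptions.workerPort = {};
} catch (err) {
  caught = err;
}

console.log(caught && caught.message);
```

If this is what is happening, checking in the browser console that the global named in externals actually exists (and carries GlobalWorkerOptions) before the bundle runs would be a reasonable first step.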

@ssteuteville

Any advice for implementing the externals solution without ejecting from CRA?

@wojtekmaj
Owner

wojtekmaj commented Apr 1, 2021

Please kindly check React-PDF v5.3.0-beta.2, in which improvements regarding loading PDF.js worker were made.

These changes should make Webpack builds 10-15 seconds faster, and you should be able to use the Webpack entry file in Create React App without needing an external CDN to host the worker.

Let me know what you think in #748!
