
Save/restore yarn cache #654

Open
swrobel opened this issue Feb 1, 2018 · 24 comments

@swrobel

swrobel commented Feb 1, 2018

Every single one of our deploys is taking ~50s to install packages via yarn. This should be near-instantaneous when yarn.lock hasn't changed. It appears that the nodejs buildpack already takes care of saving/restoring the yarn cache.

@schneems
Contributor

You can get caching by using the official node buildpack:

$ heroku buildpacks:add heroku/nodejs -i 1
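
To double-check the ordering afterwards (the -i 1 flag puts heroku/nodejs at position 1, ahead of heroku/ruby), you can list the configured buildpacks:

$ heroku buildpacks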

@grk

grk commented Mar 22, 2018

This makes yarn install run twice on each deploy though.

@schneems
Contributor

It should be cached on the second run and nearly instantaneous. Much better than only running once but having to install from scratch.

@swrobel
Author

swrobel commented Mar 23, 2018

@schneems alas, not really. Since yarn still wipes node_modules and re-symlinks everything from the cache every time it runs, it's still a 10-15s process.

@swrobel
Author

swrobel commented Mar 29, 2018

Aaaaand it turns out it's even worse than I thought: yarnpkg/yarn#932

Yarn now runs 3x with both buildpacks, building native packages each time:

  1. nodejs: dependencies + devDependencies
  2. nodejs: prune devDependencies
  3. ruby

Deploy times are now officially insane with both buildpacks

@schneems
Contributor

schneems commented May 7, 2018

One option is to use the heroku/nodejs buildpack, which does caching and all that fun stuff, and then manually disable the yarn:install task in your Rakefile. I think it's something like this:

# Remove the yarn:install task that Rails defines, since the heroku/nodejs
# buildpack has already run yarn install
Rake::Task["yarn:install"].clear
task "yarn:install" do
  # intentionally a no-op
end

@voter101

One possible solution is to build with the YARN_PRODUCTION flag set to true. That does not prevent yarn from running three times, but it stops reinstalling dev dependencies every time.
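
On Heroku that is just a config var; a minimal sketch, assuming your build reads YARN_PRODUCTION as described above:

$ heroku config:set YARN_PRODUCTION=true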

@joevandyk

We are running into this as well. Deploy times for us are approaching 7 minutes.

@luccasmaso

I'm using both buildpacks too with my Rails app + webpacker. It turns out that on every deploy yarn install is executed twice and does not make use of the cache, resulting in a nearly 3-5 minute deploy. I've tried Rake::Task["yarn:install"].clear but with no success. I'm kind of stuck here. Thanks

@ericboehs

This seems like something the community should care about. Yarn caching in the nodejs buildpack has supposedly been around for 2+ years. Our yarn install takes 50+ seconds each deploy. That's 25% of our build time when the webpacker cache is used.

I'd like to take a look at this soon and see what can be done.

@joevandyk

@ericboehs that would be awesome!

@ericboehs

ericboehs commented Mar 27, 2019

Schneems's suggestions seem to be helpful. Sort of.

Here's how you do it:

  • Add the nodejs buildpack before your ruby buildpack. It will cache node_modules and leave it in place so that the ruby buildpack can reuse it without having to move it.
  • Set YARN_PRODUCTION to true to skip installing dev dependencies.
  • Clear the yarn:install task and redefine it as schneems recommended (a sketch of the first two steps follows this list).
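
Putting the first two steps together (the third is the Rakefile override schneems posted earlier), the commands look roughly like this; treat it as a sketch rather than a drop-in recipe:

$ heroku buildpacks:add heroku/nodejs -i 1   # nodejs buildpack runs before heroku/ruby
$ heroku config:set YARN_PRODUCTION=true     # skip installing dev dependencies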

Yarn will run once and will reuse its caches, so install time is almost instant (a few seconds). Unfortunately the setup/teardown time for the nodejs buildpack eats up most of that gain.

The above changes saved 7 seconds on my build time compared to running yarn install from scratch every build but without the nodejs buildpack. My from-scratch yarn install time is 51 seconds. If yours is higher than this, then you may see a bigger savings than 7 seconds.

For now, I'm leaving my deploy process as is. For me, I can't justify the added complexity for 7 seconds.

It would seem caching node_modules/yarn within the ruby buildpack would save the most time. In our app's case, I think this would save another 10-20 seconds, if not more. This would be very welcome but isn't a high priority for our company. I hope someone else in the community (or within Heroku) is able to implement this.

@ericboehs

ericboehs commented Mar 27, 2019

I have done some primitive caching of node_modules here:
ericboehs/heroku-buildpack-ruby-yarn-cache@f851a95 (this isn't production-ready, as it will need cache pruning).

It seems to be almost identical in execution time to adding the nodejs buildpack. (Edit: See my next comment below.) The caching/restoration of node_modules just takes a really long time.

If our yarn install time gets over 60 seconds, I'll probably end up adding the nodejs buildpack (but not clearing the yarn:install task as it only saved me a couple seconds). Until then, I'll continue to use the official ruby buildpack.


Unrelated but useful for timing deploys:

unbuffer git push staging | ts -s "%H:%M:%.S"

This will prepend each line of output with a timestamp (elapsed time since the start, because of -s). Great for all kinds of commands, not just deploys. You'll need the unbuffer binary (available for macOS via brew install expect) and ts (from moreutils).

@ericboehs

ericboehs commented Mar 27, 2019

Hmmm. After testing my caching buildpack a few more times, it seems it does indeed save 10-20 seconds (my last run saved 19 seconds) compared to running with the nodejs buildpack.

Could someone else try out my buildpack and see if they see similar results?

@kpheasey

kpheasey commented Jun 14, 2019

@ericboehs I've submitted a PR, #892, that improves upon your implementation. Just caching node_modules is not enough; we need to include ~/.yarn-cache, tmp/cache/webpacker, and public/packs too.
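
For anyone curious what that kind of caching looks like mechanically, here is a rough bash sketch in the shape of a buildpack compile step. BUILD_DIR and CACHE_DIR are the standard arguments Heroku passes to bin/compile; the directory list comes from the comment above, and everything else is illustrative rather than the actual implementation in #892:

#!/usr/bin/env bash
# bin/compile <BUILD_DIR> <CACHE_DIR> <ENV_DIR>
set -euo pipefail
BUILD_DIR="$1"
CACHE_DIR="$2"

# Note: the real yarn cache lives in the home directory (~/.yarn-cache); it is
# treated as build-relative here purely to keep the sketch short.
CACHED_DIRS="node_modules .yarn-cache tmp/cache/webpacker public/packs"

# Restore previously cached copies, if any
for dir in $CACHED_DIRS; do
  if [ -d "$CACHE_DIR/$dir" ]; then
    mkdir -p "$BUILD_DIR/$(dirname "$dir")"
    cp -a "$CACHE_DIR/$dir" "$BUILD_DIR/$dir"
  fi
done

# ... yarn install and assets:precompile run here ...

# Save fresh copies back into the cache for the next build
for dir in $CACHED_DIRS; do
  if [ -d "$BUILD_DIR/$dir" ]; then
    rm -rf "$CACHE_DIR/$dir"
    mkdir -p "$CACHE_DIR/$(dirname "$dir")"
    cp -a "$BUILD_DIR/$dir" "$CACHE_DIR/$dir"
  fi
done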

@jrochkind

jrochkind commented Jun 2, 2021

That -- three years later -- there is no non-hacky supported/documented solution to caching yarn install for a Rails/webpacker app is definitely affecting my opinion of Heroku's support for Rails and of Heroku's stagnation generally.

@philippegirard

philippegirard commented Jun 17, 2021

Hey, any update on this issue? It's been more than 3 years. In my case the build is taking 20-25 minutes on a medium-sized Rails application (100 users) with a React frontend. I think Rails is running rails assets:precompile to compile assets with webpack. It is getting worse, and lately I have been getting SEGFAULTs on top of that when the compile process runs for over 25 minutes (happens about 1 time in 10):
[screenshot of the SEGFAULT output]

Has anyone found a clean solution to this that can be implemented in a production environment? I am thinking about splitting the rails assets:precompile step into a GitHub Action (which supports caching), like this: https://stackoverflow.com/questions/21408804/heroku-rake-assetsprecompile-too-slow

@kpheasey

kpheasey commented Jun 17, 2021

@philippegirard I moved all my clients off Heroku and onto AWS CodePipeline+ECS+RDS. It requires some Dockerfile know-how, or copy/paste lol. Despite the steep learning curve it's way cheaper, uses managed solutions (no worrying about downtime), and my push-to-deployment time is about 8-12 minutes for larger Rails+React applications.

@philippegirard

@kpheasey that's my backup option. It's sad that the solution is to quit Heroku. I still hope for a way to stay on Heroku and not go through the pain of a migration.

@krnjn

krnjn commented Jun 18, 2021

We used to have this issue, but using the SplitChunks plugin helped us. Also be sure you are ignoring (not transpiling) all of your node modules, as that can sometimes cause issues. FWIW here's our setup, which makes this run in ~10m total on Heroku:

config/webpack/environment.js

const { environment } = require("@rails/webpacker");

// resolve-url-loader must be used before sass-loader
environment.loaders.get("sass").use.splice(-1, 0, {
  loader: "resolve-url-loader",
});

// default config from https://webpack.js.org/plugins/split-chunks-plugin/#optimizationsplitchunks
environment.splitChunks((config) =>
  Object.assign({}, config, {
    optimization: {
      splitChunks: {
        chunks: "all", // changed to "all" from "async" in default
        minSize: 20000,
        // minRemainingSize: 0, // this option does not work with webpack even included in default
        maxSize: 0,
        minChunks: 1,
        maxAsyncRequests: 30,
        maxInitialRequests: 30,
        automaticNameDelimiter: "~",
        enforceSizeThreshold: 50000,
        cacheGroups: {
          defaultVendors: {
            test: /[\\/]node_modules[\\/]/,
            priority: -10,
          },
          default: {
            minChunks: 2,
            priority: -20,
            reuseExistingChunk: true,
          },
        },
      },
    },
  })
);

// Skip transpiling node_modules (see the note above)
environment.loaders.delete("nodeModules");

module.exports = environment;

config/webpack/production.js

process.env.NODE_ENV = process.env.NODE_ENV || "production";
const environment = require("./environment");
const CompressionPlugin = require("compression-webpack-plugin");
const TerserPlugin = require("terser-webpack-plugin");

environment.config.merge({
  devtool: "hidden-source-map",
  optimization: {
    minimizer: [
      new TerserPlugin({
        extractComments: false,
        parallel: true,
        cache: true,
        sourceMap: false,
        terserOptions: {
          parse: {
            ecma: 8,
          },
          compress: {
            ecma: 5,
            warnings: false,
            comparisons: false,
          },
          mangle: {
            safari10: true,
          },
          output: {
            ecma: 5,
            comments: false,
            ascii_only: true,
          },
        },
      }),
    ],
  },
});

// Insert before a given plugin
environment.plugins.prepend(
  "Compression",
  new CompressionPlugin({
    filename: "[path].br[query]",
    algorithm: "brotliCompress",
    test: /\.(ts|tsx|js|jsx|css|scss|png|jpeg|jpg|svg|eot|woff|woff2|ttf|otf)$/,
    compressionOptions: { level: 11 },
    threshold: 10240,
    minRatio: 0.8,
    deleteOriginalAssets: false,
  })
);

module.exports = environment.toWebpackConfig();

@swrobel
Author

swrobel commented Jun 19, 2021

@krnjn while this is cool, it doesn't solve the problem of yarn re-installing packages from scratch on every build.

@philippegirard

Hey @krnjn I was finally able to make it work.

I needed to change all of the javascript_pack_tag calls to javascript_packs_with_chunks_tag, in addition to your changes, to make it work.

example

<div id="reactappv1"></div>
<%= javascript_packs_with_chunks_tag 'spa/app' %>

I wrote up the steps I took on Medium, in case somebody else needs more details:
https://philstories.medium.com/slow-build-on-heroku-with-rails-and-react-c6bef3a0ae2d

My builds went from 45 minutes to 3 minutes. Only the splitChunks plugin had a significant impact.

@schneems schneems closed this as completed Aug 5, 2021
@schneems
Contributor

schneems commented Aug 5, 2021

My builds went from 45 minutes to 3 minutes.

I tried to pair with the former nodejs language owner, and essentially anything over 5 minutes was a MAJOR red flag, even without caching. I sadly don't know more details though :(

is definitely affecting my opinion of Heroku's support for Rails and of Heroku's stagnation generally.

Well. I'm only one person. With a re-org I'm also having to contribute to several other languages and develop "salesforce functions" from scratch. I want to keep these things open because I want to keep visibility that they need to be done. However, keeping them open maybe also sends the message that "I'm working on this currently", which is not true. I'm locking this issue but keeping it open :(

This buildpack is in the process of being deprecated/removed. I'm working on a re-write, and the plan is to have the nodejs buildpack do all the installation via Cloud Native Buildpacks: https://github.com/heroku/buildpacks-ruby (and that buildpack needs to be re-written in Rust, as we've decided to standardize on buildpack languages since we're moving away from a single-owner model).

If/when that CNB ever happens/ships, it will maybe open up a window to not have dev dependencies cleared between that buildpack and the Ruby one. But that change needs an upstream fix to the CNB spec, which will be work.

Reading through the issues, it seems the yarn cache is a tiny part of the overall experience people are seeing. It sounds like the original issue of yarn cache accounts for ~10 seconds or so.

The larger issue @philippegirard is pointing to is webpack/webpacker caching, which isn't standardized and isn't even supported via heroku/nodejs yet. I've said it in other threads and I'll say it again: supporting sprockets caching is likely the single most costly feature this buildpack has ever taken on (in terms of support tickets and debugging hours). Webpacker caching is much more "roll your own" compared to sprockets, which makes supporting it EVEN harder.

That -- three years later -- there is no non-hacky supported/documented solution

The fact that there's no easy-to-use community solution is also telling. This is an extremely hard problem: "caching and cache invalidation" is a FAMOUSLY hard problem. Harder still, in this case, webpack hasn't converged on a "just works" solution, and webpacker/rails haven't adopted one. When rails new sets up webpacker to work with caching out of the box, then the story changes dramatically.

I'm hoping that after finishing the Ruby CNB re-write, upstreaming CNB changes to the spec, re-writing the Ruby CNB in Rust, and shipping Salesforce Functions support for Ruby, that process will open the door for more collaboration between Ruby and Node. And maybe, with the whole team looking at this issue of yarn and webpack caching (rather than just me), we'll be able to make progress where I've not been able to before.


@schneems schneems reopened this Aug 5, 2021
@heroku heroku locked as too heated and limited conversation to collaborators Aug 5, 2021
@heroku heroku unlocked this conversation Aug 5, 2021
@heroku heroku locked and limited conversation to collaborators Aug 5, 2021
@heroku heroku unlocked this conversation Aug 6, 2021
@schneems
Contributor

schneems commented Aug 6, 2021

I think locking was an over-reaction, as clearly some developers are still getting value out of this thread and are discussing workarounds. These are valuable conversations for me as well. I want to re-open the thread to enable that conversation to continue here.

Also, I wanted to link to this from another webpacker conversation #892 (comment)
