Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

@W-13942827 - Reduced Javascript Heap Size Ramp Up #1535

Merged
merged 22 commits into from
Nov 22, 2023
Merged

Conversation

bendvc
Copy link
Collaborator

@bendvc bendvc commented Nov 10, 2023

Description

This PR is NOT a fix for the heap size issue which eventually leads to the dev server crashing, but it can help prevent it from happening as often. Mainly what we have here are a couple fixes that decrease how fast heap size grows. I've seen an decrease in code size increase per save leading to a 2x increase in saves before crashing.

NOTE: Before I get into what I did, I'll mention that the heap size increase doesn't have anything to do with webpack-hot-middleware, this same issue happens if we disabled hot module loading.

Ok so what does this PR do?

  1. We only enabled server source-maps when we are actually debugging (aka the inspect flag is set).

Why does this help? Well it's because those eval-source-maps end up in memory, and the base issue of the dev server not releasing its references means we have duplicate source maps in memory and they are big (like 30 or 200 mb big). So this means that during normal development we aren't creating those source maps in memory on the server.

  1. Lets not include 3rd party lib source-maps!

Currently we were including 3rd party library source maps if they exist, adding the source-map-loader plugin insures that 3rd party source-maps abide by the same "devtool" setting defined in our configs.

  1. Fix possible memory leak.

During heap analysis I saw a lot of references to jwt-decode in the commerce-sdk-react lib. I've upgraded the version to the latest major version, I think this helped as well.

What is the true source of the memory leak?

Our development environment uses webpack-hot-server-middleware' in order to allow changes to code to be observed in the browser after refreshing your page without having the restart the dev server, furthermore we don't event need to us refresh the browser as webpack-hot-middleware` does that for us. But that isn't important to our findings here.

How does webpack-hot-server-middleware work? well it will detect changes in your files and hook into the compilation step of webpack. Internally this plugin uses a package called require-from-string to compile a module from a "code" string. In in fact is where the issue arises, this package uses node's module object to compile it's code (using a private function none-the-less). I cannot say for certain, but internally in the module object there a reference persists to this code string and never lets it be collected by the garbage collector. Hence we end up having multiples of the webpage runtime (the app) in memory. Once we reach the limit (defaults to 4GB on mac machines) you'll crash.

Here is a link to a bug that I filed on the require-from-string project describing the issue and providing a small node application that highlights the issue. Hopefully this developer will be able to shed some light on the issue, but I believe that issues lies in Node itself.

If you'd like to test this script out create a node script with the content below:

#!/usr/bin/env node

var requireFromString = require('require-from-string');
var cycles = 100000;
var delay = 10;

(function step() {
  requireFromString(`/* ${'x'.repeat(10*512*1024)} */`, '/fake/path/to/file.js');

  if (--cycles > 0) {
    setTimeout(step, delay);
  }
}());

And run it like this:

node require-string-crash.js

Next wait for the application to crash. This should only take about 10 seconds.

Callouts.

  • @kevinxh Pointed out that we could look into clearing the require cache as a way to clear up that memory usage. I in fact looked into this after he mentioned it, but the way require-from-string works, I believe that no cache entry is ever made for the compiled code as it is manually using the _compile function. I confirmed that the code string isn't located in the cache by printing the contents and specifically looking for the entry using the key (file name) used.

Types of Changes

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Documentation update
  • Breaking change (could cause existing functionality to not work as expected)
  • Other changes (non-breaking changes that does not fit any of the above)

Breaking changes include:

  • Removing a public function or component or prop
  • Adding a required argument to a function
  • Changing the data type of a function parameter or return value
  • Adding a new peer dependency to package.json

Changes

  • Only include source maps in the server when using "inspect", doesn't make sense otherwise to include them if they are never going to be used. 90% of the time people are debugging client code, not on the server.
  • Add source-map-loader plugin.
  • Update some dependencies that looked suspect.

How to Test-Drive This PR

_This is going to be trick to test, to do this we'll compare memory usage with the develop branch. Follow the below steps on both develop and fix-source-maps-2 branch recording your results for comparison at the end. _

  • npm checkout BRANCH_NAME
  • git clean -dxf && npm ci // Get a clean state and install dependencies.
  • cd packages/template-retail-react-app
  • npm start
  • Now open the "Activity Monitor" app and filter on "node". There should be 4 node apps, the largest (by memory is the one that we are interested in).
  • Once the application is running in your browser and the first build is complete, record the memory size of the above node application.
  • Now in your code editor save a template file (I used pages/home/index.jsx) about 10 times.
  • Again record the memory usage.
  • Once you have done this to both branches, compare the values your recorded (I recorded a 1.49 GB vs 1.05 GB with this fix).

Checklists

General

  • Changes are covered by test cases
  • CHANGELOG.md updated with a short description of changes (not required for documentation updates)

Accessibility Compliance

You must check off all items in one of the follow two lists:

  • There are no changes to UI

or...

Localization

  • Changes include a UI text update in the Retail React App (which requires translation)

@bendvc bendvc added the wip Work in process label Nov 10, 2023
@bendvc bendvc requested a review from a team as a code owner November 10, 2023 17:30
@bendvc bendvc added ready for review PR is ready to be reviewed and removed wip Work in process labels Nov 20, 2023
@bendvc bendvc changed the title @W-13942827 - Reduced Javascript Heap Ramp Up WIP @W-13942827 - Reduced Javascript Heap Size Ramp Up Nov 20, 2023
Copy link
Collaborator

@bfeister bfeister left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bendvc I did some googling around on this. Seeing some Next.js issues that suggest similar issues in their dev server. Their team merged a change that looks like it might be useful for us

https://github.com/vercel/next.js/pull/54988/files#diff-5f441142eb8ba9bcba515aa36fbc420a113307bd191e54a4e9e6c43529fc2613R1740

Specifically, the strategy of using a computed hash / SHA for an unchanging binary source like the problematic chakra-ui memory bloat seems like it could be useful for us as it gives a way to exercise fine-grained control over the memory allocation via that "addressable" ("hash-able" / "SHA-able") memory pointer so that we could overwrite it rather than just pile onto the memory leak

We have a distinct advantage over Next.js in that we know (in this case of the base implementation) our problematic libraries and could address them

@vcua-mobify
Copy link
Contributor

@bendvc Thanks for investigating and for putting this together. Your findings are clear and explain the cause of the memory leak to me.

in the Lets not include 3rd party lib source-maps! section, you say that '3rd party source-maps abide by the same "devtool" setting defined in our configs.' What is our default here? That we are not including any 3 party source maps unless the config in question (which config) is changed? Or are these also included when the --inspect flag is set?

@bendvc
Copy link
Collaborator Author

bendvc commented Nov 21, 2023

@bendvc Thanks for investigating and for putting this together. Your findings are clear and explain the cause of the memory leak to me.

in the Lets not include 3rd party lib source-maps! section, you say that '3rd party source-maps abide by the same "devtool" setting defined in our configs.' What is our default here? That we are not including any 3 party source maps unless the config in question (which config) is changed? Or are these also included when the --inspect flag is set?

By default our source map config was set to eval-source-map. Thats not bad, but if we were to set it to "none" we wouldn't include source maps in our build as expected, but our dependencies wouldn't get the message per-say and they would include theirs anyway. So because we conditionally include source maps now (when developing and using the inspect) there is a change that we don't want to include source maps, and we want our dependencies not to include theirs either.

@wjhsf
Copy link
Contributor

wjhsf commented Nov 21, 2023

Hopefully this developer will be able to shed some light on the issue, but I believe that issues lies in Node itself.

Definitely. This crashes for me after 30 iterations:

const Module = require('module')
for (let i = 0; !void console.log(i); i += 1) {
    const code = '/'.repeat(2 ** 27) // Nearly max string length. Let's crash fast!
    const filename = '/fake/path/to/file.js'
    new Module(filename, undefined)._compile(code, filename)
}

This crashes for me after 30 iterations. Interestingly, if you move the code declaration outside the loop, it doesn't crash (at least, not after the first 1000 iterations). I guess the node internals must be re-using the same reference, somewhere...

❗ This minimal repro does not crash in node 20 after 1000 iterations!

@bendvc
Copy link
Collaborator Author

bendvc commented Nov 21, 2023

Hopefully this developer will be able to shed some light on the issue, but I believe that issues lies in Node itself.

Definitely. This crashes for me after 30 iterations:

const Module = require('module')
for (let i = 0; !void console.log(i); i += 1) {
    const code = '/'.repeat(2 ** 27) // Nearly max string length. Let's crash fast!
    const filename = '/fake/path/to/file.js'
    new Module(filename, undefined)._compile(code, filename)
}

This crashes for me after 30 iterations. Interestingly, if you move the code declaration outside the loop, it doesn't crash (at least, not after the first 1000 iterations). I guess the node internals must be re-using the same reference, somewhere...

❗ This minimal repro does not crash in node 20 after 1000 iterations!

Thanks for testing that on node 20. I was going to do that but only tested on 16 and 18 which we support. Adding node 20 support might actually be the easier thing to do to fix this issue.

@bendvc
Copy link
Collaborator Author

bendvc commented Nov 21, 2023

@bendvc I did some googling around on this. Seeing some Next.js issues that suggest similar issues in their dev server. Their team merged a change that looks like it might be useful for us

https://github.com/vercel/next.js/pull/54988/files#diff-5f441142eb8ba9bcba515aa36fbc420a113307bd191e54a4e9e6c43529fc2613R1740

Specifically, the strategy of using a computed hash / SHA for an unchanging binary source like the problematic chakra-ui memory bloat seems like it could be useful for us as it gives a way to exercise fine-grained control over the memory allocation via that "addressable" ("hash-able" / "SHA-able") memory pointer so that we could overwrite it rather than just pile onto the memory leak

We have a distinct advantage over Next.js in that we know (in this case of the base implementation) our problematic libraries and could address them

Thanks for pointing that out, I'll give that a go. Similar to the work I did in this PR that would probably be filed under the "it helps, but doesn't solve the problem". But if we can chunk those node modules it would drastically reduce the size of the compile code to the point that it would require many many builds to crash.

UPDATE:

I believe what you read in the next.js code is a little bit of a red herring. The hash/SHA stuff there are using there is simply a mechanism for keying the chunk on a unique key when the name cannot be found. Both results of the condition, using the name vs using the hash see a reproducible name for each compilation, so they aren't trying to solve any kind of memory pointer overwrite thing.

That being said, we don't chunk our server code, but if we did I suspect it would only result in N smaller duplicates vs Y large duplicates where Y = N/# of chunks as the issue is downstream from this point in the webpack execution. Unfortunately I don't believe that our current configuration/architecture will allow use to chunk the render code since it's expecting a single file (one that exports a function compatible with getServerRenderer)

bendvc and others added 5 commits November 21, 2023 09:26
Its probably the save that relocking the projects resulted in getting newer vendor packages that might be larger.
This was most likely a result of re-locking the projects and creep based on vendor changes.
@bendvc
Copy link
Collaborator Author

bendvc commented Nov 21, 2023

Hopefully this developer will be able to shed some light on the issue, but I believe that issues lies in Node itself.

Definitely. This crashes for me after 30 iterations:

const Module = require('module')
for (let i = 0; !void console.log(i); i += 1) {
    const code = '/'.repeat(2 ** 27) // Nearly max string length. Let's crash fast!
    const filename = '/fake/path/to/file.js'
    new Module(filename, undefined)._compile(code, filename)
}

This crashes for me after 30 iterations. Interestingly, if you move the code declaration outside the loop, it doesn't crash (at least, not after the first 1000 iterations). I guess the node internals must be re-using the same reference, somewhere...
❗ This minimal repro does not crash in node 20 after 1000 iterations!

Thanks for testing that on node 20. I was going to do that but only tested on 16 and 18 which we support. Adding node 20 support might actually be the easier thing to do to fix this issue.

@bfeister @wjhsf With regards to our previous conversation, I generated a project and ran it with node 20 only to find out that the issue is not magically fixed 😢. I suspect that there is a closure type memory leak that is located somewhere not in Node, but potentially in the require-from-string or webpack-hot-server-middleware. In the heap snapshot I took after multiple compiles the same pesky code compilation string is repeated multiple times. In fact it's now even larger for the project installed and run using node 20 and npm 10 @ roughly 100mb.

bfeister
bfeister previously approved these changes Nov 21, 2023
packages/pwa-kit-dev/CHANGELOG.md Outdated Show resolved Hide resolved
@@ -33,7 +33,7 @@
"@loadable/babel-plugin": "^5.15.3",
"aws-sdk": "^2.1354.0",
"aws-serverless-express": "3.4.0",
"cosmiconfig": "^8.1.3",
"cosmiconfig": "8.1.3",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why pin to an exact version?

packages/pwa-kit-dev/src/configs/webpack/config.js Outdated Show resolved Hide resolved
packages/pwa-kit-dev/bin/pwa-kit-dev.js Outdated Show resolved Hide resolved
@bfeister bfeister self-requested a review November 21, 2023 21:50
bfeister
bfeister previously approved these changes Nov 21, 2023
wjhsf
wjhsf previously approved these changes Nov 21, 2023
@bendvc bendvc dismissed stale reviews from wjhsf and bfeister via 31e7237 November 22, 2023 22:59
@@ -536,7 +544,7 @@ const requestProcessor =
libraryTarget: 'commonjs2'
},
// use eval-source-map for server-side debugging
devtool: mode === development ? 'eval-source-map' : false,
devtool: mode === development && INSPECT ? 'eval-source-map' : false,
Copy link
Collaborator

@alexvuong alexvuong Nov 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, would this prevent source-map on browser? when we are not starting the app with inspect mode?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The request processor is something that runs on the "server"... well at least on the browser. So if you wanted to debug it, you'd be adding the --inspect flag anyway.

@bfeister bfeister self-requested a review November 22, 2023 23:30
@bendvc bendvc merged commit 23705b4 into develop Nov 22, 2023
20 checks passed
@wjhsf wjhsf deleted the fix-source-maps-2 branch November 30, 2023 21:53
@wjhsf wjhsf mentioned this pull request Dec 8, 2023
12 tasks
@kriskw1999
Copy link

kriskw1999 commented Feb 6, 2024

Hi, as pointed out here https://sfcc-unofficial.slack.com/archives/C02KUCHDEKH/p1707145352551769 I investigated a little bit too about the memory leak and maybe found something helpful.

Version: 2.8.1

This evening I reduced as much as possible the project outside the node modules (literally kept 1 page and some files needed to run the app like ssr.js).

Then I have run the inspector on the ssr and profiled the memory before and after some savings. As you can spot in the minimal use case the only issuer is the Lodash keeping alive the webpack strings allocated in the memory.

To proof this (just for testing) you can place this snippet of code, replacing doneHandler, inside node_modules/webpack-hot-server-middleware/src/index.js and you will find that the lodash dependency disappears. I suggest in some way to force the cleanup on hot reload by subscribing on the hmr hook in some way.

It seems that on each hot reload it instantiates a Lodash global reference.

    const doneHandler = (multiStats) => {
        
        // this prevents to loadash to attach to the instance and helps garbage collection
        if(Object.keys(global).includes('_')) {
            delete global._
        }

        error = false;
        
        const serverStats = findStats(multiStats, 'server')[0];
        // Server compilation errors need to be propagated to the client.
        if (serverStats.compilation.errors.length) {
            error = serverStats.compilation.errors[0];
            return;
        }

        let clientStatsJson = null;

        if (clientCompilers.length) {
            const clientStats = findStats(multiStats, 'client');
            clientStatsJson = clientStats.map(obj => obj.toJson());
            
            if (clientStatsJson.length === 1) {
                clientStatsJson = clientStatsJson[0];
            }
        }

        const filename = getFilename(serverStats, outputPath, options.chunkName);
        const buffer = outputFs.readFileSync(filename);
        const serverRendererOptions = Object.assign({
            clientStats: clientStatsJson,
            serverStats: serverStats.toJson()
        }, options.serverRendererOptions);
        try {
            serverRenderer = getServerRenderer(filename, buffer, serverRendererOptions);
        } catch (ex) {
            debug(ex);
            error = ex;
        }
    };

This is the screenshot before the change with some saves:
image

This is the screenshot after the change and cleanup of lodash (as you can see the huge webpack string seems garbage collected):
image

If you have any doubt or question feel free to ask!

@bendvc
Copy link
Collaborator Author

bendvc commented Feb 15, 2024

Hi, as pointed out here https://sfcc-unofficial.slack.com/archives/C02KUCHDEKH/p1707145352551769 I investigated a little bit too about the memory leak and maybe found something helpful.

Version: 2.8.1

This evening I reduced as much as possible the project outside the node modules (literally kept 1 page and some files needed to run the app like ssr.js).

Then I have run the inspector on the ssr and profiled the memory before and after some savings. As you can spot in the minimal use case the only issuer is the Lodash keeping alive the webpack strings allocated in the memory.

To proof this (just for testing) you can place this snippet of code, replacing doneHandler, inside node_modules/webpack-hot-server-middleware/src/index.js and you will find that the lodash dependency disappears. I suggest in some way to force the cleanup on hot reload by subscribing on the hmr hook in some way.

It seems that on each hot reload it instantiates a Lodash global reference.

    const doneHandler = (multiStats) => {
        
        // this prevents to loadash to attach to the instance and helps garbage collection
        if(Object.keys(global).includes('_')) {
            delete global._
        }

        error = false;
        
        const serverStats = findStats(multiStats, 'server')[0];
        // Server compilation errors need to be propagated to the client.
        if (serverStats.compilation.errors.length) {
            error = serverStats.compilation.errors[0];
            return;
        }

        let clientStatsJson = null;

        if (clientCompilers.length) {
            const clientStats = findStats(multiStats, 'client');
            clientStatsJson = clientStats.map(obj => obj.toJson());
            
            if (clientStatsJson.length === 1) {
                clientStatsJson = clientStatsJson[0];
            }
        }

        const filename = getFilename(serverStats, outputPath, options.chunkName);
        const buffer = outputFs.readFileSync(filename);
        const serverRendererOptions = Object.assign({
            clientStats: clientStatsJson,
            serverStats: serverStats.toJson()
        }, options.serverRendererOptions);
        try {
            serverRenderer = getServerRenderer(filename, buffer, serverRendererOptions);
        } catch (ex) {
            debug(ex);
            error = ex;
        }
    };

This is the screenshot before the change with some saves: image

This is the screenshot after the change and cleanup of lodash (as you can see the huge webpack string seems garbage collected): image

If you have any doubt or question feel free to ask!

Thanks for the help @kriskw1999. It seems as though loadash isn't the only lib hanging on to globals. So although cleaning up references to lodash would help the inevitable doom of crashing, it would only postpone it. Looking as some heap snapshots there are global references coming from other libs like our commerce-sdk-isomorphic (see screenshot below).

image

Also, you see many other duplicate strings which leads me to believe this isn't just a lodash issue and trying to fix that reference isn't fixing the root of the issue. I'm predicting there is something else wrong with webpack-hot-server-middleware that I have not pinpointed yet.

What do you think?

@kriskw1999
Copy link

kriskw1999 commented Feb 16, 2024

Hi @bendvc @joeluong-sfcc , thanks for your support!

I have shared with you a private repository (I wasn't sure how the policy works) with the pwa-kit project "almost" without the memory leak.
To clear completely the memory leak in that project you can fork the repo and:

  1. use node v20.11.1 (important)
  2. use chrome (have not tested on other browsers)
  3. add the snippet of code as defined above (restart the server after doing this)
        // this prevents to loadash to attach to the instance and helps garbage collection
        if(Object.keys(global).includes('_')) {
            delete global._
        }
  1. profile the memory by changing as many times you want the home/index.jsx page (adding new random letters for example)

If you take away the snippet you can see that the memory leak returns again.

I am stuck on the investigation but I can share with you what I found:

  • there seems to be an issue with the WeakMap in the commerce-sdk-isomorphic for some reason
  • the issue in that library can be found as the profiler suggests in node_modules/commerce-sdk-isomorphic/lib/index.esm.js and if you use a formatter for js (like https://beautifier.io/) and you search for the WeakMap you can find it. This is generated in some way by the builder or transpiler, but this is the point where I am stuck.
  • the interesting thing seems to be that the minimal project shared works well when adding more components and stuff unless you put the sdk. I hope this fork can be an interesting point of start to find where the issue comes from, at least we have a reproducible way where the memory leak does not manifest at all.
  • from here we should figure out why the weakmap of the sdk lib is continuing taking memory

For your last point I think that the crucial string to keep allocated once is the one that starts with "/* * Attention which is the whole project transpiled and allocated as string and in my case is almost 4% of memory for each string of this kind.

@bendvc
Copy link
Collaborator Author

bendvc commented Feb 16, 2024

@kriskw1999 Thanks for the summary.

I previously took the code you provided for a test drive, and I was able to able to see positive results. But as I mentioned previously and you mentioned in your last message there is another global that is causing issues via commerce-sdk-react and looking into it its coming from __core-js_shared__ object from core-js. Now if this is something that is being introduced via a babel preset or from a library that we use (there is multiple libs using it) I don't really know right now.

The problem is that using the method you provided, I simply couldn't delete that reference using delete. I believe it has something to do with how that value was attached to the global, again I can't be 100% sure. On tops of all that, I did see another global reference coming from graceful-fs....

As you can see its hard to put these fires out being that they are coming from all over the place that may or may not be under our control.

That being said, I'm not giving up, I attempted to "cache" the global and re-write it after compilation in order to have a single solution instead of this whack-a-mole situation. I should have more time comes up this weekend to look into it some more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready for review PR is ready to be reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants