feat(gatsby): add two experimental flags to control cache chunk size #29039
This adds two experimental environment variables through which we can control and/or get more insight into the chunk size used during cache persistence. It is an attempt to unblock people stuck on #23627 without sacrificing performance on a general scale.

If the default heuristic is causing you to hit the ``Assertion `length <= kMaxLength`` Buffer assertion error during build, then your site is probably very large and the heuristic is failing to pick an adequate chunk size when storing the cache. We cannot trap this error because it is a Node.js internal assertion, so hopefully you find your way to this PR or the linked issue.
The `GATSBY_EXPERIMENTAL_CACHE_FORCE_CHUNK_SIZE` flag sets the chunk size to the given positive integer value. This means the nodes array, which is the biggest piece of memory to be stored, is persisted in arbitrary chunks of this many nodes each.
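For example, running `GATSBY_EXPERIMENTAL_CACHE_FORCE_CHUNK_SIZE=10000 gatsby build` would store the nodes in slices of 10000 nodes. A minimal sketch of how a forced chunk size could be applied is below; the `chunkNodes` helper and `defaultChunkSize` fallback are hypothetical names, this is illustrative and not the actual Gatsby implementation:

```js
const v8 = require(`v8`)

// Illustrative sketch only, not the actual Gatsby code: slice the nodes array
// into chunks of the forced size (when the env var is set) and serialize each
// slice separately so no single Buffer grows past Node's kMaxLength limit.
function chunkNodes(nodes, defaultChunkSize) {
  const forced = parseInt(process.env.GATSBY_EXPERIMENTAL_CACHE_FORCE_CHUNK_SIZE, 10)
  const chunkSize = forced > 0 ? forced : defaultChunkSize
  const buffers = []
  for (let i = 0; i < nodes.length; i += chunkSize) {
    buffers.push(v8.serialize(nodes.slice(i, i + chunkSize)))
  }
  return buffers
}
```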
The `GATSBY_EXPERIMENTAL_CACHE_CHUNK_GUESS_STEP` flag does two things. Firstly, it allows you to control the heuristic by explicitly setting the step size used when guessing how big the largest node is. Measuring the size of a node is a matter of `v8.serialize(node).length` and is very expensive at scale. By default we only scan every `(nodes.length / step)`-th node, a fairly arbitrary sample. You can set the step all the way down to `1` for a completely accurate guess, at the price of build time.

Additionally, setting `GATSBY_EXPERIMENTAL_CACHE_CHUNK_GUESS_STEP` to any value will print the chunk size stats it gathers, so you can use that to take further action.
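For context, the sampling this step size controls might look roughly like the sketch below; the function name, parameters, and headroom factor are hypothetical and only meant to illustrate the trade-off between step size and accuracy:

```js
const v8 = require(`v8`)

// Illustrative sketch only: estimate the largest serialized node by measuring
// every `step`-th node, then derive how many nodes fit safely into one chunk.
// Assumes `nodes` is non-empty.
function guessChunkSize(nodes, step, maxChunkBytes) {
  let largest = 0
  for (let i = 0; i < nodes.length; i += step) {
    // This serialization is the expensive measurement mentioned above
    largest = Math.max(largest, v8.serialize(nodes[i]).length)
  }
  // Leave headroom in case an unsampled node is bigger than anything measured
  return Math.max(1, Math.floor(maxChunkBytes / (largest * 2)))
}

// With a step of 1 every node is measured: the guess is exact, but the scan
// costs as much as serializing the entire nodes array one node at a time.
```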