cache.json is large due to containing multiples copies of the same data #345
Comments
Did some more discovery on this today and I realized what the crux of the issue is, for the file size concerns anyway. For every page that gets serialized and makes a GraphQL call (e.g. per ${page}-template.js), cache.js will create a JSON file of that query data. So even for something like
So when we
I recall now that, as part of #115, one of the reasons for this approach was that reading / writing "shared" .json files asynchronously resulted in unsafe concurrent writes to the same file at the same time, leading to corrupted JSON that would fail to get read / parsed correctly in serialize.js.
So I did try again to generate cache.json files per query / per top level route, and now we see exactly what we would expect in count and size.
// hash against the query instead
const md5 = crypto.createHash('md5').update(query).digest('hex');
Of course, this returns us to the problem of overlapping writes from time to time, even when adding something like this:
if (!fs.existsSync(targetFile)) {
  await fs.mkdirs(targetDir, { recursive: true });
  await fs.writeFile(path.join(targetFile), cache, 'utf8');
}
query from route /guides/netlify-cms
query hash 157d00b65a839ebbd1a72ae5a3884080
==================================
==================================
==================================
query from route /plugins/
query hash 157d00b65a839ebbd1a72ae5a3884080
==================================
==================================
query from route /guides/cloudflare-workers-deployment
query hash 157d00b65a839ebbd1a72ae5a3884080
==================================
==================================
query from route /plugins/index-hooks
query hash 157d00b65a839ebbd1a72ae5a3884080
==================================
SyntaxError: /Users/owenbuckley/Workspace/project-evergreen/repos/greenwood/public/docs/2bc8e256a25844b37c22af93673c67e3-cache.json: Unexpected token B in JSON at position 3343
I think, though, that since we won't really need deepmerge with this solution, it would be a fair tradeoff to add proper-lockfile instead, and then I think things will return to normal? (We can of course continue to optimize data fetching even further, but we can do that in other issues.)
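To illustrate the overlapping-write problem described above, here is a minimal sketch of one way to make the per-query write safe within a single Node process. It is not the proper-lockfile approach proposed in the comment; it swaps in two standard-library techniques instead: memoizing the in-flight write per query hash, and writing to a temporary file that is then renamed into place so a reader should never see a half-written cache.json. The names `hashQuery`, `writeCacheOnce`, and `pendingWrites` are hypothetical and do not come from the project.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// Hypothetical in-memory registry: query hash -> promise of the in-flight write.
const pendingWrites = new Map();

// Same hashing as in the comment above: identical queries give identical hashes,
// so routes that share a query also share one cache file.
function hashQuery(query) {
  return crypto.createHash('md5').update(query).digest('hex');
}

// Write the cache for a query at most once per process.
// Concurrent callers with the same query get the same promise back
// instead of starting a second write to the same file.
function writeCacheOnce(targetDir, query, cache) {
  const hash = hashQuery(query);

  if (!pendingWrites.has(hash)) {
    const targetFile = path.join(targetDir, `${hash}-cache.json`);
    const tmpFile = `${targetFile}.${process.pid}.tmp`;

    const write = fs.promises.mkdir(targetDir, { recursive: true })
      // Write the full contents to a temporary file first...
      .then(() => fs.promises.writeFile(tmpFile, cache, 'utf8'))
      // ...then move it into place in a single rename step.
      .then(() => fs.promises.rename(tmpFile, targetFile))
      .then(() => targetFile);

    pendingWrites.set(hash, write);
  }

  return pendingWrites.get(hash);
}
```

Unlike the `fs.existsSync` check, the decision to write here is made synchronously against the in-memory map, so two routes with the same query hash cannot both pass the check before either has written. This only covers a single process; writes from multiple processes would still need something like a lock file.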
So as far as the overfetching issue goes, I think the basic client logic is mostly right.
develop
In development mode,
build (serve)
When running the
Analysis
To compare, if we look to see what the
Which would be at most two renders, in theory:
So to unpack everything
Next Steps
Made some follow up issues
Type of Change
Summary
Moving the discussion from #317 (comment), but after the 0.5.0 release, there are some serious concerns with cache.json generation.
Docs page is transferring over 2.6MB of JSON data!
fetched (why!?)
This also means all index.html pages are huge too...
Details
Thoughts on quick wins here:
fetching by caching in the client. (But it would also be good to know why this is happening.)
For something like docs/cache.json, this could bring the size down to 35k!
We should also try and add a "budget" spec for this, to make sure the size can't explode again without something at least failing.
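As a rough idea of what the "budget" spec could check, here is a minimal sketch that walks a build output directory and reports every file whose name ends in cache.json and whose size exceeds a byte budget. The function names `findCacheFiles` and `findOverBudget` are hypothetical, and so is any specific threshold; the project would pick its own number and wire the check into its own test runner.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively collect every file under dir whose name ends with "cache.json".
function findCacheFiles(dir) {
  return fs.readdirSync(dir, { withFileTypes: true }).flatMap((entry) => {
    const fullPath = path.join(dir, entry.name);

    if (entry.isDirectory()) {
      return findCacheFiles(fullPath);
    }

    return entry.name.endsWith('cache.json') ? [fullPath] : [];
  });
}

// Return { file, bytes } for each cache file larger than maxBytes.
// An empty array means every cache file is within budget.
function findOverBudget(dir, maxBytes) {
  return findCacheFiles(dir)
    .map((file) => ({ file, bytes: fs.statSync(file).size }))
    .filter(({ bytes }) => bytes > maxBytes);
}
```

A spec could then assert that `findOverBudget(outputDir, budget)` returns an empty array after a build, so that a regression in cache.json size fails the test run instead of going unnoticed.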