Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(mdx-loader): cache mdx/remark compiler instances #4997

Merged
merged 16 commits into from
Mar 31, 2022

Conversation

phryneas
Copy link
Contributor

@phryneas phryneas commented Jun 17, 2021

Motivation

Fix #4978

With this change, a new mdx compiler will be created once per webpack config instead of once per config AND file. This can have a massive impact as for example in the case of https://github.com/phryneas/remark-typescript-tools where every instantiation of the compiler also meant an instantiation of the remark plugin and hence a new instance of the TypeScript compiler.

A very similar change has been made over in the mdx-loader in https://github.com/mdx-js/mdx/pull/1468/files and you can compare the introduced changes with the original compile implementation over at https://github.com/mdx-js/mdx/blob/907bd26d83287180c7155c0fb65671b7cb55c98d/packages/mdx/index.js#L44-L47

Have you read the Contributing Guidelines on pull requests?

Yes.

Test Plan

This brings memory consumption of the redux-toolkit docs with my plugin (without workarounds) down from about 6GB to 2.8GB, while the plugin with workarounds comes down to 2.4GB (as that shares the compiler instance between the server and client config, but I erred on the more secure side with this PR and kept those separate).

@phryneas phryneas requested review from lex111 and slorber as code owners June 17, 2021 22:06
@facebook-github-bot
Copy link
Contributor

Hi @phryneas!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@phryneas
Copy link
Contributor Author

phryneas commented Jun 17, 2021

Hmm. Seems like this would need @mdx-js/mdx to be bounced to expose createCompiler. Since the docusaurus repo doesn't even install on my operating system, I won't be able to do that. Could you please take over for that part?

I have 1.6.22 running here and that works well, so no major version bump necessary.

@facebook-github-bot facebook-github-bot added the CLA Signed Signed Facebook CLA label Jun 17, 2021
@facebook-github-bot
Copy link
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@Josh-Cena
Copy link
Collaborator

Josh-Cena commented Jun 18, 2021

I guess you should try fetching upstream first, since it seems your branch is quite old... In the lastest master I see @mdx-js/mdx already bumped to 1.6.21.

@phryneas
Copy link
Contributor Author

Damn, I could have sworn I checked out the docusaurus repo for the first time in my life yesterday!
But I must already have had a stale copy lying around. I'll take a look later.

# Conflicts:
#	packages/docusaurus-mdx-loader/src/index.js
Copy link
Collaborator

@slorber slorber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the PR

That looks like a nice idea overall, but I'm not sure it would work in the current form, as each cached compiler will capture the first filepath being compiled.

Maybe there's a way for a remark plugin to access current file path being processed, without passing it through options?

We can see the problem in practice by building the Docusaurus site:

3:45:39 PM: Error: Image ../assets/docusaurus-asset-example-banner.png used in docs/presets.md not found.
3:45:39 PM: at async Promise.all (index 0)

This image is not used in docs/preset.md, but in docs/guides/markdown-features/markdown-features-assets.mdx

...DEFAULT_OPTIONS.remarkPlugins,
[
transformImage,
{staticDir: reqOptions.staticDir, filePath: this.resourcePath},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using this.resourcePath here makes me think this PR is not good enough.

Our remark plugins need to be able to access current file path to check if relative image/asset links are valid etc.

Copy link
Collaborator

@slorber slorber Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently we may be able to access it thanks to the transform(node,vfile) and vfile.history[0]: https://github.com/vfile/vfile#vfilehistory

Will see how to refactor the PR with it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I just took another look at https://github.com/mdx-js/mdx/pull/1468/files and they completely removed the filePath: this.resourcePath from the options object and only used it as parameter in the second argument of compiler.process - so it seems to be the accepted standard way to use that instead of accessing filePath from the options object.

@slorber slorber changed the title (mdx-loader) only create mdx compiler once per webpack config fix(v2): mdx-loader perf: cache mdx/remark compiler instances Jun 18, 2021
@Josh-Cena Josh-Cena added the pr: performance This PR does not add a new behavior, but existing behaviors will be more memory- / time-efficient. label Oct 30, 2021
@Josh-Cena Josh-Cena changed the title fix(v2): mdx-loader perf: cache mdx/remark compiler instances perf(mdx-loader): cache mdx/remark compiler instances Nov 7, 2021
@Josh-Cena
Copy link
Collaborator

Josh-Cena commented Dec 18, 2021

@slorber Let's try to get this merged for the 2.0 release as well. I personally don't know much about Webpack / Remark file resolution and it seems you have a solution in mind?

@phryneas phryneas requested a review from Josh-Cena as a code owner December 18, 2021 10:22
@Josh-Cena Josh-Cena added this to the 2.0.0 milestone Dec 18, 2021
Copy link
Collaborator

@Josh-Cena Josh-Cena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, vfile fixes it!

@Josh-Cena Josh-Cena requested a review from slorber January 20, 2022 07:31
@Josh-Cena Josh-Cena added the status: awaiting review This PR is ready for review, will be merged after maintainers' approval label Jan 20, 2022
@netlify
Copy link

netlify bot commented Jan 20, 2022

[V2]

Built without sensitive environment variables

Name Link
🔨 Latest commit 31ae5f1
🔍 Latest deploy log https://app.netlify.com/sites/docusaurus-2/deploys/6245516d7e5055000966bf90
😎 Deploy Preview https://deploy-preview-4997--docusaurus-2.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@github-actions
Copy link

github-actions bot commented Jan 20, 2022

⚡️ Lighthouse report for the changes in this PR:

Category Score
🟠 Performance 62
🟢 Accessibility 100
🟢 Best practices 100
🟢 SEO 100
🟢 PWA 90

Lighthouse ran on https://deploy-preview-4997--docusaurus-2.netlify.app/

Copy link
Collaborator

@slorber slorber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks fine but this leads to a ton of new warnings in our case:
https://app.netlify.com/sites/docusaurus-2/deploys/61e9239e50d8f50008ce297d

image

Those are not brand new warnings apparently, also in prod:

image

It just became more visible now since we create like 16 MDX compilers due to all of our plugin instances.

I think we should try to fix these warnings first before merging this PR.

It might be warnings that only us have, due to some docs 🤷‍♂️ but let's figure this out

@Josh-Cena
Copy link
Collaborator

Mmm, true. Last time I inspected I didn't find anything useful about where we were missing these imports. Probably one in the deployment doc.

I'm actually surprised that this creates more warnings than before. I guess it's because every call to createCompiler creates a loader while calls to mdx are cached?

@Josh-Cena
Copy link
Collaborator

Okay, this problem is once again caused by the MDX infrastructure not present during blog feed generation 😅 Should be fixed by #6454

@slorber
Copy link
Collaborator

slorber commented Jan 26, 2022

Okay, this problem is once again caused by the MDX infrastructure not present during blog feed generation 😅 Should be fixed by #6454

Nop, there were some missing Tab/TabItem imports that I fixed

Also we have an issue with the npm2yarn plugin, not sure why we only see this now. The warning could as well be related to upgrading MDX and we didn't notice.

As it's very verbose I'd rather get this fixed before we merge the PR

@Josh-Cena
Copy link
Collaborator

Also we have an issue with the npm2yarn plugin, not sure why we only see this now. The warning could as well be related to upgrading MDX and we didn't notice.

I suppose you are talking about the npm2yarn plugin automatically importing Tabs if they are not present? That seems to be the case for both files you fixed?

@slorber
Copy link
Collaborator

slorber commented Feb 2, 2022

I suppose you are talking about the npm2yarn plugin automatically importing Tabs if they are not present? That seems to be the case for both files you fixed?

I didn't have time to investigate much, I just know that removing the remark plugin removes the warnings

@Josh-Cena
Copy link
Collaborator

Took a look and found the issue. It's because the npm2yarn plugin wrongly put the state outside the transformer, so it would share the same state across different MDX files, and cause later files to not have Tabs automatically imported.

@Josh-Cena
Copy link
Collaborator

Josh-Cena commented Mar 31, 2022

Going to merge now! The behavior seems good. I have made sure we don't have stateful logic outside the transformer anywhere else, so caching shouldn't affect most users.

@Josh-Cena Josh-Cena merged commit 949a72e into facebook:main Mar 31, 2022
@Josh-Cena Josh-Cena removed the status: awaiting review This PR is ready for review, will be merged after maintainers' approval label Apr 1, 2022
@slorber
Copy link
Collaborator

slorber commented Apr 6, 2022

Thanks 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed Signed Facebook CLA pr: performance This PR does not add a new behavior, but existing behaviors will be more memory- / time-efficient.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Docusaurus 2 memory leak with custom remark plugin.
4 participants