Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(gatsby-plugin-mdx): prevent dupe file writes and simplify Babel step #25808

Closed
wants to merge 2 commits into from

Conversation

pvdz
Copy link
Contributor

@pvdz pvdz commented Jul 17, 2020

The cacheScope function would try to write the same file over and over again which is 10x more expensive than caching whether or not that file had already been written (or even fs.existsSync, but caching is much cheaper). Still needs some tweaking. This doesn't change much, it's just 4s on the 64k benchmark, so 0.1%, although it might be better for your disk :)

I'm not a super fan of the Map for caching this, though, since this means having to juggle it from a central point somehow. Or putting it global, which might also not be desirable. Additionally, the map needs to do file path + contents, not just contents (I think... but perhaps that's actually not necessary since the digest of the contents dictates the path).

Dropping the babel step in favor of a regex scrub saves a little bit more. Still only like 30s on the 64k benchmark (~1%).

The idea for the scrubbing is that the inputs must be valid ES6 import statements, which will have a from keyword followed by a quoted string. This is an invariant in the syntax and so it should be safe to run this regex on it rather than turn on the whole Babel engine. Note that the input strings may contain multiple import statements, hence the g flag and non-greedy match.

Need to verify this more thoroughly so shelving it before I go on vacation.

(This'll be two separate PRs)

pvdz added 2 commits July 16, 2020 15:25
The cacheScope function would try to write the same file over and over again which is 10x more expensive than caching whether or not that file had already been written (or even fs.existsSync, but caching is much cheaper). Still needs some tweaking. This doesn't change much, it's just 4s on the 64k benchmark, so 0.1%, although it might be better for your disk :)

Dropping the babel step in favor of a regex scrub saves a little bit more. Still only like 30s on the 64k benchmark (~1%).

The idea for the scrubbing is that the inputs must be valid ES6 import statements, which will have a `from` keyword followed by a quoted string. This is an invariant in the syntax and so it should be safe to run this regex on it rather than turn on the whole Babel engine. Note that the input strings may contain multiple import statements, hence the `g` flag and non-greedy match.

Need to verify this more thoroughly so shelving it before I go on vacation.
@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Jul 17, 2020
@pvdz
Copy link
Contributor Author

pvdz commented Jul 17, 2020

@johno have a look, what do you think?

@pvdz pvdz added topic: MDX topic: performance Related to runtime & build performance and removed status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer labels Jul 17, 2020
@gatsby-cloud-staging
Copy link

Your pull request can be previewed in Gatsby Cloud: https://build-751556cc-4732-4718-b533-fd39d2f3fca8.staging-previews.gtsb.io

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: performance Related to runtime & build performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant