-
-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[jest-haste-map] reduce the number of lstat calls in node crawler #9514
Conversation
Hi wonnage! Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file.In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. If you have received this in error or have any questions, please contact us at [email protected]. Thanks! |
Hmm, it looks like this option only exists in node > 10 - will add a check for that. |
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome, thank you!
@wonnage if you missed it - this is failing unit tests, on both old and new versions of node, with the same error. Probably just an error in the tests, but still 🙂 On that note, it would be nice to add some more unit tests, maybe spying on |
Codecov Report
@@ Coverage Diff @@
## master #9514 +/- ##
==========================================
+ Coverage 65.01% 65.04% +0.02%
==========================================
Files 283 283
Lines 12148 12155 +7
Branches 3009 3013 +4
==========================================
+ Hits 7898 7906 +8
Misses 3607 3607
+ Partials 643 642 -1
Continue to review full report at Codecov.
|
setTimeout( | ||
() => | ||
callback(null, [ | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case you're wondering - I mirrored the existing mocks to look like a fake fs.Dirent object here. I thought about coming up with something fancy to avoid this duplication (e.g defining the mocks in a map and then reduce
ing it to get strings/dirents as needed) but it seemed more complicated than it was worth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, let's try to keep it simple here
{ | ||
isDirectory: () => false, | ||
isSymbolicLink: () => true, | ||
name: 'symlink', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also noticed that there weren't any existing tests for symlinks that I could find, so I added one here. It doesn't change the result of any existing tests (since we ignore symlinks) but will affect the lstat call-counting tests I added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very nice, thank you! we'll hopefully be landing some symlink support soonish, so these tests should come in handy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM, I like it! Would love some other reviewers to take a look before merging, especially @cpojer
This reverts commit 758c76a.
Looks good, although I'm curious about the actual speed gains. Fwiw, if you rely on this speed up, I recommend installing watchmen (if you aren't on Windows). |
@cpojer yeah, we're doing some weird stuff with running metro in a OSX docker container, which seems to have super slow file IO. We can ignore large swaths of our monorepo to speed this up, but watchman doesn't support wildcard ignore_dirs, and we don't want to manually enumerate all of them. So it winds up taking forever crawling and hashing everything (since metro wants sha1s). |
Historically haste map performance in docker has been really bad. It’s quite possible that changing strategy may lead to significant wins. |
Also, any chance you could share before and after benchmarks for this change? |
@cpojer I don't have anything scientific, but it was previously taking ~90s to scan our repo and this change cut it down to 30s. We're watching roughly 500k files (mostly in node_modules). |
Nice!! |
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
// This logic is unnecessary for node > v10.10, but leaving it in | ||
// since we need it for backwards-compatibility still. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yepitschunked we're about to drop node 10 (#12220), which part of the logic does this refer to? The entire stat
or just specific parts of it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this PR is old, but would love your comment over in that PR 🙂
Summary
fs.readdir
has a withFileTypes option that will tell us if the file is a symlink or directory. Therefore, there's no need to make and wait for an lstat call before taking the appropriate actions for those file types (skip and recurse, respectively).This drastically improves the initial crawl time when using the node watcher.
Test plan
Applied changes and ran it against my project. The activeCalls counter is a little tricky, but I'm fairly certain I've handled it correctly here since I'd expect an infinite loop if I got it wrong.