Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize haste map tracking of deleted files with Watchman. #8056

Merged

Conversation

scotthovestadt
Copy link
Contributor

@scotthovestadt scotthovestadt commented Mar 6, 2019

Summary

This is a minor PR to improve the performance of tracking deleted files by taking advantage of Watchman when available.

Currently, deleted files are tracked within jest-haste-map by:

  1. Making a shallow copy of the Haste Map files before building the file map.
  2. Filtering the shallow copy against the generated file map to remove all files that still exist.

Benchmarking this operation against a large locally-generated test Haste Map of 300k~ files with one deletion, the operation currently takes about 150ms on my machine and grows linearly with more files tracked. Using Watchman makes it almost free and only grows with the number of files changed.

I've updated the non-Watchman implementation to also track deleted files within the crawler to keep the interface consistent, although that update is neutral on performance.

Test plan

  • Benchmarked the performance to ensure what looked like a performance gain was one in practice.
  • Tested manually with and without Watchman to ensure deleted files were being picked up as expected.
  • Added tests for tracking deleted files with Watchman crawler when fresh and when not fresh.
  • Added tests for tracking deleted file with Node crawler.
  • Updated all related tests.

@codecov-io
Copy link

codecov-io commented Mar 6, 2019

Codecov Report

Merging #8056 into master will decrease coverage by <.01%.
The diff coverage is 90.47%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8056      +/-   ##
==========================================
- Coverage   62.36%   62.35%   -0.01%     
==========================================
  Files         262      262              
  Lines       10312    10316       +4     
  Branches     2491     2493       +2     
==========================================
+ Hits         6431     6433       +2     
  Misses       3307     3307              
- Partials      574      576       +2
Impacted Files Coverage Δ
packages/jest-haste-map/src/crawlers/node.ts 80.76% <100%> (+0.5%) ⬆️
packages/jest-haste-map/src/index.ts 83.23% <100%> (-0.35%) ⬇️
packages/jest-haste-map/src/crawlers/watchman.ts 88% <83.33%> (-1.02%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3e32813...496bdae. Read the comment docs.

Copy link
Member

@cpojer cpojer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh I like this change a lot! Nice work. I have one question inline about fresh instances.

I’m not sure if “deprecatedFiles” makes sense. Could we rename them to “removedFiles” (or “missingFiles”?) in a separate PR? What do you think?

@SimenB: is this good for you too?

@@ -148,6 +158,8 @@ export = async function watchmanCrawl(
// files.
if (watchmanFileResults.isFresh) {
files = new Map();
deprecatedFiles = new Map(data.files);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If watchman crashes and gives us a fresh instance, we just start from scratch, right? If so, this seems like unnecessary bookkeeping. Could you explain the idea behind this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Watchman crashes, I'm just getting a full list of files. I need to also know what files were deleted because I have to pass them back for use here:
https://github.com/facebook/jest/blob/master/packages/jest-haste-map/src/index.ts#L642

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this code path (when Watchman is fresh), that map has all existing files removed as we iterate through the files, so it ends up being just a list of removed files and nothing else.

@scotthovestadt
Copy link
Contributor Author

@cpojer RE: Renaming from deprecatedFiles: totally agree. I think removedFiles would be the most clear. delete implies something that might not be true (maybe just moved somewhere else) and missing implies we want to find the file, IMO.

I'll submit a PR for it once this is merged in (feel free to provide any naming thoughts).

Copy link
Member

@SimenB SimenB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Agreed on removedFiles as a better name. Make sure to get the rename in before we release 24.2 to avoid a breaking change 😀

packages/jest-haste-map/src/index.ts Outdated Show resolved Hide resolved
@scotthovestadt
Copy link
Contributor Author

scotthovestadt commented Mar 6, 2019

@SimenB Do you want the rename in this PR or a separate one? Also, I don't think we have to worry about breaking changes here since it's all in internal methods and doesn't touch the public API.

@SimenB
Copy link
Member

SimenB commented Mar 6, 2019

Oh, it's not exposed from build? If so it doesn't matter

@scotthovestadt
Copy link
Contributor Author

@SimenB Rename didn't touch anything not already touched in this PR, pretty straightforward, so done now.

Feel free to merge it in when build passes (and you're ready for the merge-- not sure what your process is). I can't. :)

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Please note this issue tracker is not a help forum. We recommend using StackOverflow or our discord channel for questions.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 11, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants