-
Notifications
You must be signed in to change notification settings - Fork 780
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLI: Further improve startup performance #1476
Conversation
I'll try to test this branch out this weekend or early in the week, but JIC I don't get to it broccoli-typescript-compiler was a good OSS repo to test the speed out in. I think I made a comment over in @SparshithNR's PR with instructions for reproducing a pathological case... |
4 of 7 dependencies for the QUnit CLI are 'minimatch' and its three dependencies. A fairly popular alternative is micromatch's picomatch library, which is a drop-in replacement for what we need. Ref <http://npm.broofa.com/[email protected]>.
This follows-up 837af39, which introduced the ignore list to the `--watch` mode when `run.restart()` calls `findFiles()`. This was a big improvement for watch mode, but it still left two major issues behind: 1. It did not apply to when running qunit normally (without --watch). This could be easily fixed by passing IGNORED_GLOBS in run() down to getFilesFromArgs/findFiles the same way. 2. We are still scanning all other top-level directories that are not ignored, even if they do not match any part of the glob pattern. I investigated numerous matching libraries (picomatch, minimatch, etc.) but it seems none of of them offer a way to determine whether a given directory could contain a matching file. E.g. something that returns false for `src/` when given `test/*.js`, but returns true for `src/` if given `**/*.js`. So rather than approaching it from the angle of a matching library, I went looking for various "proper" glob libraries instead which handle both the pattern matching and the directory scanning responsibilities together. I considered: * isaacs/node-glob <http://npm.broofa.com/[email protected]>. 10 dependencies, including 3 for minimatch. Not an option. * mrmlnc/fast-glob <http://npm.broofa.com/[email protected]> This uses picomatch, which is promising as being a popular dependency-free alternative to minimatch. But, it unfortunately does add up to 16 dependencies in total. * Crafity/node-glob <http://npm.broofa.com/[email protected]> A mostly self-contained implementation with 2 dependencies ('async' and 'glob-to-regexp'). Unfortunately, it seems to not do a limited traversal but rather * terkelg/tiny-glob <http://npm.broofa.com/[email protected]> Another self-contained implementation, with two local dependencies by the same author. Claims to be much faster than both fast-glob and node-glob. I think we have a winner. Ref #1384.
@rwjblue Thanks, I forgot about that one.
Quite the gain :) |
4 of 7 dependencies for the QUnit CLI are
|
I just wanted to say thank you so much for this! the CLI is so much faster!!! |
} ); | ||
|
||
watchedFiles.forEach( file => delete require.cache[ path.resolve( file ) ] ); | ||
changedPendingPurge.forEach( file => delete require.cache[ path.resolve( file ) ] ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I think we also have to prune its parent modules reference. I know that isn't related to this PR, I'll try to make another issue and / or PR for it specifically.
When using a glob or directory argument, we currently recursively traverse the entire working directory (except for
.git
andnode_modules
) and then filter out results later with Minimatch. This is inefficient to say the least, and is especially noticable on large repositories.I looked into a bunch of glob libraries but couldn't find any that uphold our security and simplicity principles, in so far that most of them bring in far too many dependencies that I can't feasibly audit or keep up with and (unfortunately) most likely the library author hasn't audited or trusts deeply either to e.g. not get compromised or taken over in subtle ways.
Except for one,
tiny-glob
, this seems to fit the bill nicely and would be a great library to boost through the QUnit community. I instrumented it with some checks and confirmed that it isn't just very fast (which some globbers achieve by spawning processes rather than being efficient), but indeed it is also efficient in not traversing directories that it knows cannot match any part of the pattern. e.g.src/
will be entered for**/*.js
but will not be entered fortest/**/*.js
.@SparshithNR @rwjblue I tried this out on a few small code basis and an artificially bloated directory tree and it seems to work fine. I'd be great if you could give this a try and let me know if you find any issues and/or could measure improvements. Thanks!