Simplify glob matching and directly match in-memory globs as Patterns. (#7402)

### Problem

#7299 aligned the behaviour of in-memory (eager) and on-disk (lazy) glob matching, but unfortunately caused a serious performance regression in detection of file owners.

The on-disk (lazy) implementation of glob matching in `GlobMatchingImplementation::expand` imposed roughly a 30x overhead when matching globs in memory, and we match in memory more often than I had realized.

### Solution

In two independently reviewable commits:
1. Simplify `GlobMatchingImplementation::expand`, based on the realization that the `PathGlob` caching we had been doing was a holdover from when we used to memoize them in the graph, and that the cache would never hit because of the unique trailing components of each `PathGlob`.
2. Introduce and use `PathGlobs::matches(Vec<PathBuf>) -> bool` to replace `MemFS::expand` with an eager operation. Expand `FilespecTest` to confirm that this redundant implementation does not go out of sync.
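
As an illustration of the second commit's idea, here is a minimal sketch of eager, in-memory glob matching. It is not the actual `PathGlobs::matches` implementation: the names `any_path_matches`, `path_matches`, and `component_matches` are hypothetical, paths are plain `&str` rather than `PathBuf`, only `*`, `?`, and `**` are handled, and "returns true if any path matches any glob" is an assumed reading of the `-> bool` signature.

```rust
/// Match one path component against one pattern component.
/// `*` matches any run of bytes (possibly empty); `?` matches exactly one byte.
fn component_matches(pattern: &[u8], name: &[u8]) -> bool {
    match (pattern.first().copied(), name.first().copied()) {
        (None, None) => true,
        (Some(b'*'), _) => {
            // Either `*` matches nothing, or it consumes one byte of `name`.
            component_matches(&pattern[1..], name)
                || (!name.is_empty() && component_matches(pattern, &name[1..]))
        }
        (Some(b'?'), Some(_)) => component_matches(&pattern[1..], &name[1..]),
        (Some(p), Some(n)) if p == n => component_matches(&pattern[1..], &name[1..]),
        _ => false,
    }
}

/// Match a path (split into components) against a glob (split into components).
/// A `**` component matches zero or more whole path components.
fn path_matches(pattern: &[&str], path: &[&str]) -> bool {
    match pattern.split_first() {
        None => path.is_empty(),
        Some((head, rest)) if *head == "**" => {
            // Either `**` matches nothing, or it swallows one path component.
            path_matches(rest, path)
                || (!path.is_empty() && path_matches(pattern, &path[1..]))
        }
        Some((head, rest)) => match path.split_first() {
            Some((name, tail)) => {
                component_matches(head.as_bytes(), name.as_bytes())
                    && path_matches(rest, tail)
            }
            None => false,
        },
    }
}

/// Hypothetical stand-in for an eager match: true if any of the
/// in-memory paths matches any of the globs. No filesystem walk,
/// no futures, no intermediate directory listings.
fn any_path_matches(globs: &[&str], paths: &[&str]) -> bool {
    paths.iter().any(|path| {
        let path_parts: Vec<&str> = path.split('/').collect();
        globs.iter().any(|glob| {
            let glob_parts: Vec<&str> = glob.split('/').collect();
            path_matches(&glob_parts, &path_parts)
        })
    })
}
```

Calling `any_path_matches(&["src/**/*.rs"], &["src/a/b/lib.rs"])` walks only the two component lists. The sketch is meant to show why matching a known set of paths directly can be much cheaper than expanding globs against a simulated filesystem.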

### Result

`GlobMatchingImplementation::expand` is ~3x faster (representing a 20% speedup for `BuildGraph` hydration in a large repository), and owners detection for a command like:
```
time ./pants --changed-diffspec='2c9c338cd7^..2c9c338' list
```
... runs at approximately the speed that it did before #7299.
Stu Hood authored Mar 20, 2019
1 parent cd8d75a commit 8d64796
Showing 6 changed files with 329 additions and 398 deletions.