Reduce memory and CPU use when scanning (#222)
Rework git metadata calculation to reduce peak memory and wall clock time when scanning. This includes many changes.

The net effect of all this is a typical 30% speedup and 50% memory reduction when scanning Git repositories; in pathological cases, up to a 5x speedup and 20x memory reduction.

- Git metadata graph:
  - Do not construct in-memory graph of all trees and blob names; instead, read tree objects from repo as late as possible
  - Use `SmallVec` to reduce heap fragmentation and small heap allocations
  - Use more suitable initial size for worklists and scratch buffers to reduce reallocations and heap fragmentation
  - Use the fastest / slimmest order for iterating object headers from a git repository when initially counting objects
  - Eliminate redundant intermediate data structures; remove unused fields from remaining intermediate data structures
  - Avoid temporary allocations when concatenating blob filenames
  - Fix a longstanding bug where a blob introduced multiple times within a single commit would have only a single arbitrary pathname reported
- `BStringTable`:
  - Change default initialization to create an empty table
  - Change `get_or_intern` to avoid heap-allocated temporaries when an entry already exists
- Scanning:
  - Use an `Arc<CommitMetadata>` instead of `CommitMetadata` and `Arc<PathBuf>` instead of `PathBuf` within git blob provenance entries (allows sharing; sometimes reduces memory use of these object types 10,000x)
  - Use a higher default level of parallelism: require 3GiB RAM instead of 4GiB per parallel job
  - Open Git repositories a single time instead of twice
- Fix clippy nits
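The Scanning change that swaps `CommitMetadata` for `Arc<CommitMetadata>` can be illustrated with a minimal sketch. This is not the code from this commit: the struct fields, the `BlobProvenance` type, and the `provenance_for_commit` helper are hypothetical names chosen for illustration. The only point shown is that many provenance entries can hold reference-counted pointers to one commit allocation instead of each owning a full copy.

```rust
use std::path::PathBuf;
use std::sync::Arc;

/// Hypothetical stand-in for per-commit metadata; the fields are illustrative only.
#[derive(Debug)]
pub struct CommitMetadata {
    pub commit_id: String,
    pub committer: String,
    pub message: String,
}

/// Hypothetical provenance entry: the commit that introduced a blob, and the blob's path.
/// Cloning an entry bumps two reference counts instead of copying strings or paths.
#[derive(Debug, Clone)]
pub struct BlobProvenance {
    pub commit: Arc<CommitMetadata>,
    pub path: Arc<PathBuf>,
}

/// Build one provenance entry per path, all sharing a single commit allocation.
pub fn provenance_for_commit(commit: CommitMetadata, paths: Vec<PathBuf>) -> Vec<BlobProvenance> {
    // Move the metadata into one shared, reference-counted allocation.
    let commit = Arc::new(commit);
    paths
        .into_iter()
        .map(|p| BlobProvenance {
            // Arc::clone copies the pointer and increments the count; the metadata is not copied.
            commit: Arc::clone(&commit),
            path: Arc::new(p),
        })
        .collect()
}
```

With owned `CommitMetadata` values, a commit that introduces N blobs would store N copies of its metadata; with `Arc`, it stores one copy plus N pointers. How much this saves in practice depends on the repository being scanned.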
1 parent 0518d80, commit cd6c187
Showing 12 changed files with 622 additions and 554 deletions.