Git extension improvements #3502

DefaultRyan
Member

Summary of the pull request

This brings several improvements to the Git extension:

  • Huge performance increase for fetching latest commit
  • Improved file/repo status
  • Improved thread safety

Detailed description of the pull request / Additional comments

For the perf increase, I learned that LibGit2Sharp.CommitEnumerator is not reusable, even though the object it wraps, libgit2's git_revwalk, is documented with the note that "for maximum performance, this revision walker should be reused for different walks". Measurements confirmed that most of the time was being spent in libgit2's prepare_walk(), where it traverses, sorts, and caches the commit graph in the git_revwalk. After this step, walking the commits in the revwalk, which is how we find a file's latest commit, proceeds quickly.

To work around this, we grab the whole commit graph up front and cache the list of LibGit2Sharp.Commit objects to walk later. When it's time to fetch this cached list, we first check whether HEAD has changed and, if it has, retrieve a new list of commits.

  • Fetching the entire commit history may appear wasteful, but this is offset by only needing to do it once, and it's far less complex to grab it in one go than to do it incrementally behind some thread locks. Additionally, while many folders in a repo consist of relatively "recent" files, a small sampling of repos shows that the root folder often contains files that are quite old: license, .gitignore, .gitattributes, and other configuration and boilerplate tend to change infrequently. Since these are in the root, and the root folder is often the starting point for browsing the file tree, we'd frequently have to fetch a significant portion of the commit history at the beginning, eroding the potential savings of an incremental approach.
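The caching strategy described above can be illustrated with a minimal, language-neutral sketch. This is not the extension's C# code: `CommitCache`, `read_head`, `load_commits`, and the `"paths"` field on each commit are hypothetical names chosen for the example, and a real implementation would compare trees to decide whether a commit touched a file.

```python
import threading
from typing import Callable, List, Optional


class CommitCache:
    """Caches the full commit list and rebuilds it only when HEAD moves."""

    def __init__(self, read_head: Callable[[], str],
                 load_commits: Callable[[], List[dict]]):
        # read_head and load_commits are hypothetical callbacks standing in
        # for the real repository queries.
        self._read_head = read_head
        self._load_commits = load_commits
        self._lock = threading.Lock()
        self._head: Optional[str] = None
        self._commits: List[dict] = []

    def commits(self) -> List[dict]:
        with self._lock:
            head = self._read_head()
            if head != self._head:
                # First call or HEAD changed: pay for the full walk once.
                self._commits = self._load_commits()
                self._head = head
            return self._commits

    def latest_commit_for(self, path: str) -> Optional[dict]:
        # Assumes commits are ordered newest first and that each commit
        # records the set of paths it touched (a simplification).
        for commit in self.commits():
            if path in commit["paths"]:
                return commit
        return None
```

Every lookup after the first reuses the cached list, so the expensive walk is repeated only after a checkout or a new commit moves HEAD.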

For improved file status, instead of calling ToString on LibGit2Sharp's enum values, we now simplify and turn these strings into more intuitive values, like "Modified", "Staged", "Untracked".
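As a rough illustration of this simplification, the sketch below maps the comma-separated string that a flags enum's ToString produces onto short labels. The raw names follow LibGit2Sharp's FileStatus enum, but the label set beyond "Modified", "Staged", and "Untracked", the priority order, and the function name `simplify_status` are assumptions for this example, not the extension's actual mapping.

```python
# Labels and their order are assumptions made for this sketch.
_RULES = [
    ("Conflicted", {"Conflicted"}),
    ("Staged", {"NewInIndex", "ModifiedInIndex", "DeletedFromIndex",
                "RenamedInIndex", "TypeChangeInIndex"}),
    ("Untracked", {"NewInWorkdir"}),
    ("Modified", {"ModifiedInWorkdir", "TypeChangeInWorkdir",
                  "RenamedInWorkdir"}),
    ("Missing", {"DeletedFromWorkdir"}),
    ("Ignored", {"Ignored"}),
]


def simplify_status(raw: str) -> str:
    """Map a raw 'A, B' style enum string to the short labels that apply."""
    flags = {part.strip() for part in raw.split(",") if part.strip()}
    labels = [label for label, names in _RULES if flags & names]
    # An unaltered file matches no rule and yields an empty string.
    return ", ".join(labels)
```

A file that is both staged and edited again in the working directory would come out as "Staged, Modified" under these assumed rules.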

For improved repo status, we now obtain the whole-repo status and condense it into a compact string representation like the one seen in other tools:
<branch name> <branch status> | <file counts>
Where:

  • <branch name> is either "Branch: <name>" or "Detached: <sha>"
  • <branch status>, if the branch is tracking a remote, shows one of
    • ↑ <ahead # commits>
    • ↓ <behind # commits>
    • ↓ <behind # commits> ↑ <ahead # commits>
    • (when up-to-date)
  • <file counts> is "+<#added> ~<#numstaged> -<#removed> | <#untracked> ~<#modified> -<#missing>"
    • If merge conflicts are present it appends "!<#conflicted>"
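The format above can be sketched as a small formatter. This is an illustration only: `RepoStatus` and `format_status` are hypothetical names, the 7-character SHA abbreviation and the space before the conflict marker are assumptions, and because the description does not show the up-to-date marker, the sketch emits nothing in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RepoStatus:
    branch: Optional[str]  # None when HEAD is detached
    sha: str
    tracking: bool = False
    ahead: int = 0
    behind: int = 0
    added: int = 0
    staged: int = 0
    removed: int = 0
    untracked: int = 0
    modified: int = 0
    missing: int = 0
    conflicted: int = 0


def format_status(s: RepoStatus) -> str:
    """Build '<branch name> <branch status> | <file counts>'."""
    parts = [f"Branch: {s.branch}" if s.branch else f"Detached: {s.sha[:7]}"]
    if s.tracking:
        arrows = []
        if s.behind:
            arrows.append(f"↓ {s.behind}")
        if s.ahead:
            arrows.append(f"↑ {s.ahead}")
        if arrows:
            parts.append(" ".join(arrows))
    counts = (f"+{s.added} ~{s.staged} -{s.removed} | "
              f"{s.untracked} ~{s.modified} -{s.missing}")
    if s.conflicted:
        # Appended only when merge conflicts are present.
        counts += f" !{s.conflicted}"
    return " ".join(parts) + " | " + counts
```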

To fix race conditions, I moved the LibGit2Sharp.Repository object into a new RepositoryWrapper class that locks access to the Repository object so that it is never used concurrently.
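The general idea of such a wrapper can be sketched as follows. This is a language-neutral illustration, not the extension's C# class: the `use` method and its callback style are assumptions made for the example, and only the name RepositoryWrapper comes from the description.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")


class RepositoryWrapper:
    """Owns a repository object and serializes every access to it."""

    def __init__(self, repo: object):
        self._repo = repo
        self._lock = threading.Lock()

    def use(self, action: Callable[[object], T]) -> T:
        # Callers never hold the repository directly; they pass a callback
        # that runs while the lock is held, so two threads cannot touch
        # the underlying object at the same time.
        with self._lock:
            return action(self._repo)
```

Because the lock here is not reentrant, a callback that calls `use` again on the same wrapper would deadlock; a reentrant lock would be the alternative if nested calls were needed.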

Validation steps performed

Built the extension and browsed the PowerToys repo, refreshing the view while checking out branches back and forth to cause contention on fetching status and commits. Not only does this run much faster than before, but the occasional hangs and crashes in the extension have disappeared.

PR checklist

  • Closes #xxx
  • Tests added/passed
  • Documentation updated

@DefaultRyan DefaultRyan merged commit a27d47a into feature/fileexplorer-sourcecontrol-integration Jul 26, 2024
4 checks passed
@DefaultRyan DefaultRyan deleted the user/defaultryan/git-improvements branch July 29, 2024 20:16
ssparach pushed a commit that referenced this pull request Jul 31, 2024
* WARP SPEED COMMIT HISTORY. Rooted out and worked around a major bottleneck inside LibGit2Sharp

* Better status

* Introduce RepositoryWrapper to lock access to Repository

* Update unit test to reflect new repo status

* Check cache before calling Repository.IsValid