Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Git Tab Completion Slow on Larger repositories for certain commands #1533

Closed
1 task done
nwhansen opened this issue Mar 2, 2018 · 15 comments
Closed
1 task done

Git Tab Completion Slow on Larger repositories for certain commands #1533

nwhansen opened this issue Mar 2, 2018 · 15 comments
Milestone

Comments

@nwhansen
Copy link

nwhansen commented Mar 2, 2018

  • I was not able to find an open or closed issue matching what I'm seeing

Setup

  • Which version of Git for Windows are you using? Is it 32-bit or 64-bit?
64 bit installer
git version 2.16.1.windows.4
cpu: x86_64
built from commit: ef6d451bbfef86a529ebf12620289e0f15a93d8e
sizeof-long: 4

  • Which version of Windows are you running? Vista, 7, 8, 10? Is it 32-bit or 64-bit?
    Windows 10
Microsoft Windows [Version 10.0.15063]
(c) 2017 Microsoft Corporation. All rights reserved.
  • What options did you set as part of the installation? Or did you choose the
    defaults?
Editor Option: VIM
Path Option: Cmd
SSH Option: OpenSSH
CURL Option: OpenSSL
CRLF Option: CRLFCommitAsIs
Bash Terminal Option: MinTTY
Performance Tweaks FSCache: Enabled
Use Credential Manager: Enabled
Enable Symlinks: Disabled
  • Any other interesting things about your environment that might be related
    to the issue you're seeing?

This repository is very large. 5gb in size however this is replicatable with linux kernal repository. I am not using a solid state for the repository.

Details

  • Which terminal/shell are you running Git from? e.g Bash/CMD/PowerShell/other

Git Bash

git status -- ar[TAB]

Or

git mv ar[TAB]
  • What did you expect to occur after running these commands?
    Tab completed to
git status -- arch/
  • What actually happened instead?
    After approximately 10-20 seconds it completes

  • If the problem was occurring with a specific repository, can you provide the
    URL to that repository to help us with testing?
    Reproduced using: https://github.com/torvalds/linux


h3. Additional investigation:
Enabling

GIT_TRACE_PERFORMANCE=<logfile>

Creates the following log entries

13:59:59.608590 trace.c:417             performance: 0.002119112 s: git command: 'C:\Program Files\Git\mingw64\bin\git.exe' 'rev-parse' '--git-dir' '--is-inside-git-dir' '--is-bare-repository' '--is-inside-work-tree' '--short' 'HEAD'
14:00:16.755165 trace.c:417             performance: 0.001872594 s: git command: 'C:\Program Files\Git\mingw64\bin\git.exe' 'config' 'status.showUntrackedFiles'
14:00:28.706112 trace.c:417             performance: 11.796146578 s: git command: 'C:\Program Files\Git\mingw64\bin\git.exe' '-C' '.' 'ls-files' '--exclude-standard' '--cached' '--directory' '--no-empty-directory' '--others'

Running the command

"C:\Program Files\Git\mingw64\bin\git.exe" -C ./ ls-files --exclude-standard --cached --directory --no-empty-directory --others

Returns 62913 lines. Attached as ls-files-result.txt
ls-files-result.txt

Git status on freshly checkout respository is the file checkoutStatus.txt
checkoutStatus.txt

Some further investigation shows that running set -x to gain some insight into the completion script shows that we spend a lot of time opening the completion script with

++case "$file" in
++echo drivers
++read -r file

being repeated what seems thousands of times.

@drizzd
Copy link

drizzd commented Mar 10, 2018

I can confirm your observation on Windows 10 64-bit with Msys2 mingw64 git version 2.15.1.windows.2:

~/src/linux$ COMP_WORDS=(git status -- ar) COMP_CWORD=3; time _git

real    0m11.876s
user    0m4.685s
sys     0m6.808s

Most of the time is spent in the while read loop in __git_index_files. We can improve much by replacing the loop with an sed script:

~/src/linux$ COMP_WORDS=(git status -- ar) COMP_CWORD=3; time
_git

real    0m1.372s
user    0m0.263s
sys     0m0.167s

Here's the change:

diff --git a/.git-completion.bash b/.git-completion-mod.bash
index a54cc2c..f20945f 100644
--- a/.git-completion.bash
+++ b/.git-completion-mod.bash
@@ -351,12 +351,7 @@ __git_index_files ()
        local root="${2-.}" file

        __git_ls_files_helper "$root" "$1" |
-       while read -r file; do
-               case "$file" in
-               ?*/*) echo "${file%%/*}" ;;
-               *) echo "$file" ;;
-               esac
-       done | sort | uniq
+       sed -e '/^\//! s#/.*##' | sort | uniq
 }

 # Lists branches from the local repository.

I cannot reproduce the issue in WSL. Maybe the mingw64 version of bash has a performance issue with while read loops.

@drizzd
Copy link

drizzd commented Mar 11, 2018

I have prepared a branch with the changes here https://github.com/drizzd/git/tree/completion-optimize-ls-files-filter. Still need to setup an email client to contribute this to the git mailing list.

@dscho
Copy link
Member

dscho commented Mar 13, 2018

@drizzd if you want, I can contribute this for you. Or you can follow @derrickstolee's excellent guide and set up git send-email: https://github.com/git-for-windows/git/blob/master/CONTRIBUTING.md#submit-your-patch

@dscho
Copy link
Member

dscho commented Mar 13, 2018

Maybe the mingw64 version of bash has a performance issue with while read loops.

FWIW this is very likely...

@nwhansen
Copy link
Author

nwhansen commented Apr 3, 2018

I modified the git completion script to include that change and the performance difference is quite significant. I will continue to test out the change and and report further findings.

@drizzd
Copy link

drizzd commented Apr 4, 2018

Thanks. Please note that the patch has been discussed on the mailing list: https://public-inbox.org/git/[email protected]/

The patch removes a sorting operation, which is wrong. I have a prepared a patch with the fix here:
https://github.com/drizzd/git/commits/completion-optimize-ls-files-filter4.

@dscho
Copy link
Member

dscho commented Apr 4, 2018

@drizzd do you think the updated patch is ready to be cherry-picked into Git for Windows?

@nwhansen
Copy link
Author

nwhansen commented Apr 4, 2018

My apologies I am not currently following that mailing list. I appreciate the effort and time looking into this bug report :)

@dscho
Copy link
Member

dscho commented Apr 4, 2018

My apologies I am not currently following that mailing list.

Only start following it if you like drinking from a fire hose...

@drizzd
Copy link

drizzd commented Apr 4, 2018

@dscho It's working fine for me on Windows. But who knows if there are more comments on the list. I can make a note for you here when the patch is accepted.

gitster pushed a commit to git/git that referenced this issue Apr 10, 2018
From the output of ls-files, we remove all but the leftmost path
component and then we eliminate duplicates. We do this in a while loop,
which is a performance bottleneck when the number of iterations is large
(e.g. for 60000 files in linux.git).

$ COMP_WORDS=(git status -- ar) COMP_CWORD=3; time _git

real    0m11.876s
user    0m4.685s
sys     0m6.808s

Replacing the loop with the cut command improves performance
significantly:

$ COMP_WORDS=(git status -- ar) COMP_CWORD=3; time _git

real    0m1.372s
user    0m0.263s
sys     0m0.167s

The measurements were done with Msys2 bash, which is used by Git for
Windows.

When filtering the ls-files output we take care not to touch absolute
paths. This is redundant, because ls-files will never output absolute
paths. Remove the unnecessary operations.

The issue was reported here:
git-for-windows#1533

Signed-off-by: Clemens Buchacher <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
@dscho dscho added this to the v2.17.0(2) milestone Apr 25, 2018
@dscho dscho closed this as completed in 8dd7bac Apr 25, 2018
@dscho
Copy link
Member

dscho commented Apr 25, 2018

A new snapshot is building right now and should become available in about half an hour: https://wingit.blob.core.windows.net/files/index.html

@dscho
Copy link
Member

dscho commented Apr 25, 2018

(Please test, the easiest way is to install the portable Git, start its git-bash.exe and verify that the tab completion is faster in your repository.)

dscho added a commit to git-for-windows/build-extra that referenced this issue Apr 25, 2018
Tab completion of `git status -- <partial-path>` [is now a lot
faster](git-for-windows/git#1533).

Signed-off-by: Johannes Schindelin <[email protected]>
@dscho
Copy link
Member

dscho commented Apr 28, 2018

Please test

Can y'all please test?

@nwhansen
Copy link
Author

It is working so far. The performance difference is outstanding, the native commands are faster but it is much harder to tell the difference. and they are well within a second of each other.

The double tab list is in the same order so I don't know if there is anything else I can really test. Thanks so much for getting a fix for this @drizzd @dscho

@dscho
Copy link
Member

dscho commented Apr 29, 2018

The performance difference is outstanding

That is really what I was hoping.

The double tab list is in the same order

Very good. Thank you for verifying, this is really crucial information to have confidence in this improvement.

dscho pushed a commit that referenced this issue May 29, 2018
From the output of ls-files, we remove all but the leftmost path
component and then we eliminate duplicates. We do this in a while loop,
which is a performance bottleneck when the number of iterations is large
(e.g. for 60000 files in linux.git).

$ COMP_WORDS=(git status -- ar) COMP_CWORD=3; time _git

real    0m11.876s
user    0m4.685s
sys     0m6.808s

Replacing the loop with the cut command improves performance
significantly:

$ COMP_WORDS=(git status -- ar) COMP_CWORD=3; time _git

real    0m1.372s
user    0m0.263s
sys     0m0.167s

The measurements were done with Msys2 bash, which is used by Git for
Windows.

When filtering the ls-files output we take care not to touch absolute
paths. This is redundant, because ls-files will never output absolute
paths. Remove the unnecessary operations.

The issue was reported here:
#1533

Signed-off-by: Clemens Buchacher <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
dscho added a commit that referenced this issue May 29, 2018
Shell completion (in contrib) that gives list of paths have been
optimized somewhat.

* cb/bash-completion-ls-files-processing:
  completion: improve ls-files filter performance

This fixes #1533

Signed-off-by: Johannes Schindelin <[email protected]>
dscho pushed a commit that referenced this issue May 29, 2018
From the output of ls-files, we remove all but the leftmost path
component and then we eliminate duplicates. We do this in a while loop,
which is a performance bottleneck when the number of iterations is large
(e.g. for 60000 files in linux.git).

$ COMP_WORDS=(git status -- ar) COMP_CWORD=3; time _git

real    0m11.876s
user    0m4.685s
sys     0m6.808s

Replacing the loop with the cut command improves performance
significantly:

$ COMP_WORDS=(git status -- ar) COMP_CWORD=3; time _git

real    0m1.372s
user    0m0.263s
sys     0m0.167s

The measurements were done with Msys2 bash, which is used by Git for
Windows.

When filtering the ls-files output we take care not to touch absolute
paths. This is redundant, because ls-files will never output absolute
paths. Remove the unnecessary operations.

The issue was reported here:
#1533

Signed-off-by: Clemens Buchacher <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
dscho added a commit that referenced this issue May 29, 2018
Shell completion (in contrib) that gives list of paths have been
optimized somewhat.

* cb/bash-completion-ls-files-processing:
  completion: improve ls-files filter performance

This fixes #1533

Signed-off-by: Johannes Schindelin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants