Avoid traversing NTFS junction points in git clean -dfx
#1976
It seems to be not exactly rare on Windows to install NTFS junction points (the equivalent of "bind mounts" on Linux/Unix) in worktrees, e.g. to map some development tools into a subdirectory. In such a scenario, it is pretty horrible if `git clean -dfx` traverses into the mapped directory and starts to "clean up". Let's just not do that. Let's make sure before we traverse into a directory that it is not a mount point (or junction). This addresses git-for-windows#607 Signed-off-by: Johannes Schindelin <[email protected]>
Windows' equivalent to "bind mounts", NTFS junction points, can be unlinked without affecting the mount target. This is clearly what users expect to happen when they call `git clean -dfx` in a worktree that contains NTFS junction points: the junction should be removed, and the target directory of said junction should be left alone (unless it is inside the worktree). Signed-off-by: Johannes Schindelin <[email protected]>
We will use this in the next commit to implement an FSCache-aware version of is_mount_point(). Signed-off-by: Johannes Schindelin <[email protected]>
When FSCache is active, we can cache the reparse tag and use it directly to determine whether a path refers to an NTFS junction, without any additional, costly I/O. Note: this change only makes a difference with the next commit, which will make use of the FSCache in `git clean` (contingent on `core.fscache` set, of course). Signed-off-by: Johannes Schindelin <[email protected]>
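The idea of answering "is this a junction?" from cached data alone can be sketched as follows. This is a minimal, portable illustration and not the FSCache code: the struct `cached_entry` and the function `cached_is_junction()` are hypothetical names. The constants carry the values of `FILE_ATTRIBUTE_REPARSE_POINT`, `IO_REPARSE_TAG_MOUNT_POINT` and `IO_REPARSE_TAG_SYMLINK` from the Windows SDK, redefined under local names so that the sketch compiles anywhere.

```c
#include <stdint.h>

/* Values copied from the Windows SDK headers, redefined under local
 * names so that this sketch also compiles outside of Windows. */
#define SKETCH_ATTR_REPARSE_POINT 0x00000400u /* FILE_ATTRIBUTE_REPARSE_POINT */
#define SKETCH_TAG_MOUNT_POINT    0xA0000003u /* IO_REPARSE_TAG_MOUNT_POINT */
#define SKETCH_TAG_SYMLINK        0xA000000Cu /* IO_REPARSE_TAG_SYMLINK */

/* Hypothetical cache entry, filled once while enumerating a directory. */
struct cached_entry {
	uint32_t attributes;  /* file attributes from the directory listing */
	uint32_t reparse_tag; /* only meaningful for reparse points */
};

/* Returns 1 if the cached data says "junction", 0 otherwise; no I/O. */
static int cached_is_junction(const struct cached_entry *e)
{
	if (!(e->attributes & SKETCH_ATTR_REPARSE_POINT))
		return 0;
	return e->reparse_tag == SKETCH_TAG_MOUNT_POINT;
}
```

A directory symlink carries the reparse-point attribute too, but with the symlink tag, so the sketch reports it as "not a junction".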
The `git clean` command needs to enumerate plenty of files and directories, and can therefore benefit from the FSCache. Signed-off-by: Johannes Schindelin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gives me pause:
	/*
	 * we might have removed this as part of earlier
	 * recursive directory removal, so lstat() here could
	 * fail with ENOENT.
	 */
	if (lstat(abs_path.buf, &st))
		continue;
With FSCache enabled, if we have already cached the result, this lstat() will return success even if the file was removed "as part of earlier recursive directory removal." If I'm reading the code right, that means we could end up trying to delete the file multiple times. This happens after interactive_main_loop(), so I don't think we would prompt the user for already-deleted files, but it looks like it could generate warnings in remove_dirs().
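The concern can be reproduced with a toy cache. The sketch below is an assumption-laden illustration, not FSCache: `struct stat_cache` and `cached_lstat()` are hypothetical, and the cache holds a single entry. It only shows why a cached answer can outlive the file it describes.

```c
#include <string.h>
#include <sys/stat.h>

/* Hypothetical one-entry cache; it stands in for FSCache only to
 * illustrate the concern and is not how FSCache is implemented. */
struct stat_cache {
	char path[256];
	struct stat st;
	int valid;
};

/* Like lstat(), but answers from the cache when the path matches. */
static int cached_lstat(struct stat_cache *c, const char *path,
			struct stat *st)
{
	if (c->valid && !strcmp(c->path, path)) {
		*st = c->st; /* may be stale: no I/O happens here */
		return 0;
	}
	if (lstat(path, st))
		return -1;
	if (strlen(path) < sizeof(c->path)) {
		strcpy(c->path, path);
		c->st = *st;
		c->valid = 1;
	}
	return 0;
}
```

If a file is looked up once, then unlinked, a second `cached_lstat()` on the same path still returns 0, whereas a plain `lstat()` fails with ENOENT. That is the situation in which the check above would no longer skip already-removed entries.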
901b11b to e004e1b
Hmm. That's a good point. We use Let me think whether we can avoid that somehow.
}
current_dev = st.st_dev;

/* Now look at the current directoru */
/* Now look at the current directoru */
/* Now look at the parent directory */
I think it really is the current directory, not the parent directory, that we're looking at... but of course, it is a directory, not a directoru... 😄
The code obscures this a bit, but I am positive that this is the parent directory. We append '/.' to path above, then we append '.' here, which makes '/..'.
If the '/.' above is not needed, I suggest removing it. Maybe a 'dirname' method (which strips the tail component from a path which must not end on '/..') would make this more readable.
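The path manipulation described here (append "/.", then one more "." to reach "/..") can be sketched in plain POSIX C. This is a hypothetical stand-alone helper, not the code from the patch: `looks_like_mount_point()` and its fixed-size buffer are assumptions made for illustration, and the device comparison is the classic heuristic, which does not notice a bind mount that stays on the same file system.

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper, not the code from this patch: returns 1 if `dir`
 * looks like a mount point, 0 if not, and -1 if either lstat() fails.
 */
static int looks_like_mount_point(const char *dir)
{
	char buf[4096];
	struct stat cur, parent;
	int len;

	/* "<dir>/." resolves to the directory itself. */
	len = snprintf(buf, sizeof(buf), "%s/.", dir);
	if (len < 0 || (size_t)len + 1 >= sizeof(buf))
		return -1;
	if (lstat(buf, &cur))
		return -1;

	/* Appending one more '.' turns "/." into "/..": the parent. */
	buf[len] = '.';
	buf[len + 1] = '\0';
	if (lstat(buf, &parent))
		return -1;

	/* A different device means a file system boundary; an identical
	 * inode means `dir` is its own parent, i.e. the root directory. */
	return cur.st_dev != parent.st_dev || cur.st_ino == parent.st_ino;
}
```

Keeping the two lookups visibly separate ("/." for the directory, "/.." for its parent) is one way to avoid the ambiguity discussed in this thread. Returning -1 on failure leaves it to the caller to decide whether an inaccessible directory should be treated as a possible mount point.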
strbuf_addstr(path, "/.");
if (lstat(path->buf, &st)) {
	/*
	 * If we cannot access the current directory, we cannot say
If we cannot access the current directory and return false here, will we then proceed to clean files in this directory even though it might actually be a mount point? If so, maybe we should err on the side of caution and check is_definitely_not_a_mount_point instead.
This is for Linux mount points, right? Let's ignore this as a corner case.
@drizzd since you offered to help... After @benpeart 's comments, I think the only sensible way forward is to teach FSCache a way to invalidate entries, and to do exactly that in From my perspective, the trickiest thing here is to get the invalidation right in light of multi-threading. I.e. I would want to have placeholder entries for the invalidated entries that would trigger a re-read, but that would have to be done in a way that That's what is holding up this Pull Request.
Co-Authored-By: dscho <[email protected]>
I did some digging regarding the lstat check. I think it is redundant today. The check and the comment were added in d871c86 (contained in Git 1.5.4-rc0). At the time, the loop iterated directly over the directory entries found by read_directory. For example, it would loop over In 6b1db43 (contained in Git 2.13.2), a new method correct_untracked_entries was added which removes entries contained in other entries. For example, "a/b" is removed from
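To make the "removes entries contained in other entries" step concrete, here is a minimal sketch of that kind of pruning. It is not Git's correct_untracked_entries(): the function `drop_nested_paths()` is hypothetical, and it assumes the input is sorted bytewise and that directories carry a trailing slash, so that everything inside a directory directly follows it.

```c
#include <string.h>

/*
 * Hypothetical sketch, not Git's correct_untracked_entries(): given a
 * bytewise-sorted array of paths in which directories end in '/', drop
 * every path that lies inside the most recently kept directory and
 * return the new count.
 */
static size_t drop_nested_paths(const char **paths, size_t n)
{
	size_t kept = 0, i;

	for (i = 0; i < n; i++) {
		if (kept) {
			const char *prev = paths[kept - 1];
			size_t len = strlen(prev);

			/* Only a directory ("a/") can contain other entries. */
			if (len && prev[len - 1] == '/' &&
			    !strncmp(paths[i], prev, len))
				continue;
		}
		paths[kept++] = paths[i];
	}
	return kept;
}
```

With the list "a.txt", "a/", "a/b", "a/c/", "a/c/d", "b", only "a.txt", "a/" and "b" remain. After such pruning, removing "a/" recursively can no longer be followed by a separate attempt to remove "a/b", which is the case the lstat() check was guarding against.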
Just for the record, directory symlinks created with
I created a new PR, #2268, since I was unable to push to dscho/dont-clean-junctions. Maybe one of the prerequisites is not met. From https://help.github.com/en/articles/committing-changes-to-a-pull-request-branch-created-from-a-fork:
Given how much discussion symlinks have had, is there a particular place (e.g. the wiki page https://github.com/git-for-windows/git/wiki/Symbolic-Links, last updated a year ago) where folks should be directed, and which records the latest clarifications, implications, and internal implementations? For example, a collation of the various comments, now that we have a cleaner PR.
Closing in favor of #2268
This addresses #607.