Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: git repository ownership issue #3012

Merged
merged 2 commits into from
Jun 20, 2022
Merged

Conversation

vvagaytsev
Copy link
Collaborator

What this PR does / why we need it:
This is an additional fix for #2905. The same issue happened with some sub-folder of the git repo in the Garden Cloud testing.

The suggested solution is to always consider any working directory in a git repository as a safe one.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

@vvagaytsev vvagaytsev requested review from thsig and Orzelius June 17, 2022 10:06
Orzelius
Orzelius previously approved these changes Jun 17, 2022
thsig
thsig previously requested changes Jun 17, 2022
Copy link
Collaborator

@thsig thsig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff!

While the logic itself looks fine, we're now performing two extra git CLI calls every time we call vcs.getFiles (which can happen once for each module when a command is first run).

How about reading the contents of the safe.directory block in the git config the first time this method is called, and initialising this.gitSafeDirs using the entries from there?

That should mean that after the path is added (e.g. the first time ensureSafeDirGitRepo is called on the path) we can almost always return semi-instantly from ensureSafeGitDirRepo without having to call git (and we should only have to read the git config once for each CLI invocation—we can track that with a flag, or via some other method).

We should also add some sort of async lock on the file, to make sure it's not being read/written concurrently.

I'm mostly mentioning this because we recently became aware that the time/resource cost of all these git CLI calls really adds up for large projects with lots of modules.

This still raises an error if garden is started outside a git repo
Also used sync lock to prevent concurrency issues with .gitconfig access
@vvagaytsev vvagaytsev force-pushed the fix/git-repo-ownership-issue branch from 34b6a52 to 59b4d21 Compare June 20, 2022 14:48
@eysi09 eysi09 dismissed thsig’s stale review June 20, 2022 18:42

Re-reviewed by me.

@eysi09 eysi09 merged commit 33ab184 into master Jun 20, 2022
@eysi09 eysi09 deleted the fix/git-repo-ownership-issue branch June 20, 2022 18:43
@@ -168,6 +197,8 @@ export class GitHandler extends VcsHandler {
}

await this.ensureSafeDirGitRepo(log, STATIC_DIR)
await this.ensureSafeDirGitRepo(log, path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'm late to the party here, but I should note that this will add up to hundreds of git calls for certain large projects. I wonder if this could be done more optimally, but maybe that could be included along with other optimizations.

Copy link
Collaborator Author

@vvagaytsev vvagaytsev Jun 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can do further optimizations later. Now we read all safe directories from the .gitconfig file, this will allow us to avoid extra git status calls for the directories which are explicitly marked as safe ones in the .gitconfig.

But git status will be executed for "normal" repos, i.e. the repos that do not have any ownership issues and should not be marked as safe ones.

We can simply cache the successful git status check results in memory, I did it in #3014

Of course, if there are any repo ownership changes done while a Garden CLI is running, then we might get the issue again. I'm not sure how realistic this scenario is, but restarting the Garden CLI will solve the issue.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more idea on optimizing this. We can keep the git safe directories in a singleton object. In such case, all GitHandler instances will share the same in-memory cache with known safe directories.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants