Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove RepoSpec manipulation of .git suffix #4985

Merged
merged 1 commit into from
Jan 20, 2023

Conversation

KnVerey
Copy link
Contributor

@KnVerey KnVerey commented Jan 18, 2023

TL;DR the .git suffix is not a special segment; it is a very, very common convention for the directory name in "bare" repos, but not one that is required: https://git-scm.com/book/en/v2/Git-on-the-Server-Getting-Git-on-a-Server.

By convention, bare repository directory names end with the suffix .git

After much investigation, I finally thought to just look in the git code itself, and I found this commit: git/git@b3256eb. I'm not familiar with C, but the code seems to be here, and that message seems pretty clear that the git client itself will try both repo and repo.git (among other things). Since we're shelling out to that client, it should automatically do the suffixing for us when needed.

Before I found this, I also did some manual tests, which are still summarized below. I want to be very confident in our understanding here, since if it is wrong, the change could be breaking for some users.

Closes #4866

Change summary

Category Before After
Github suffix forcibly added both formats accepted and valid
Bitbucket, Gitlab suffix forcibly added both formats accepted and valid
AWS and Azure suffix forcibly omitted both formats accepted, but suffix is invalid
Other suffix forcibly added both formats accepted, validity unknowable

Potential bugs fixed:

  • Arbitrary git servers that do not follow the .git directory naming convention can now be used with Kustomize, as we no longer force the (potentially invalid) suffix in that case.
  • Origin annotations record the clone spec directly, so .git incorrectly appeared in these annotations in some cases.

Potential regressions:

  • If someone previously included .git in an AWS or Azure URL, we will no longer automatically remove that invalid suffix (and neither will git--it only tries adding things). Realistically, I'm not sure why someone would have ever done this? Surely those services would not give out such invalid URLs.

Background

With the current code, the x.GitSuffix field is always added to all non-local RepoSpecs, whether or not the input contained this suffix, and then it is conditionally stripped for AWS and Azure specifically when we construct the clone spec. This behaviour comes from #612. So, it would seem we started out always adding the .git if absent for Github, which makes the URLs we use match what you'll get from the UI. The special casing was added to exempt these two services from that behaviour.

A potential problem I see with this behaviour is with hosting services unknown to Kustomize. Currently, if you have a custom host and do NOT follow the .git directory naming convention, there is no way for you to give Kustomize a URL that will work.

Moreover, this may be a case of Kustomize interfering with the user input when it could be falling back on what git itself supports. I first discovered that Github URLs work with or without the suffix but could not find documentation as to whether it was the git client doing this, a git option on the server side that many hosting services enable, or something bespoke many hosting services do. Eventually, I found the commit I referenced at the beginning, leading me to believe it is the client.

Test 1: bare repo with file protocol

Based on https://git-scm.com/book/en/v2/Git-on-the-Server-Getting-Git-on-a-Server.

  1. Clone the repo of your choice (e.g. git clone https://github.com/KnVerey/issue-forms-playground)
  2. Go to some other directory and create a bare clone using the dirname convention (e.g. git clone --bare ~/src/github.com/KnVerey/issue-forms-playground ~/scratch/issue-forms-playground.git). Important: they must not end up siblings or the test is invalid!
  3. Attempt to clone the bare repo using the file protocol and the real suffix (e.g. git clone --bare file://$HOME/scratch/issue-forms-playground.git some-test-clone): WORKS!
  4. Attempt to clone the bare repo using the file protocol, omitting the suffix (e.g. git clone --bare file://$HOME/scratch/issue-forms-playground test-without-suffix): ALSO WORKS! 🎉

Based on this, I already suspected it is the git client doing this handling.

Test 2: try various URLs with popular git servers

URL works with .git works without .git
https://github.com/KnVerey/k8s-workflow-utils(.git)
ssh://[email protected]/KnVerey/k8s-workflow-utils(.git)
[email protected]:KnVerey/k8s-workflow-utils(.git)
https://gitlab.com/gitlab-tests/sample-project(.git)
ssh://[email protected]/gitlab-tests/sample-project(.git)
[email protected]:gitlab-tests/sample-project(.git)
https://bitbucket.org/teamsinspace/documentation-tests(.git)
https://[email protected]/knverey/azure-test/_git/azure-test(.git) 𐄂

(I was not able to do a real test for AWS, so I'm assuming it behaves like Azure, as our code expects.)

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 18, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: KnVerey

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 18, 2023
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 18, 2023
@KnVerey KnVerey force-pushed the repospec-git-suffix branch from 7cbe735 to da75f45 Compare January 18, 2023 03:57
@KnVerey KnVerey added the release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. label Jan 18, 2023
@@ -171,13 +171,16 @@ func locRootPath(rootURL, repoDir string, root filesys.ConfirmedDir, fSys filesy
if err != nil {
log.Panicf("cannot find path from %q to child directory %q: %s", repo, root, err)
}
// the git-server-side directory name conventionally (but not universally) ends in .git, which
// is conventionally stripped from the client-side directory name used for the clone.
localRepoPath := strings.TrimSuffix(repoSpec.RepoPath, ".git")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@annasong20 lmk what you think of this adjustment to localize.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thank you!

@@ -815,7 +815,7 @@ buildMetadata: [originAnnotations]
metadata:
annotations:
config.kubernetes.io/origin: |
repo: https://github.com/kubernetes-sigs/kustomize.git
repo: https://github.com/kubernetes-sigs/kustomize
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@natasha41575 do you think this is acceptable? I think we at least need a release note. If it isn't acceptable, we could create a legacy method just for the origin annotations feature that retains the old behaviour (but this may be confusing if people are using it for debugging?).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not only acceptable but could even be considered a bug fix. The original URL didn't have a .git, so I think it's a good thing if we stop forcibly adding it to the annotations here!

@KnVerey KnVerey force-pushed the repospec-git-suffix branch from da75f45 to 7e000aa Compare January 18, 2023 04:05
@KnVerey KnVerey requested a review from natasha41575 January 18, 2023 04:23
Copy link
Contributor

@annasong20 annasong20 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/hold for @natasha41575's feedback
/lgtm The code looks so clean! My nits aren't mandatory.

@@ -171,13 +171,16 @@ func locRootPath(rootURL, repoDir string, root filesys.ConfirmedDir, fSys filesy
if err != nil {
log.Panicf("cannot find path from %q to child directory %q: %s", repo, root, err)
}
// the git-server-side directory name conventionally (but not universally) ends in .git, which
// is conventionally stripped from the client-side directory name used for the clone.
localRepoPath := strings.TrimSuffix(repoSpec.RepoPath, ".git")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thank you!

@@ -46,9 +46,6 @@ type RepoSpec struct {
// Branch or tag reference.
Ref string

// e.g. .git or empty in case of _git is present
GitSuffix string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should we check that the .git delimiter is followed by a / or the end of the string in tryExplicitMarkerSplit: https://github.com/kubernetes-sigs/kustomize/pull/4985/files#diff-109bb418d3031aaee374127b42b05a6845a4fb528a637c734a84270be49d8c8fL203-L207?

Apologies that the placement of this comment is nowhere near the relevant source code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good thought, and if we ever get a report of a problem with this in practice, we should definitely do it. That said, it is an edge case that Kustomize has never handled previously, and that I expect is pretty unlikely, so I'd like to avoid the complexity it would introduce for now.

@@ -46,9 +46,6 @@ type RepoSpec struct {
// Branch or tag reference.
Ref string

// e.g. .git or empty in case of _git is present
GitSuffix string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: When we parse url path segments (rather than file path segments): https://github.com/kubernetes-sigs/kustomize/pull/4985/files#diff-109bb418d3031aaee374127b42b05a6845a4fb528a637c734a84270be49d8c8fR191-R192,
should we use "/" instead of filepath.Separator, which can be \?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great catch! I will submit a follow-up with this fix.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 20, 2023
@annasong20
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 20, 2023
Copy link
Contributor

@natasha41575 natasha41575 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/hold cancel

If someone previously included .git in an AWS or Azure URL, we will no longer automatically remove that invalid suffix (and neither will git--it only tries adding things). Realistically, I'm not sure why someone would have ever done this? Surely those services would not give out such invalid URLs.

I don't know why anyone would do this, but I think it would be unreasonable to expect a tool like kustomize to support invalid URLs, even if said URLs worked in the past.

Origin annotations record the clone spec directly, so .git may disappear from these in some cases.

I think the behavior is improved by not adding the .git to the origin annotations.

@@ -815,7 +815,7 @@ buildMetadata: [originAnnotations]
metadata:
annotations:
config.kubernetes.io/origin: |
repo: https://github.com/kubernetes-sigs/kustomize.git
repo: https://github.com/kubernetes-sigs/kustomize
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not only acceptable but could even be considered a bug fix. The original URL didn't have a .git, so I think it's a good thing if we stop forcibly adding it to the annotations here!

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 20, 2023
@k8s-ci-robot k8s-ci-robot merged commit 0a1aa7c into kubernetes-sigs:master Jan 20, 2023
@KnVerey KnVerey changed the title Proposal: Remove RepoSpec manipulation of .git suffix Remove RepoSpec manipulation of .git suffix Jan 24, 2023
@KnVerey KnVerey mentioned this pull request Jan 24, 2023
wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this pull request Apr 15, 2023
v1.27.1

No CLI change.

v1.27.0

API Change

* Adds feature gate NodeLogQuery which provides cluster administrators with
a streaming view of logs using kubectl without them having to implement a
client side reader or logging into the node.

Feature

* Added "general", "baseline", and "restricted" debugging profiles for
kubectl debug.
* Added "netadmin" debugging profiles for kubectl debug.
* Added --output plaintext-openapiv2 argument to kubectl explain to use old
openapiv2 explain implementation.
* Added e2e tests for kubectl --subresource for beta graduation
* Changed kubectl --subresource flag to beta (#116595, @MadhavJivrajani)
* Enable external plugins can be used as subcommands for kubectl create
command if subcommand does not exist as builtin only when
KUBECTL_ENABLE_CMD_SHADOW environment variable is exported.
* Kubectl will now display SeccompProfile for pods, containers and
ephemeral containers, if values were set.
* Kubectl: added e2e test for default container annotation
* Made kubectl-convert binary linking static (also affects the deb and rpm
packages).
* Promoted whoami kubectl command.
* Since Kubernetes v1.5, kubectl apply has had an alpha-stage --prune flag
to support deleting previously applied objects that have been removed from
the input manifest. This feature has remained in alpha ever since due to
performance and correctness issues inherent in its design. This PR exposes
a second, independent pruning alpha powered by a new standard named
ApplySets. An ApplySet is a server-side object (by default, a Secret;
ConfigMaps are also allowed) that kubectl can use to accurately and
efficiently track set membership across apply operations. The format used
for ApplySet is set out in KEP 3659 as a low-level specification. Other
tools in the ecosystem can also build on this specification for improved
interoperability. To try the ApplySet-based pruning alpha, set
KUBECTL_APPLYSET=true and use the flags --prune --applyset=secret-name with
kubectl apply.
* Switched kubectl explain to use OpenAPIV3 information published by the
server. OpenAPIV2 backend can still be used with the --output
plaintext-openapiv2 argument
* Upgrades functionality of kubectl kustomize as described at
https://github.com/kubernetes-sigs/kustomize/releases/tag/kustomize%2Fv5.0.0
and
https://github.com/kubernetes-sigs/kustomize/releases/tag/kustomize%2Fv5.0.1.
This is a new major release of kustomize, so there are a few
backwards-incompatible changes, most of which are rare use cases, bug fixes
with side effects, or things that have been deprecated for multiple
releases already:

 - kubernetes-sigs/kustomize#4911: Drop support for a very old, legacy
 style of patches. patches used to be allowed to be used as an alias for
 patchesStrategicMerge in kustomize v3. You now have to use
 patchesStrategicMerge explicitly, or update to the new syntax supported by
 patches. See examples in the PR description of
 kubernetes-sigs/kustomize#4911.
 - kubernetes-sigs/kustomize#4731: Remove a potential build-time
 side-effect in ConfigMapGenerator and SecretGenerator, which loaded values
 from the local environment under some circumstances, breaking kustomize
 build's side-effect-free promise. While this behavior was never intended,
 we deprecated it and are announcing it as a breaking change since it
 existed for a long time. See also the Eschewed Features documentation.
 - kubernetes-sigs/kustomize#4985: If you previously included .git in an
 AWS or Azure URL, we will no longer automatically remove that suffix. You
 may need to add an extra / to replace the .git for the URL to properly
 resolve.
 - kubernetes-sigs/kustomize#4954: Drop support for using gh: as a
 host (e.g. gh:kubernetes-sigs/kustomize). We were unable to find any usage
 of or basis for this and believe it may have been targeting a custom
 gitconfig shorthand syntax.

* [alpha: kubectl apply --prune --applyset] Enabled certain custom
resources (CRs) to be used as ApplySet parent objects. To enable this for a
given CR, apply the label applyset.kubernetes.io/is-parent-type: true to
the CustomResourceDefinition (CRD) that defines it.
* kubectl now uses HorizontalPodAutoscaler v2 by default.

Documentation

* kubectl create rolebinding -h

Bug or Regression

* Added (dry run) and (server dry run) suffixes to kubectl scale command
when dry-run is passed
* Changed the error message of kubectl rollout restart when subsequent
kubectl rollout restart commands are executed within a second
* Changed the error message to cannot exec into multiple objects at a time
when file passed to kubectl exec contains multiple resources
* Kubectl: enabled usage of label selector for filtering out resources when
pruning for kubectl diff
* kubectl port-forward now exits with exit code 1 when remote connection is
lost
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor repospec code to be more readable
4 participants