Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup duplicate code by standardizing on one way - maybe eliminate native helper for go? #4448

Closed
jeffwidman opened this issue Nov 23, 2021 · 5 comments · Fixed by #6723
Closed
Labels
core 🍏 Relates to the dependabot-core library itself L: go:modules Golang modules T: feature-request Requests for new features

Comments

@jeffwidman
Copy link
Member

jeffwidman commented Nov 23, 2021

Now that #4434 landed,
the only reason for the native go helpers is

SharedHelpers.run_helper_subprocess(
command: NativeHelpers.helper_path,
function: "getVcsRemoteForImport",
args: { import: import_path }

However, looking at that file, I noticed that right after that function there's
another variant of git_url_for_path that is implemented in pure ruby so doesn't
require the native go helper:

# Used in dependabot-backend, which doesn't have access to any Go
# helpers.
# TODO: remove the need for this.
def self.git_url_for_path_without_go_helper(path)

Weirdly it says the dependabot backend doesn't have access to go, which
certainly isn't true, otherwise the code removed by #4434 wouldn't have been
added in the first place.

So it'd be best to standardize on one of these two functions and remove the other to reduce confusion.

The tradeoff here appears to be that the pure ruby func would be simpler and allow
completely eliminating the native go helpers, but according to the code comment is less robust. However, the robust comment may be a red-herring, since it is a leftover from the go dep days, before go modules was the "one true way".

I have no opinion on which is the better way to go here, just that having two methods doing
the same thing with misleading code comments is definitely not the right way! 😄

@jurre
Copy link
Member

jurre commented Nov 24, 2021

Yeah, agreed. I think that the native version handles more than just git repo's (mercurial/svn etc) and the ruby version does only git. I am doubtful that Dependabot currently handles non-git sources well, and I wonder if switching to the pure ruby version would break any existing functionality but it's definitely worth a short investigation

@mctofu mctofu added the L: go:modules Golang modules label Nov 24, 2021
@brrygrdn brrygrdn added the core 🍏 Relates to the dependabot-core library itself label Nov 26, 2021
@jeffwidman
Copy link
Member Author

To move this forward, can I just put up a PR switching to the pure ruby version and see if any tests break? Or is this something that needs to be investigated on your side because breakage would most likely be internal/invisible to external contributors?

@jeffwidman
Copy link
Member Author

jeffwidman commented Aug 12, 2022

Digging into this more, and took me a while to find the historical commits due to multiple file renames/re-orgs:

  1. Initial commit rewriting the Ruby helper in Go: 43be2e1
  2. Follow-on commit adding back the Ruby helper: bd4a9c0

There's no information on what motivated the initial rewrite in Go... this commit may be the root cause, but I'm not really sure:

Go: Use go-get=1 trick to find git source URLs for unrecognised names

as that's where I first see a mention of the code comment:

# TODO: This is not robust! Instead, we should shell out to Go and use
# https://github.com/Masterminds/vcs.

I'm checking internally for more info.

@jeffwidman
Copy link
Member Author

jeffwidman commented Sep 14, 2022

I heard back from one of the original authors:

I think the code is trying to map a Go import path back to a repository so we can e.g. link to the repo in the Dependabot PR, find the CHANGELOG, link to the diff between versions, etc, etc.
It’s not totally trivial with Go because:

  1. they’ve got that whole vanity URL thing going on, where you have to pass go-get=1 in to make it act like a fancy redirect to a repo
  2. a Go import path doesn’t necessarily actually match a repo URL. For instance, I might be able to import https://github.com/go-kit/kit/tracing, but that’s not a valid URL on GitHub. The repo is github.com/go-kit/kit, and the package is browsable at https://github.com/go-kit/kit/tree/master/tracing. So Go bakes in some logic to handle that for a series of SCMs, and the Masterminds/vcs package does a similar thing too.

Ideally, we’d just use whatever golang/go does, but IIRC that wasn’t importable so the next best thing was the Masterminds/vcs implementation. The Ruby code you linked to predates our usage of that Go library, and I’d guess is just a really incomplete Ruby implementation of the same thing.
Perhaps these days a) there’s a good Ruby implementation, or b) we think the VCS list is stable enough we’d be happy to translate it to Ruby, or c) we could somehow codegen the Ruby from the canonical Go implementation? Or we just stick with that one-off Go helper?

@jeffwidman jeffwidman changed the title Cleanup duplicate code by standardizing on one way Cleanup duplicate code by standardizing on one way - maybe eliminate native helper for go? Nov 1, 2022
jeffwidman added a commit to jeffwidman/dependabot-core that referenced this issue Jan 18, 2023
There are two functions that do the same thing, one in `ruby` and one in
`go`.

TODO explain why we had them both and why safe to remove the `go` one
once I finish researching it.

So I removed the one written in `go`.

It turns out this function was the only reason we had native `go`
helpers, so I was able to completely remove the helpers infra altogether.

Fix dependabot#4448
@jeffwidman
Copy link
Member Author

Note to myself: File a ticket with upstream go, asking them to expose the logic that does this so tools like Dependabot can easily repro it.

jeffwidman added a commit that referenced this issue Apr 10, 2023
Over time, we've managed to thin out our custom `go` helper in favor of
native package manager tooling via `go list`, `go get`, and `go mod`.

The final remaining usage was for translating package import URLs into
the actual repository URL for fetching metadata.

The original impetus for the `github.com/Masterminds/vcs` library was
that it was a copy/paste of some code in core `go`:

 [Further context](#4448 (comment)):

> I think the code is trying to map a Go import path back to a repository so we can e.g. link to the repo in the Dependabot PR, find the CHANGELOG, link to the diff between versions, etc, etc.
> It’s not totally trivial with Go because:
> 1. they’ve got that whole vanity URL thing going on, where you have to pass `go-get=1` in to make it act like a fancy redirect to a repo
> 2. a Go import path doesn’t necessarily actually match a repo URL. For instance, I might be able to import https://github.com/go-kit/kit/tracing, but that’s not a valid URL on GitHub. The repo is [github.com/go-kit/kit](http://github.com/go-kit/kit), and the package is browsable at https://github.com/go-kit/kit/tree/master/tracing. So Go bakes in some [logic](https://github.com/golang/go/blob/95c125a44ad1a0c3e441c3214160cd7b4483e79c/src/cmd/go/internal/vcs/vcs.go#L1437) to handle that for a series of SCMs, and the Masterminds/vcs package does a [similar thing](https://github.com/Masterminds/vcs/blob/master/vcs_remote_lookup.go) too.
>
> Ideally, we’d just use whatever golang/go does, but IIRC that wasn’t importable so the next best thing was the Masterminds/vcs implementation. The Ruby code you linked to predates our usage of that Go library, and I’d guess is just a really incomplete Ruby implementation of the same thing.
> Perhaps these days a) there’s a good Ruby implementation, or b) we think the VCS list is stable enough we’d be happy to translate it to Ruby, or c) we could somehow codegen the Ruby from the canonical Go implementation? Or we just stick with that one-off Go helper?

However, I discovered after doing some digging in #6938 that there
_is both_ an upstream copy (golang/go#18387 (comment), which may potentially get deprecated
golang/go#57051) and an actual `go list` command that provides enough of what we need:
* golang/go#44742

```
$ go list -m -f '{{.Origin}}' golang.org/x/tools@latest
{git https://go.googlesource.com/tools    refs/tags/v0.3.0 502c634771c4ba335286d55fc24eeded1704f592 }
```

While this doesn't fully resolve the upstream issue:
* golang/go#18387

For our purposes it provides enough of what we need since we already
have the module path extracted from `go.mod` (golang/go#18387 (comment)).

Performing this switch allows us to completely delete the native go
helper altogether.

I debated leaving the framework in place in case we later wish to extend
it for other features, but decided for now to rip it out as it's easy
enough to add back later. And in general we're trying to move toward
leveraging native package manager tooling whenever possible.
So if we later needed an additional feature, we're probably start with a
discussion with the `go` team to see if they'd build it into the `go`
cmd tooling.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core 🍏 Relates to the dependabot-core library itself L: go:modules Golang modules T: feature-request Requests for new features
Projects
None yet
4 participants