Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds go_repository options for using vendored code #601

Closed

Conversation

jmillikin-stripe
Copy link
Contributor

There are two commits here:

First, I've added a --vendor_prefix flag to Gazelle. Setting this flag lets the caller adjust where Gazelle looks for vendored dependencies, for cases where the repository owner didn't use the default vendor/ location.

Second, I added attributes to the go_repository() rule to let Gazelle use vendored dependencies in external repositories. I understand this is not quite the preferred approach of having everything in WORKSPACE, but it makes life significantly easier when our build wants to depend on a single go_binary() that itself requires ~dozens of vendored dependencies.

The motivating use case for us is bundling a compiled-from-Git etcd (https://github.com/coreos/etcd) into some Docker images built with Bazel's docker rules. With these changes, it's possible to depend on the etcd binary from our build without having to maintain a permanent "etcd plus BUILD rules" fork.

This lets BUILD files be generated for projects that use vendored
dependencies with a non-standard location, such as etcd.
These attributes can be used to enable building a depended-up repository
using its own vendored dependencies, instead of workspace dependencies.
@bazel-io
Copy link

bazel-io commented Jul 3, 2017

Can one of the admins verify this patch?

@jmillikin-stripe
Copy link
Contributor Author

Related issues:

@jayconrod
Copy link
Contributor

Could you explain more about how this would be used for etcd? I see there is a cmd/vendor directory. If I understand correctly, you'd like to be able to add etcd as an external repository in some other workspace, and you'd like the generated BUILD files to respect vendor directories that aren't at the top level.

Rather than adding a new command line argument to Gazelle, I think it would be better for Gazelle to detect and use vendor directories at any place in the tree automatically (#486). I don't think we should support vendoring in directories named something other than vendor though. We should also be careful not to let vendored packages be visible outside of the subtree that contains vendor. Basically, we want Gazelle to generate BUILD files that match "go build" conventions.

The main deficiency of Gazelle is that it currently does a fairly simple transformation from import paths to labels without actually looking at where depended-upon packages are located or whether they exist. I'm about to start a larger project that should fix that. Improved vendoring will be part of it.

@jmillikin-stripe
Copy link
Contributor Author

If I understand correctly, you'd like to be able to add etcd as an external repository in some other workspace, and you'd like the generated BUILD files to respect vendor directories that aren't at the top level.

That's correct. The only thing our builds depend on from the etcd external repository is the etcd binary.

Rather than adding a new command line argument to Gazelle, I think it would be better for Gazelle to detect and use vendor directories at any place in the tree automatically (#486).

I'm somewhat uncomfortable at the idea of Gazelle trying to auto-detect where the vendor directory is, considering how many ways there are for users to fiddle with $GOPATH. It's preferable to me if Gazelle could be told where to expect things.

I don't think we should support vendoring in directories named something other than vendor though.

Could you expand on that? I don't understand why the Bazel rules should be more restrictive about directory layout than the underlying Go build system. Ideally, Bazel should be able to build any reasonably-shaped Go codebase.

@jayconrod
Copy link
Contributor

I'm somewhat uncomfortable at the idea of Gazelle trying to auto-detect where the vendor directory is, considering how many ways there are for users to fiddle with $GOPATH. It's preferable to me if Gazelle could be told where to expect things.

We'd like Gazelle to be simple enough to be run automatically by editors when a file is saved. Within Google, there's an analogous tool called Glaze which does this very well. In that context, there's no opportunity to pass arguments on the command line. The behavior of Gazelle should be determined by convention whenever possible.

Could you expand on that? I don't understand why the Bazel rules should be more restrictive about directory layout than the underlying Go build system. Ideally, Bazel should be able to build any reasonably-shaped Go codebase.

I just meant that Gazelle shouldn't allow directories with other names (for example, third_party) to be treated as vendor, since that breaks the "go build" convention. Gazelle should support vendor in other locations though. cmd/vendor works fine in "go build" right now, but not in Gazelle, and we should fix that. It requires something more sophisticated than simple import path translation though.

@jmillikin-stripe
Copy link
Contributor Author

We'd like Gazelle to be simple enough to be run automatically by editors when a file is saved. Within Google, there's an analogous tool called Glaze which does this very well. In that context, there's no opportunity to pass arguments on the command line. The behavior of Gazelle should be determined by convention whenever possible.

Do you think it would make sense to have two separate modes for Gazelle? It seems like the job of "generate BUILD files in a non-Bazel source tree" is very different from "refresh existing BUILD files", and it would be a shame if the usefulness of Bazel's Go rules were limited by google3 usability requirements.

Remember that third_party rules assume BUILD files will be submitted alongside the external source, which is not something practical to do when pulling external dependencies from GitHub.

cmd/vendor works fine in "go build" right now, but not in Gazelle, and we should fix that. It requires something more sophisticated than simple import path translation though.

The etcd build scripts run something like ln -s "${PWD}/cmd/vendor "${GOPATH}/src" before invoking go build, I'm not sure it's possible for Gazelle to detect when this sort of thing is happening.

If the concern is about the final directory being named "vendor", would you be OK with a flag like --vendor_prefix=cmd and then always joining vendor to the end of it?

@jayconrod
Copy link
Contributor

Do you think it would make sense to have two separate modes for Gazelle? It seems like the job of "generate BUILD files in a non-Bazel source tree" is very different from "refresh existing BUILD files", and it would be a shame if the usefulness of Bazel's Go rules were limited by google3 usability requirements.

Perhaps, but this would add complexity. This would mean some options are available in one mode but not another. I'd prefer to have very little reliance on the command line.

Remember that third_party rules assume BUILD files will be submitted alongside the external source, which is not something practical to do when pulling external dependencies from GitHub.

This is roughly what we want to do though. In #460, we're planning to remove go_repository. Instead of generating BUILD files when running bazel build, you'd generate them ahead of time with Gazelle, put them in a directory, then use them with new_git_repository. This makes it a lot easier to customize builds for dependencies that don't quite follow conventions.

There is also some discussion on the bazel-discuss mailing list about having a central repository of BUILD files for various libraries. See meeting notes.

If the concern is about the final directory being named "vendor", would you be OK with a flag like --vendor_prefix=cmd and then always joining vendor to the end of it?

I think we should have a more general solution. If Gazelle sees a directory named cmd/vendor/ as it traverses the tree, it should be able to resolve imports from sources in cmd/ (and subdirectories) to packages in cmd/vendor/. It should support multiple vendor directories in different parts of the tree with appropriate visibility. We should be able to detect these automatically without a command line option.

@jmillikin-stripe
Copy link
Contributor Author

I think we should have a more general solution. If Gazelle sees a directory named cmd/vendor/ as it traverses the tree, it should be able to resolve imports from sources in cmd/ (and subdirectories) to packages in cmd/vendor/.

That wouldn't help in this case. The etcd code layout assumes there's a single vendor directory, it just happens they've named it cmd/vendor/ instead of vendor/. For example, //etcdmain/:go_default_library depends on //cmd/vendor/github.com/cockroachdb/cmux:go_default_library.

Perhaps, but this would add complexity. This would mean some options are available in one mode but not another. I'd prefer to have very little reliance on the command line.

Will the --external flag still exist, or is that going to be removed?

This is roughly what we want to do though. In #460, we're planning to remove go_repository. Instead of generating BUILD files when running bazel build, you'd generate them ahead of time with Gazelle, put them in a directory, then use them with new_git_repository. This makes it a lot easier to customize builds for dependencies that don't quite follow conventions.

What's the timeline for this work? From the discussion on #460 it sounds like removing go_repository is still in the early design phase and has no concrete plans to land in master any time soon. Would it be OK to merge this PR so go_repository does what we need? We can revisit when it comes time to remove go_repository entirely.

@jayconrod
Copy link
Contributor

I took another look at etcd, and it's stranger than I thought. cmd/tools is a symlink to ../tools. So that would mean that packages would build when built with the prefix github.com/coreos/etcd/cmd/tools but not with github.com/coreos/etcd/tools even though it's the same directory. This will confuse a lot of tools. go build works but go list does not for example. Bazel may be inconsistent as well. You could end up building the same package multiple times under different import paths.

I think supporting this in Gazelle is too much of a special case, and I'd rather not accept this PR. This is a situation where hand-written build files are needed.

When Gazelle supports flat mode (#608), it will help generate a file that can be used with new_git_repository after some manual adjustment. Flat mode will be a prerequisite for a lot of the other work, so it should happen soon rather than later, hopefully in the next week or two.

Will the --external flag still exist, or is that going to be removed?

Not sure yet. Ideally, it would be removed.

What's the timeline for this work? From the discussion on #460 it sounds like removing go_repository is still in the early design phase and has no concrete plans to land in master any time soon.

As a rough estimate, this should be done by the end of August. I'll be working on this primarily after Gophercon.

@jayconrod jayconrod closed this Jul 7, 2017
@jmillikin-stripe
Copy link
Contributor Author

It's now been ~3 months since this PR, and there doesn't seem to be any movement toward removing go_repository. Could you please reconsider your decision?

@jayconrod
Copy link
Contributor

I'm still planning to remove go_repository. #859 is a proposal to index the import paths of libraries in a repository and then resolve paths based on that. That should make make vendored mode largely obsolete, since vendored libraries will be found automatically. I'm not sure whether that would solve the etcd problem adequately though, since vendored libraries should only be visible below the vendor's parent directory.

In the mean time, I'm okay with landing the portion of this PR that allows vendoring to be turned on in go_repository. I'd suggest that attribute be named build_external and be an attr.string with a set of legal values instead of attr.bool. I don't want to support a configurable location for vendor though.

@jmillikin-stripe
Copy link
Contributor Author

Done. I moved it to a new PR since nearly all the discussion here is about the vendor prefix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants