RFC: go module proxy #13645
Comments
kind has had great success with using proxy.golang.org by default. IMHO whatever we use needs to be overridable (eg regional availability concerns), but that's easily accomplished with GOPROXY. As I understand it, we are slated to essentially get this built into Go 1.13 soon (defaulting to the golang.org proxy).
See: https://proxy.golang.org/ in your browser; the landing page discusses Go 1.13 and the defaults.
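For anyone unfamiliar, the override is just an environment variable. A minimal sketch (the URL here is the public default; swap in any regional or company mirror):

```sh
# Go 1.13 defaults to GOPROXY=https://proxy.golang.org,direct;
# earlier versions need it set explicitly.
export GOPROXY=https://proxy.golang.org
go mod download   # fetches modules via the proxy instead of source hosts
```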
Well I'm also wondering if our CI should run a local proxy, and populate the environment variable right into the job spec. I think it could reduce network flakes, as well as speed up jobs if we don't have to re-download stuff each time.
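If we did run our own proxy, something like Athens is the usual off-the-shelf option. A rough sketch, assuming Athens' documented image and default port (not anything we run today):

```sh
# Run a caching module proxy next to CI and point jobs at it.
docker run -d -p 3000:3000 gomods/athens:latest
export GOPROXY=http://localhost:3000
```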
I think that might make sense for a non-GCP Prow, but significantly less sense given Prow is also in Google data centers.
proxy.golang.org is already local to Google data centers (as is CI) -- I don't think it will get significantly more local or have better uptime. If anything I'd expect our Prow local cache to be more likely to have downtime and no SLA / SRE ...
You still have to download, even locally -- just from a faster, more optimal service with better uptime and a database of known module hashes (and I think we're better protected by sharing those with more projects). Local won't be less download-y than the standard proxy? FWIW I've had zero flakes using it (did have plenty without it) and haven't seen any reported issues or downtime.
I like vendor.
@akutz vendor still makes sense for k/k's project dependencies, but I'm thinking that for repos that either have simple dependencies or are only vendoring tools (like misspell), being able to remove the vendor directory is a positive thing.
Maybe. Then again, regardless of whether a dependency is small, large, or tooling: if it's Golang-based and is (or generates) a compile-time requirement, then it's beneficial to have that dependency vendored as well. For pre-build tooling such as linters, or post-build tooling such as static code analyzers, sure, shove them into an external location all day.
Example PR ditching vendor in favour of the proxy: kubernetes/community#3934
I used to think this but ...
- Checking in modules + shallow clones are nifty, but shallow clones don't really work with git tags (which most go repos use) and generally aren't used by most systems.
- We also get the benefit of any shared proxy collectively observing hashes for modules and catching any changes to published packages.
- Downstream users are free to use their own local cache (EG company hosted) to further improve download speed and implement security & compliance controls.
- Additionally, with checked-in vendor we need to run slow CI verification downloading from many arbitrary hosts to ensure that these dependencies download as we expect.
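That verification amounts to re-vendoring and diffing; a minimal sketch of the common pattern, not the exact script any of our repos runs:

```sh
# Re-vendor from scratch and fail if anything differs from what's checked in.
go mod tidy
go mod vendor
git diff --exit-code -- go.mod go.sum vendor/
```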
This is an especially huge win if you use a separate module for these tools (repos can have multiple), as it means consumers that are not doing development do not need to download any of this linter code etc.
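For reference, the usual shape of that multi-module tools pattern; a sketch with an illustrative path and illustrative tools, not any particular repo's layout:

```go
// hack/tools/tools.go
// +build tools

// Package tools exists only to pin build-time tool dependencies via
// blank imports. Because hack/tools has its own go.mod, consumers of
// the main module never download any of this tooling code.
package tools

import (
	_ "github.com/client9/misspell/cmd/misspell"
	_ "golang.org/x/lint/golint"
)
```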
Hi @BenTheElder,
I was the same way and came back around. I started coding in C and moved to C#, in a land where MS didn't provide any artifact management. Going to Java and Maven seemed like a godsend. So much so that I created nvn -- Maven for .NET -- nearly 10 years ago (and eventually contributed to CodePlex and NuGet). When I first started working with Golang I immediately adopted the same behavior, using Glide to pull in dependencies at build time. I understand that for larger projects this is a hassle, but if I may misquote Cat Stevens: the first clone is the deepest. As @cblecker said, the proxy model helps to reduce network flakes. Know what completely gets rid of them? Having all the code you need to build a project committed alongside the project. It's become such an important and useful part of my personal experience with the Go language that I truly hope Russ never completely removes the concept of vendor. I hope @mattfarina and @sdboyer don't mind me pinging them. I'm curious what the beautiful minds primarily responsible for Glide and dep think about this.
Looking back and re-evaluating is how we improve things, right? 🙃
The first clone is the one we do ~10,000 times a day in CI, and for any new contributors. It's also true for modules, but strictly smaller (quite a lot so for Kubernetes).
That's not true at all. Nothing can completely get rid of them; we have flakes today cloning repos. Cloning from source control uses more bandwidth and touches more endpoints than downloading just the current version of the sources (not to mention the caching etc.)
Hi @BenTheElder,
Touché. But I think you know I meant that I haven't seen a good enough reason to change. And for what it's worth, I moved from
I thought there was some level of caching in the Prow clusters to reduce the overhead of this?
Again, fair. Please allow me to restate: it reduces the number of things that are able to flake. For binary artifacts I'm all for having dedicated artifact management -- Git and other version control systems designed for plain-text files are terrible at managing binary dependencies. For textual dependencies, Golang has this unique vendoring feature that I think adds to the value of the language. I suppose I just don't see a lot of difference between a Go module service and a shallow git clone. Sure, one is more straightforward and simpler in implementation and execution -- but in the end they both just add points of failure. With vendor, everything needed to build is already committed alongside the project.
Hi @BenTheElder / @cblecker, By the way, I don't think using a proxy and keeping vendor are mutually exclusive.
Agreed. For something like k/k, where I don't think we're ready to get rid of vendor, using the proxy still helps for those vendor verification steps. That said, in Go 1.13 we would have to manually force the use of vendor, as it will be ignored by default. /shrug
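For reference, forcing vendor under Go 1.13's module mode is the standard -mod=vendor flag; a minimal sketch:

```sh
# Opt back in to the vendor directory for a single build.
go build -mod=vendor ./...

# Or set it once so every go invocation in a CI job uses vendor.
export GOFLAGS=-mod=vendor
```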
And this makes me incredibly sad. It's such a useful feature. And if it's being switched to "opt-in," then it's only a matter of time before it is deprecated and removed entirely. sigh |
None. There's bazel build output caching, and GitHub API call caching (for prow itself) but those are orthogonal.
It only reduces the count by one versus the module proxy, and the module proxy has so far been more reliable than GitHub -- it has a different workload.
Many of our dependencies contain large binary artifacts (EG compiled proto, gobindata) packed into source code, so it's not really either-or.
Shallow clones are pretty much off the table because git looks up tags by walking the history. You pretty much have to stop using git tags to use shallow clones, which doesn't really sound realistic or feasible for most of our projects.
The module proxy is a single additional point of failure over your primary source hosting, and unlike your original source hosting it can be trivially swapped out with strong guarantees that you'll get the same results. There are multiple public module proxies. Also FWIW, what the module system actually does:
- resolves the requested version via a simple HTTP API (GET $GOPROXY/&lt;module&gt;/@v/list, or @latest)
- downloads a pre-computed .zip archive of exactly that module version (GET $GOPROXY/&lt;module&gt;/@v/&lt;version&gt;.zip)
- verifies the archive against the hashes in go.sum (and the shared checksum database)
So in effect it's actually even cheaper than a shallow clone: you download a pre-computed compressed archive served from what is effectively a hashmap of module@version -> zip. That last point is especially beneficial to an ecosystem of projects sharing libraries. For some operations, like updating / verifying dependencies, it is significantly fewer points of failure.
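Concretely, those downloads go through the documented GOPROXY HTTP endpoints; a sketch using an arbitrary public module and version as the example:

```sh
# List known versions, fetch metadata, then the pre-built archive.
curl https://proxy.golang.org/github.com/pkg/errors/@v/list
curl https://proxy.golang.org/github.com/pkg/errors/@v/v0.8.1.info
curl -O https://proxy.golang.org/github.com/pkg/errors/@v/v0.8.1.zip
```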
True, however you have many ways to obtain these, including multiple public proxies, a corporate / organizational proxy, and your local cache, which de-dupes downloads. For local development you should be fine with the local cache except when deps change, and for CI we already depend on the internet and can switch caches (thanks to those gosum hashes).
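Go 1.13's GOPROXY also accepts a comma-separated fallback list, which is how that switching works in practice; a sketch where the corporate proxy URL is hypothetical:

```sh
# Try the corporate mirror first, fall back to the public proxy, and
# finally fall back to fetching directly from source hosting.
export GOPROXY=https://goproxy.corp.example.com,https://proxy.golang.org,direct
```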
Right.
This. 1.13 releases this fall. Early results suggest this is going to work very well.
Hi @BenTheElder / @cblecker, So perhaps this issue should be framed as "when" versus "if". I do think they're two different discussions, but if the sun is setting on vendor, we may as well start planning for it.
If we opt to not roll our own proxy (based on the fact that the official proxy is already Google hosted, I agree that we don't need to roll our own for prow.k8s.io), then it could be as easy as injecting the variable GOPROXY=https://proxy.golang.org into the job environment. Then any go module operation will consume that.
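In Prow job config, that injection might look like the following hypothetical presubmit snippet (job name, repo, and image are illustrative):

```yaml
presubmits:
  kubernetes/test-infra:
  - name: pull-test-infra-verify
    spec:
      containers:
      - image: golang:1.13
        command: ["make", "verify"]
        env:
        - name: GOPROXY        # every `go` invocation in the job picks this up
          value: https://proxy.golang.org
```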
Hi @cblecker, Agreed. I'm saying, though, that perhaps this conversation should be about figuring out the work required to move to the use of a proxy.
So... this shouldn't require any work except setting the variable and making sure it propagates everywhere the go tool runs (job specs, build containers, etc.).
Hi @cblecker, What did you just read? Because I just read @BenTheElder volunteering to lead the effort.
#13648
we already switched k/k to modules and aren't opting out of this, but we'll need to update a few places to make sure it propagates to the build containers etc. should be straightforward :-)
/unshrug |
@spiffxp: ¯\_(ツ)_/¯
In response to this:
> /shrug
We're using go modules in a number of different repos, including k/k. I'm wondering if it's time we explore either using or deploying our own go module proxy.
Benefits:
- Fewer network flakes in CI: we download from one well-run endpoint instead of many arbitrary source hosts.
- Faster jobs: modules come as pre-built, cacheable archives.
- Repos with simple or tools-only dependencies could drop their vendor directories.
- Shared observation of module hashes helps catch changes to published packages.
Downsides:
- Another external service to depend on (regional availability concerns; whatever we use needs to be overridable).
cc: @kubernetes/sig-testing