releng: Publish fast builds to a separate subdirectory #18290
Still some questions to answer from the previous PR: From @BenTheElder:
From @spiffxp:
EDIT: I'll tie the histories together this week and put a plan up in kubernetes/sig-release#850.
Unknown CLA label state. Rechecking for CLA labels. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@justaugustus Thanks for trying to find a better solution for this.
Is there any way to force cross-builds to run at least on a daily basis until a decision is reached here?
@hakman -- I'd prefer to solve this instead of thinking of a workaround. I think we're close here.
@BenTheElder @spiffxp -- More details, as requested. Version markers are text files stored in the root of various GCS buckets:
They represent the results of different types of Kubernetes build jobs and act as sort of a public API for accessing builds. One can see them leveraged in extraction strategies for e2e tests, release engineering tooling, and user-created scripts.

Unfortunately, the way certain version markers are generated and utilized can at best be confusing, and at worst, disruptive. There are a variety of problems, some of which are symptoms of the other ones...

### FIXED - Cross builds are stored in a separate GCS bucket

(Fixed in #14030.) This makes long-term usage of cross builds a little more difficult, since scripts utilizing version markers tend to consider only the version marker filename, while the GCS bucket name remains unparameterized.

### FIXED - Generated jobs may not represent intention

(Fixed in #15564.) As the generic version markers can shift throughout the release cycle, every time we regenerate jobs, they may not represent what we intend to test. The best examples of this are pretty much every job using the

### FIXED - bazel version markers appear to be unused

(Fixed in #15612.)

### Generic version markers are not explicit

We publish a set of additional generic version markers:
Depending on the point in the release cycle, the meaning of these markers can change.
Knowing what these markers mean at any one time presumes knowledge of the build/release process or a correct interpretation of the Kubernetes versions docs, which are often out of date and in a low-visibility location.

### Manually created jobs using generic version markers can be inaccurate

Non-generated jobs using generic version markers do not get the same level of scrutiny as ones that are generated via This leads to inaccuracies between the versions presumed to be used in test and the versions that may be displayed in testgrid.
test-infra/config/jobs/kubernetes/sig-cli/sig-cli-config.yaml Lines 85 to 112 in 96e08f4
All variants of that prowjob have landed on the

### linux/amd64 version markers are colliding with cross builds

"Fast" (linux/amd64-only) builds run every 5 minutes, while cross builds run every hour. The Kubernetes build jobs have a mechanism for checking if a build already exists and will exit early to save on test cycles. What this means is if a "fast" build has already happened for a commit, then the corresponding cross build will exit without building. This has been happening pretty consistently lately, so cross build consumers are using much older versions of Kubernetes than intended. (Note that this condition only happens on

I'd like to establish a rough plan of record to continue iteratively fixing some of these issues.

### Plan of record
@BenTheElder -- it's kind of scattered across Slack across multiple release cycles 😭 The previous recent PRs were specifically meant to not impact existing consumers. This one:
For people using For people using generic version markers, there should be no impact as we haven't changed the generic markers. After the cross builds have started building again, I'd update jobs using generic version markers to use
Recent Slack convo: https://kubernetes.slack.com/archives/C2C40FMNF/p1593720698263900
Let's move the info / checklist to an issue (pretty please!); it will get lost here. Also, next time we should try to open an issue early so we have a record (I hate Slack search!). Thanks,
@dims -- Already opened and ref'ed in the PR description: kubernetes/sig-release#850. I opened that last November and have been threading incremental fixes to it since.
My overall comment on this PR is that I'm fine undoing the version markers change that was recently introduced. But I'm hesitant to move forward on the new addition "fast" path without a more thorough review of where we're headed.
Is the fast build stuff strictly required to unblock cross builds or is there a more straightforward revert that can be done to unblock cross?
Trying to recap what was discussed over zoom Thursday:
It would be great to get more generic descriptions of when to use which marker documented. For example, it's not quite clear to me when I would use latest.txt rather than master.txt. It's also not overly clear to me when the beta transitions from a released version back to the state of master; is that only after beta builds are started for the next release? Generic descriptions for
Then the naming itself might make the use cases a bit clearer. Last thought, would it be possible to have a marker that represents:
That would allow for jobs to be defined in a way that they represent the latest build for the "next minor" version of kubernetes without having to potentially swap it between the yet to be released release branch and
Signed-off-by: Stephen Augustus <[email protected]>
Signed-off-by: Stephen Augustus <[email protected]>
e0288ad to 03a7828
@BenTheElder @spiffxp -- Version marker doc has been updated. PTAL.
Signed-off-by: Stephen Augustus <[email protected]>
03a7828 to 7fe6df1
- `k8s-stable1`
- `k8s-stable2`
- `k8s-stable3`
I think @detiber's suggestion is a bit more intuitive:

Current:
- `k8s-stable1`
- `k8s-stable2`
- `k8s-stable3`

Suggested:
- `k8s-stable`
- `k8s-stable-1`
- `k8s-stable-2`
...
This documents the current state, where "current" is what exists today in addition to what would happen once this PR merges.
There is no `k8s-stable` marker.
Got it. Thanks.
@detiber @jsturtevant -- Can you read through the docs in this PR and let me know if they're sufficient for your needs? Note: I'm only aiming to unblock cross builds here... any other topics around this should be discussed in kubernetes/sig-release#850.
That versions rewrite is really comprehensive, looks great. I would have been satisfied with far less to unblock. My main concern is making sure release-blocking jobs get back to the low-latency merged-pr-to-new-build path this PR is going to kick them out of.
### CI - cross build

Use `gsutil cat gs://kubernetes-release-dev/ci/latest.txt` (`master` branch)
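For scripts that cannot shell out to `gsutil`, the same marker can in principle be read over HTTPS. The following is a minimal sketch, not part of this PR: it assumes the bucket is publicly readable through the standard `storage.googleapis.com/<bucket>/<object>` endpoint, and `marker_url` and `read_marker` are hypothetical helper names introduced only for illustration.

```python
from urllib.request import urlopen

# Bucket name taken from the docs line above. Public HTTPS readability of the
# bucket is an assumption of this sketch.
BUCKET = "kubernetes-release-dev"


def marker_url(directory: str, marker: str, bucket: str = BUCKET) -> str:
    """Build the HTTPS URL of a version marker file (hypothetical helper)."""
    return f"https://storage.googleapis.com/{bucket}/{directory}/{marker}.txt"


def read_marker(directory: str, marker: str, bucket: str = BUCKET) -> str:
    """Fetch a version marker and return the version string it contains."""
    with urlopen(marker_url(directory, marker, bucket), timeout=30) as resp:
        # A marker is a small text file holding a single version string.
        return resp.read().decode("utf-8").strip()


if __name__ == "__main__":
    # Same marker as `gsutil cat gs://kubernetes-release-dev/ci/latest.txt`.
    print(read_marker("ci", "latest"))
```

Keeping the bucket as a parameter rather than hard-coding it inside the function avoids the "unparameterized bucket name" problem raised earlier in this thread.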
The implication with this change is that any job depending on latest is going to have a 90m-2h lag time from PR merge to exercising the merged results, where before it was 20-30m. I suspect release-team will move quickly to update release-blocking jobs to use `latest-fast.txt` instead. A heads up to kubernetes-dev@ would be appreciated.
**Directory:** `ci`

#### latest
Formatting nit: it's visually difficult to tell the difference between these. Maybe bullet the bolded items?
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: justaugustus, spiffxp
Thanks @spiffxp! PR to follow to swap the markers on blocking jobs.
@justaugustus: Updated the
Opened #18517 to fix up jobs using the
(#18163 got thrashed w/ a bunch of old commits, so opening a new PR.)
Remove extraneous `latest-{{.Version}}-cross` markers

`latest-{{.Version}}` are inherently cross builds, so having the `latest-{{.Version}}-cross` markers is redundant.

(This commit matches another in "[WIP] releng: Add stable4 and remove beta generated jobs" #18169, so it'll get rebased out of whichever PR lands last.)
kubetest: Enable extract strategy for fast (linux/amd64-only) CI builds
scenarios: Allow kubernetes_build to recognize fast builds
Inverts the logic in #18158.
Explained in more detail here: kubernetes/release#1389
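As a rough illustration of the problem this PR addresses, the sketch below simulates the "build already exists, exit early" check discussed in the conversation. It is not the actual build tooling: `FakeBucket` and `run_build` are hypothetical, the version string is made up, and `ci/fast` is only an assumed name for the separate subdirectory.

```python
from typing import Dict, Set


class FakeBucket:
    """In-memory stand-in for a GCS bucket: directory -> published versions."""

    def __init__(self) -> None:
        self.dirs: Dict[str, Set[str]] = {}

    def exists(self, directory: str, version: str) -> bool:
        return version in self.dirs.get(directory, set())

    def publish(self, directory: str, version: str) -> None:
        self.dirs.setdefault(directory, set()).add(version)


def run_build(bucket: FakeBucket, directory: str, version: str) -> bool:
    """Publish a build unless one already exists; return True if it ran."""
    if bucket.exists(directory, version):
        return False  # early exit: a build for this commit is already there
    bucket.publish(directory, version)
    return True


VERSION = "v0.0.0-example+abc123"  # hypothetical version for one commit

# Shared directory: the fast build lands first, so the cross build is skipped.
shared = FakeBucket()
fast_ran_shared = run_build(shared, "ci", VERSION)
cross_ran_shared = run_build(shared, "ci", VERSION)

# Separate subdirectory for fast builds: both builds run.
split = FakeBucket()
fast_ran_split = run_build(split, "ci/fast", VERSION)
cross_ran_split = run_build(split, "ci", VERSION)

print(fast_ran_shared, cross_ran_shared)  # True False
print(fast_ran_split, cross_ran_split)    # True True
```

With a shared location, whichever job publishes first wins, and because fast builds run far more often they nearly always win. Separating the locations makes the existence check independent for each build type.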
Signed-off-by: Stephen Augustus [email protected]
/assign @BenTheElder @spiffxp @cblecker @dims
cc: @kubernetes/release-engineering
/hold for review
ref: kubernetes/sig-release#850, kubernetes/sig-release#759