-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rhcos: implement image discovery for new pipeline #554
Conversation
I'm not sure what happened there... /retest |
/cc @cgwalters |
@@ -288,13 +289,24 @@ func (a *platform) libvirtPlatform() (*types.LibvirtPlatform, error) { | |||
return nil, err | |||
} | |||
|
|||
// TODO: Ideally, this would live inside of a closure which is passed to | |||
// asset.GenerateUserProvidedAsset and only called if the environment | |||
// variable isn't present. As this exists, it ruins the abstraction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sort of problem goes away with #556, where all dependencies and defaults are only rendered as needed 🎉
pkg/rhcos/ami.go
Outdated
// AMI calculates a Red Hat CoreOS AMI. | ||
// TODO: Use the new RHCOS build pipeline to discover AMIs. The pipeline isn't | ||
// currently exposed publicly and CI isn't on the VPN, so this cannot be | ||
// updated without breaking the tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only workable if we have the same internal and external release cadence. See discussion in #281 and openshift/os#353.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. I will look at making more of the pipeline public. There are some prerequisites for this.
How hard would it be to put some of this data in the pull secret though or so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How hard would it be to put some of this data in the pull secret though or so?
What data? A URI with an initial entry-point for this discovery?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What data? A URI with an initial entry-point for this discovery?
I'd say all the release metadata - including the meta.json
fetched by this PR, which includes AMI ids, along with link to qcow2 etc.
Hmm. Or maybe the pull secret grants access to a server running on api.ci or wherever that hands out the metadata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, one thing this circles back to is...we could stick all of the relevant metadata in a container, which the pull secret already grants access to of course.
Would having this be in a container be ergonomic for the installer? It'd depend on oc adm release extract
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all sounds far too complex for what we are after. A few notes on the discussion:
- We shouldn't depend on the URL remaining unknown in order to protect the RHCOS images. Hiding this in a container behind a pull-secret doesn't buy much.
- The RHCOS image reference should come from the CVO manifest payload (just like all of the other resources). We need to wait until MAO can create masters before we can make this jump since I don't want the installer parsing through the CVO manifest payload.
A lot of this is temporary scaffolding, so I'd rather we focus the discussion on the longer-term vision rather than this code which we know we are going to throw away in a few weeks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RHCOS image reference should come from the CVO manifest payload (just like all of the other resources). We need to wait until MAO can create masters before we can make this jump since I don't want the installer parsing through the CVO manifest payload.
What approach would you use for the always-pre-MAO bootstrap (which needs an AMI too ;)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't depend on the URL remaining unknown in order to protect the RHCOS images. Hiding this in a container behind a pull-secret doesn't buy much.
True long term.
The RHCOS image reference should come from the CVO manifest payload
Agree. What's fuzzy for me is the dependencies required for that. It also forces some deep issues like...when we branch OKD, are we going to try to maintain a separate OS stream for that? Or do we just keep the same OS and ship updated kubelets too? Or a mix where we ship the same "base OS", and just pull different kubelets? We've talked about this in the past, and what hurts my head is that the complexity jumps up when one notes we have two major version streams for RHCOS today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you guys confirm, CVO manifest payload = origin-v4.0:release image?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed.
Needs a rebase |
RHCOS has moved over to the 2.0 pipeline which has a nicer discovery mechanism for images. This uses that new pipeline to fetch the latest QCOW images for the latest build. Unfortunately, there is no "latest" alias anymore, so the installer has to fetch resources from the Internet before it can prompt the user. This will go away once the installer starts pinning to a specific RHCOS release.
if nextImage.ImageId == nil || nextImage.CreationDate == nil { | ||
continue | ||
for _, ami := range meta.AMIs { | ||
if ami.Name == region { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to land this before the AMIs are being pushed to other regions?
$ BUILD=$(curl -sL https://releases-rhcos.svc.ci.openshift.org/storage/releases/maipo/builds.json | jq -r '.builds[0]')
$ echo "${BUILD}"
47.38
$ curl -sL "https://releases-rhcos.svc.ci.openshift.org/storage/releases/maipo/${BUILD}/meta.json" | jq .amis
[
{
"name": "us-east-1",
"hvm": "ami-03afbcc8b72560ac8"
}
]
I'd rather stick to scraping AWS until we get that, because scraping AWS has a chance at working in other regions if folks can copy the AMI over to the target region. That would allow us to use the other regions we're trying to use for openshift-dev
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add an environment variable for the AMI. We already have the override for the QEMU image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add an environment variable for the AMI. We already have the override for the QEMU image.
I'm ok with that, although see this comment about not having AMI/image knobs and assuming that there's a default AMI/image that is (1) available and (2) healthy enough to pivot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're close to landing the multi-region again although this is going to slow our pipeline notably. There are two things we need to do; first on the RHCOS side, implement tags, and only upload to all the regions for tags (e.g. once a day at most?).
Second though we need to do openshift/os#289 or possibly teach the installer to take a set of override RPMs and apply them via Ignition? Then I think we can slow the cadence of "promoted" builds, and anyone who wants to test bleeding edge OS work can use libvirt.
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: abhinavdahiya, crawford The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Do we want to land an environment variable override? Or are we ok re-breaking @ironcladlou and others who work in other regions until we get the multi-region pipeline going? |
An env override makes sense to me, but it'd be even more powerful to have an override for the URL; then one could use it to override every format. The only slight downside to this would be that one would have to hand-craft a |
Keep the environment variable (with a warning about using it), but drop the interactive prompt. The default is solid, and users manipulating it are more likely to break something (e.g. by continuing to use the old v1 pipeline), while the installer can update its default to track the RHCOS folks (e.g. like 1117821, rhcos: implement image discovery for new pipeline, 2018-10-26, openshift#554).
RHCOS has moved over to the 2.0 pipeline which has a nicer discovery
mechanism for images. This uses that new pipeline to fetch the latest
AMIs and QCOW images for the latest build. Unfortunately, there is no
"latest" alias anymore, so the installer has to fetch resources from the
Internet before it can prompt the user. This will go away once the
installer starts pinning to a specific RHCOS release.