mantle/ore: gcp: add image family support, add deprecate image functionality #1319
this builds on #1305
src/cosalib/gcp.py
    # Here is where I would do something like:
    if args.deprecate:
        run_verbose(ore gcloud deprecate-image --image-name gcp_name)
so we could either encode a separate call to ore here if the user requested the image to be created in the deprecated state OR we could modify `ore gcloud upload` to accept a `--deprecated` argument as well.
So to my understanding there are two different times when we'd want to deprecate an image:
- When we first upload it
- To mark the old active image as deprecated in preparation for a new active image
For the first case that should probably be the default behavior performed by `ore gcloud upload` (with a flag to turn off that behavior). The second should always be done as a separate operation so that it's more obvious and intentional that an image (which is not the one being uploaded) is being marked as deprecated.
> So to my understanding there are two different times when we'd want to deprecate an image:
> 1. When we first upload it
> 2. To mark the old active image as deprecated in preparation for a new active image

correct. with a slight addition to `2`. For `1` it happens when we first do the upload (during the pipeline run). For `2` it would happen during the release and we'd need to do two things:
- set deprecation status to "DEPRECATED" for the image that is currently latest in image family
- set deprecation status to "ACTIVE" of new image
So we'd make two calls to do that.
> For the first case that should probably be the default behavior performed by `ore gcloud upload` (with a flag to turn off that behavior). The second should always be done as a separate operation so that it's more obvious and intentional that an image (which is not the one being uploaded) is being marked as deprecated.

Since uploading an image that you immediately want to deprecate is kind of a special weird use case we're implementing I thought it would be more appropriate to leave it as not the default and explicitly pass `--deprecated` or something in the pipeline. For example, I think RHCOS uses this functionality and I don't think we want to change the defaults there right now.
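The two release-time calls described in this comment could be driven from a small helper like the one below. This is only a sketch: `release_commands` is a hypothetical name, `--image-name` is taken from the snippet under review, and `--state` is an assumed flag for passing the target deprecation state that is not confirmed anywhere in this thread.

```python
from typing import List


def release_commands(old_image: str, new_image: str) -> List[List[str]]:
    """Build the two ore invocations for a release (hypothetical helper)."""

    def set_state(image: str, state: str) -> List[str]:
        # `--state` is an assumed flag name, not confirmed by this thread.
        return ["ore", "gcloud", "deprecate-image",
                "--image-name", image, "--state", state]

    return [
        set_state(old_image, "DEPRECATED"),  # retire the current family head
        set_state(new_image, "ACTIVE"),      # promote the newly released image
    ]
```

Each returned list could then be handed to something like `run_verbose`, which keeps the deprecation of the old image a separate, explicit step.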
> correct. with a slight addition to `2`. For `1` it happens when we first do the upload (during the pipeline run). For `2` it would happen during the release and we'd need to do two things:
> - set deprecation status to "DEPRECATED" for the image that is currently latest in image family
> - set deprecation status to "ACTIVE" of new image
> So we'd make two calls to do that.

I'm personally alright with it being two calls (with the caveat that it might make sense to have it be one COSA command which performs both ore actions).
> Since uploading an image that you immediately want to deprecate is kind of a special weird use case we're implementing I thought it would be more appropriate to leave it as not the default and explicitly pass `--deprecated` or something in the pipeline. For example, I think RHCOS uses this functionality and I don't think we want to change the defaults there right now.

That's fair. We could potentially do something like switching the default for FCOS (since we're already passing a flag directly indicating if it's an FCOS upload) but I'd rather stick to just directly passing it in the pipeline.
That's exactly what I was thinking. Cool I'll start down that path.
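A rough sketch of the direction agreed on here, where the pipeline opts in explicitly instead of relying on a default. `build_upload_args` is a hypothetical helper, `--deprecated` is the flag name floated in this thread, `--family` is the option this PR adds to `upload`, and `--name` and `--file` are assumed purely for illustration.

```python
from typing import List, Optional


def build_upload_args(name: str, image_file: str,
                      family: Optional[str] = None,
                      deprecated: bool = False) -> List[str]:
    """Assemble an `ore gcloud upload` command line (hypothetical helper)."""
    # `--name` and `--file` are assumed flag names for this sketch.
    args = ["ore", "gcloud", "upload", "--name", name, "--file", image_file]
    if family:
        args += ["--family", family]
    if deprecated:
        # Only added when the caller (the pipeline) asks for it explicitly.
        args.append("--deprecated")
    return args
```

With this shape, callers that never pass `deprecated=True` keep getting the same command line as before.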
func init() {
	cmdDeprecateImage.Flags().StringVar(&deprecateImageName,
		"image-name", "", "GCP image name")
btw I couldn't figure out either what the global gcloud image option is used for:
	sv(&opts.Image, "image", "", "image name")
or how to grab it and use it here. This kept me from just using `--image` as the argument.
You should just be able to reference `opts.Image` to get the value that was passed into the `--image` argument (which is on all commands of the `GCloud` variable).
ok I'm now using `opts.Image`, but this did require me to slightly change the init code. See the `mantle/gcloud: don't mutate Image arg if not provided` commit.
278ee0f to c402076
		os.Exit(1)
	}

	fmt.Printf("Attemping to change GCP image deprecation state of %s to %s\n",
This probably fits better as a `plog.Debugf` statement to gate it behind the user requesting more verbose output.
ok I switched this to `plog.Debugf` and also the other statements in the file to `plog.Fatal*`. The one thing, I can't figure out how to get the debug statement to print out. I've tried `--debug`, `--log-level=DEBUG`, and `--verbose`.
OOB IRC discussion happened but looks like you've found a bug: #1321
978a045 to 5e19713
rebased this on top of #1322. Right now this just adds support for image family to upload.go and also support for a new deprecate-image subcommand. It doesn't currently change [...]. I discussed the flow with @bgilbert in IRC and it seems like we may want to go a different path so we'll leave that bit for a followup. This is ready for review, but I'm still doing some testing in the background so I'll leave it marked as draft.
src/cosalib/gcp.py
@@ -106,4 +108,7 @@ def gcp_cli(parser):
                        Currently enables SECURE_BOOT and UEFI_COMPATIBLE""",
                        action="store_true",
                        default=False)
    parser.add_argument("--family",
                        help="GCP image family to attach disk to",
s/disk/image/
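With the suggested wording applied, the option might read as follows. This is a minimal sketch: only the `--family` argument comes from the diff, while the trimmed-down `gcp_cli` body and its return value are assumptions made so the example runs on its own.

```python
import argparse


def gcp_cli(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    # Help text says "image" rather than "disk", per the review comment.
    parser.add_argument("--family",
                        help="GCP image family to attach image to")
    return parser


args = gcp_cli(argparse.ArgumentParser()).parse_args(
    ["--family", "fedora-coreos-stable"])
print(args.family)
```

When `--family` is omitted, argparse leaves `args.family` as `None`, so callers can skip passing the option through to ore.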
mantle/cmd/ore/gcloud/upload.go
@@ -54,6 +55,7 @@ func init() {
	cmdUpload.Flags().BoolVar(&uploadFedora, "fcos", false, "Flag this is Fedora CoreOS (or a derivative); currently enables SECURE_BOOT and UEFI_COMPATIBLE")
	cmdUpload.Flags().BoolVar(&uploadForce, "force", false, "overwrite existing GS and GCE images without prompt")
	cmdUpload.Flags().StringVar(&uploadWriteUrl, "write-url", "", "output the uploaded URL to the named file")
	cmdUpload.Flags().StringVar(&uploadImageFamily, "family", "", "GCP image family to attach disk to")
s/disk/image/
5e19713 to 03a4c60
We want to use Image Families so that a user can specify an Image Family (like `fedora-coreos-stable`) and that will always reference the latest stable Fedora CoreOS GCP image. This will allow us to do that.
If the user didn't specify any Image argument then there's no reason for us to try to mutate it into something useful. Just leave it as an empty string so later consumers can also do a useful check.
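The idea in this commit can be illustrated with a small sketch. It is written in Python for illustration and is not taken from mantle's Go code; `normalize_image` is a hypothetical name, and expanding a short name into a project-qualified path is an assumption about what the mutation looks like.

```python
def normalize_image(image: str, project: str) -> str:
    """Expand a short image name, but leave an unset value alone (sketch)."""
    if image == "":
        # Nothing was provided: keep it empty so later consumers can
        # detect the missing argument with a simple check.
        return ""
    if "/" in image:
        # Already looks like a path; pass it through untouched.
        return image
    # Assumed expansion for a bare image name.
    return f"projects/{project}/global/images/{image}"
```

A subcommand that requires an image can then test for the empty string and fail with a clear message, instead of receiving a path built around a missing name.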
This new command allows us to change the deprecation state of images in GCP. See https://cloud.google.com/solutions/image-management-best-practices#deprecating_an_image
Fixes some whitespace.
03a4c60 to 1a3f913
rebased on top of latest master and addressed all code review comments - doing final testing now in a custom build pipeline
In cosalib/aws we don't try to add `s3://` and a prefix/path onto the provided bucket so let's not do it here either.
ok this is ready for review. cc @cgwalters since the [...]
                        help="""Flag this is Fedora CoreOS (or a derivative);
                        Currently enables SECURE_BOOT and UEFI_COMPATIBLE""",
                        action="store_true",
                        default=False)
Why have this be off by default?
See also openshift/installer#2921 which merged.
assuming we want to have SECURE_BOOT and UEFI_COMPATIBLE everywhere we could just get rid of the flag altogether and bake it in at a lower level.
though I'd prefer to do this in a follow up if you don't mind.
> assuming we want to have SECURE_BOOT and UEFI_COMPATIBLE everywhere we could just get rid of the flag altogether and bake it in at a lower level.

That's exactly the status quo today, it's baked into `ore` by default but you went out of the way to explicitly disable it.
Nevermind it's off by default, I was wrong.
follow-up to make `SECURE_BOOT` and `UEFI_COMPATIBLE` standard and remove the `--fcos` option: #1333
Requiring simultaneous changes to a pipeline and cosa is unfortunate. How hard would it be to make this compatible?
High level makes sense, though there are no plans to upload RHCOS as an image family right? GCP is so much better than AWS in this way though, in terms of having a global image identifier, not per region AMIs etc.
That said since we did branch and this only affects cloud image uploads I don't care too much; we can easily get a cosa build and pipeline updated in the same day. But it's usually worth at least trying to "ratchet" changes in a safe way if it's not too hard.
Right. No plans that I know of right now. Since the openshift-installer chooses the image for you I don't know if it would have any benefit.
I know right? It's much better than having to track AMI IDs across regions.
I think I'd prefer to not introduce compatibility code but I could see doing something like:
If you think it'd be worth it.
/lgtm
Carrying the two lines of code for a bit in order to avoid a potential pipeline stall would definitely be something I would have done, yes.
This should make it so our RHCOS pipeline won't hard fail until we get it updated.
added the suggested compat code - will merge once I get CI passing
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ashcrow, bgilbert, cgwalters, dustymabe
RHCOS pipeline is failing with
opened a downstream PR
In GCP we'd like to use Image Families [1] so that a user can specify an
Image Family (like `fedora-coreos-stable`) and that will always reference
the latest stable Fedora CoreOS GCP image. The way image families work
is that when the image family is referenced the latest non-deprecated
image from that Image Family is returned. We'd like to have only the
latest released image be in a non-deprecated state in our image families
at any given time.
The strategy for how we use this mechanism is that we create images
in our pipeline when the pipeline runs. The image will be associated
with an image family at that time (unfortunately you can't associate
an image with an image family afterwards), but we will immediately
mark it as deprecated so it won't be used. Upon running the release
pipeline we will mark the new image as ACTIVE and mark the old ACTIVE
image as DEPRECATED.
[1] https://cloud.google.com/compute/docs/images#image_families
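The lifecycle described in this commit message can be modelled in a few lines. This is a toy in-memory model, not a call into the GCP API: `Image`, `resolve_family`, and `release` are hypothetical names, the image names are made up, and an integer `created` field stands in for the creation timestamp.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Image:
    name: str
    family: str
    created: int               # stand-in for creation time; larger is newer
    state: str = "DEPRECATED"  # pipeline uploads start out deprecated


def resolve_family(images: List[Image], family: str) -> Optional[Image]:
    """Return the newest ACTIVE image in the family, or None (toy model)."""
    active = [i for i in images if i.family == family and i.state == "ACTIVE"]
    return max(active, key=lambda i: i.created, default=None)


def release(images: List[Image], family: str, new_name: str) -> None:
    """Deprecate the family's current ACTIVE images, then activate new_name."""
    for img in images:
        if img.family == family and img.state == "ACTIVE":
            img.state = "DEPRECATED"
    for img in images:
        if img.family == family and img.name == new_name:
            img.state = "ACTIVE"


images = [
    Image("fcos-a", "fedora-coreos-stable", created=1, state="ACTIVE"),
    Image("fcos-b", "fedora-coreos-stable", created=2),  # uploaded, unreleased
]
before = resolve_family(images, "fedora-coreos-stable").name
release(images, "fedora-coreos-stable", "fcos-b")
after = resolve_family(images, "fedora-coreos-stable").name
```

Here `before` is the older image even though a newer one has already been uploaded, because the new upload starts out deprecated; only after `release` does the family resolve to it.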