Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for importing generic cloud images #182

Merged
merged 5 commits into from
Sep 12, 2022

Conversation

cevich
Copy link
Member

@cevich cevich commented Aug 24, 2022

DEPENDS ON:
#178

Downstream CI needs dictate early testing on new Fedora releases at the
Beta stage or earlier. Unfortunately at the time of this commit, the
Fedora cloud-sig does not provide ready-made beta images in AWS EC2.

Add a new image-build stage called import_images, that takes care of
bringing the generic images into AWS. Update the base_images stage
such that it will locate and use the imported images. Unify all the
build-stage wrapper scripts into a single/common ci/make.sh wrapper.

Finally, make a few semi-related comment/readability improvements in
various places.


Workaround packer amazon_import post-processor bug

Ref: https://github.com/hashicorp/packer-plugin-amazon/issues
number 264

Implement a rough and ready set of make targets as a stand-in for
the broken packer amazon_import post-processor. This works, but
is fairly fragile: There are several failure modes which cannot
easily be checked automatically. For example, the s3 upload can
silently fail and/or store corrupted data.

When the packer bug is resolved, this commit can most probably be
safely reverted. This should make the process entirely automated
(via the automation_images PR workflow), possibly with only
minimal updates.

Add a import_images/README.md reference for the semi-manual
process.

@cevich cevich requested a review from lsm5 August 24, 2022 21:11
@cevich cevich mentioned this pull request Aug 24, 2022
@cevich cevich force-pushed the import_images branch 14 times, most recently from 30c6ca8 to afe5a79 Compare August 30, 2022 19:59
@cevich
Copy link
Member Author

cevich commented Aug 30, 2022

Yay progress, I decoded that message and it's actually surprisingly helpful. I think I can fix the "permission denied" with a simple IAM policy update.

@cevich
Copy link
Member Author

cevich commented Aug 30, 2022

status code: 400, request id: a4757af1-9640-4e69-989e-4b9722860e3d

Blah! That's not helpful at all Amazon 😲

@cevich
Copy link
Member Author

cevich commented Aug 31, 2022

Found the error in AWS CloudTrail:

    "errorCode": "Client.InvalidParameter",
    "errorMessage": "The service role vmimport provided does not exist or does not have sufficient permissions",

import_images/handle_image.sh Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
Copy link
Member

@lsm5 lsm5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (to the best of my experience :D). Will it be possible to see the results of image import in github comments prior to merge, or would that be only a post-merge thing?

@cevich
Copy link
Member Author

cevich commented Aug 31, 2022

LGTM (to the best of my experience :D). Will it be possible to see the results of image import in github comments prior to merge, or would that be only a post-merge thing?

This is mostly all brand-new to me, so our experience level is pretty close with this. Also why it's taking me so long to get it working 😞

Assuming I can get it working, this will allow us to support rawhide and development images in AWS and/or GCP. It removes all dependence on the fedora cloud-sig publishing images into AWS for us to consume. It modifies the build workflow to go like:

import_images -> base_images -> cache_images

So yes, everything will show up in the handy table 😁

@cevich
Copy link
Member Author

cevich commented Aug 31, 2022

Some slight progress, now the error (from AWS) says failed with status message: ClientError: Unknown OS / Missing OS files., error: ResourceNotReady: failed waiting for successful resource state

@cevich
Copy link
Member Author

cevich commented Aug 31, 2022

and retry count exhausted. Last err: InvalidParameter: Parameter architecture = arm64 has an invalid format.

@cevich
Copy link
Member Author

cevich commented Aug 31, 2022

@cevich
Copy link
Member Author

cevich commented Aug 31, 2022

Force-push: Trying a different disk-format (VHDX) instead of raw.

import_images/handle_image.sh Outdated Show resolved Hide resolved
import_images/handle_image.sh Outdated Show resolved Hide resolved
@lsm5
Copy link
Member

lsm5 commented Sep 9, 2022

import_images failed with:

You must specify a region. You can also configure your region by running "aws configure".
make: *** [Makefile:266: /tmp/automation_images_tmp/fedora-aws-1662750353.import_task_id] Error 255

Is this expected to be grabbed from the user's aws config? Or is it preset in automation_images?

@cevich
Copy link
Member Author

cevich commented Sep 12, 2022

Is this expected to be grabbed from the user's aws config? Or is it preset in automation_images?

Ahh right, okay. It's expected to be in the config file. However we only support one region: us-east-1. So I should probably just force that on all the aws command-lines.

@cevich
Copy link
Member Author

cevich commented Sep 12, 2022

BTW: Thanks @lsm5 for such a through look-though, I really appreciate the oversight.

@lsm5
Copy link
Member

lsm5 commented Sep 12, 2022

BTW: Thanks @lsm5 for such a through look-though, I really appreciate the oversight.

not really that thorough TBH :D , but happy to be useful. I will give this another run once you have the us-east-1 change pushed.

@cevich
Copy link
Member Author

cevich commented Sep 12, 2022

I will give this another run once you have the us-east-1 change pushed.

Thanks. Just testing them out now, along with a change to the way we upload the image to S3. I found a way to get S3 to verify a checksum on upload and retry on mismatch. That will make the upload more reliable.

At the time of this commit, there is a problem with aiohttp in python
3.11 shipping with F37.

aio-libs/aiohttp#6600

Temporarily force this particular container image to use F36 until the
problem is corrected.

Signed-off-by: Chris Evich <[email protected]>
***DEPENDS ON:***
containers#178

Downstream CI needs dictate early testing on new Fedora releases at the
Beta stage or earlier.  Unfortunately at the time of this commit, the
Fedora cloud-sig does not provide ready-made beta images in AWS EC2.

Add a new image-build stage called `import_images`, that takes care of
bringing the generic F37 cloud images into AWS.  Update the
`base_images` stage such that it will locate and use the imported
images.  Unify all the build-stage wrapper scripts into a
single/common `ci/make.sh` wrapper.

Finally, make a few semi-related comment/readability improvements in
various places.

Signed-off-by: Chris Evich <[email protected]>
Ref: https://github.com/hashicorp/packer-plugin-amazon/issues
     number 264

Implement a rough and ready set of make targets as a stand-in for
the broken packer amazon_import post-processor.  This works, but
is fairly fragile:  There are several failure modes which cannot
easily be checked automatically.  For example, the s3 upload can
silently fail and/or store corrupted data.

When the packer bug is resolved, this commit can most probably be
safely  reverted.  This should make the process entirely automated
(via the `automation_images` PR workflow), possibly with only
minimal updates.

Add a `import_images/README.md` reference for the semi-manual
process.

Signed-off-by: Chris Evich <[email protected]>
@cevich
Copy link
Member Author

cevich commented Sep 12, 2022

@lsm5 okay, I addressed all your concerns, fixed the region problem, and added support for verifying the MD5 of uploaded images before conversion. Testing still shows the upload is somewhat unreliable and a few times the AWS vmimport service just timed out or puked on otherwise valid content. It's almost as if they didn't want anybody to automate this 😢

Note: I rebased this onto #192 so it can all merge at once (assuming it's ready).

@lsm5
Copy link
Member

lsm5 commented Sep 12, 2022

ack, retrying...

Copy link
Member

@lsm5 lsm5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image import(s) successful.
############################################################
Please update Makefile value:

    FEDORA_IMPORT_IMG_SFX = 1663007754
############################################################
rm /tmp/automation_images_tmp/fedora-aws-1663007754.uploaded /tmp/automation_images_tmp/fedora-aws-arm64-1663007754.import_task_id /tmp/automation_images_tmp/fedora-aws-1663007754.ami.name /tmp/automation_images_tmp/fedora-aws-arm64-1663007754.ami.name /tmp/automation_images_tmp/fedora-aws-arm64-1663007754.uploaded /tmp/automation_images_tmp/fedora-aws-1663007754.md5 /tmp/automation_images_tmp/fedora-aws-arm64-1663007754.md5 /tmp/automation_images_tmp/fedora-aws-1663007754.import_task_id

/lgtm

@lsm5
Copy link
Member

lsm5 commented Sep 12, 2022

/hold

@github-actions
Copy link

Cirrus CI build successful. Found built image names and IDs:

Stage Image Name IMAGE_SUFFIX
base fedora b5121240166825984
base fedora-aws b5121240166825984
base fedora-aws-arm64 b5121240166825984
base image-builder b5121240166825984
base prior-fedora b5121240166825984
base ubuntu b5121240166825984
cache build-push c5121240166825984
cache fedora c5121240166825984
cache fedora-aws c5121240166825984
cache fedora-netavark c5121240166825984
cache fedora-netavark-aws-arm64 c5121240166825984
cache fedora-podman-aws-arm64 c5121240166825984
cache fedora-podman-py c5121240166825984
cache prior-fedora c5121240166825984
cache ubuntu c5121240166825984

@cevich
Copy link
Member Author

cevich commented Sep 12, 2022

There we go, okay perfect. Thanks again @lsm5

@cevich cevich merged commit c0801db into containers:main Sep 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants