[Feature Request] Open Source AWSTOE #102

Open
commiterate opened this issue Sep 1, 2024 · 5 comments
@commiterate

commiterate commented Sep 1, 2024

Feature Request

Open source AWSTOE.

This would let third parties add support for additional operating systems (e.g. macOS, NixOS, Alpine Linux), either through code contributions or through package distribution (e.g. packaging ecosystems that prefer building from source over pre-compiled binaries). For AWS, this could quickly expand the list of supported operating systems.

This will also bring AWSTOE in line with other open sourced AWS agents, including:

Additional Context

EC2 Image Builder currently provides the only avenue for users to manage EC2 AMIs and AMI lifecycle policies purely through AWS-provided CloudFormation resources (e.g. AWS::ImageBuilder::Image, AWS::ImageBuilder::LifecyclePolicy).

In comparison, other AMI baking solutions such as S3 → VM image import or HashiCorp Packer require a lot of extra supporting infrastructure to properly track and clean up old AMIs.

macOS in particular can't even use the S3 → VM image import path, since neither the original EC2 ImportImage API nor the EC2 Image Builder ImportVmImage API supports macOS VM images.

These APIs also might not support AArch64 (a.k.a. ARM64) VM images, since the docs imply that ImportVmImage is really just a convenience wrapper around ImportImage, which doesn't support importing ARM64 VM images.

This makes Image Builder particularly attractive because it supports i386, x86-64, and AArch64.

Unfortunately, Image Builder seems to require AWSTOE to work. This is because CreateImageRecipe requires at least one Image Builder component (CreateImageRecipeRequest.components) to be specified.

As a result, users can't work around this by specifying only CreateImageRecipeRequest.additionalInstanceConfiguration.userDataOverride to set up the instance and having Image Builder create AMIs without AWSTOE (it would just need to snapshot the instance after the user data script completes).
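
To make the constraint concrete, here is a minimal sketch of a CreateImageRecipe-shaped request built as a plain Python dict (no AWS call is made). The helper name `build_recipe_request`, the placeholder AMI ID and component ARN, and the base64 encoding of the user data are assumptions for illustration, not something confirmed in this thread.

```python
import base64


def build_recipe_request(name, version, parent_image, user_data, component_arns):
    """Build keyword arguments shaped like a CreateImageRecipe request.

    Hypothetical helper: it only assembles a dict and never contacts AWS.
    """
    if not component_arns:
        # Mirrors the constraint described above: at least one component.
        raise ValueError("CreateImageRecipe needs at least one component")
    return {
        "name": name,
        "semanticVersion": version,
        "parentImage": parent_image,
        "components": [{"componentArn": arn} for arn in component_arns],
        "additionalInstanceConfiguration": {
            # Assumption: the script is passed base64-encoded.
            "userDataOverride": base64.b64encode(user_data.encode()).decode(),
        },
    }


# Placeholder values, not real resources.
request = build_recipe_request(
    "nixos-base",
    "1.0.0",
    "ami-0123456789abcdef0",
    "#!/bin/sh\necho configured\n",
    ["arn:aws:imagebuilder:us-east-1:123456789012:component/example/1.0.0/1"],
)
```

Passing an empty component list raises `ValueError` here, which is the local stand-in for the API rejecting a recipe that only sets `userDataOverride`.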

In particular, I was looking to create a NixOS AMI by only specifying a user data script and have Image Builder snapshot the resulting instance.

In short:

  • Nix is a package manager that's a bit like Amazon's Brazil package manager (mentioned publicly in the AWS Builders' Library).
  • NixOS is a Linux distribution that uses Nix as its core package manager (takes the role of apt, dnf, yum, etc.).
    • Think of declaring a Linux distribution from a Brazil Config file, Brazil VSR, and Apollo VFI.

Nix doesn't install packages to Filesystem Hierarchy Standard (FHS) directories like /usr/bin, which can cause problems for software that hardcodes paths, such as the Amazon CloudWatch Agent's start-amazon-cloudwatch-agent binary. See aws/amazon-cloudwatch-agent#1319 for an explanation of Nix install paths and the CloudWatch Agent issue.

(cc: @arianvp NixOS AMI and amazon-init NixOS service module maintainer)

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@arianvp

arianvp commented Sep 1, 2024

FWIW, I would suggest just using https://github.com/nixos/amis with a custom NixOS config. Booting up a NixOS image with user data and then snapshotting it sounds very wasteful when you can build and upload a snapshot directly using a Nix build.

You can do:

nix run github:NixOS/amis#upload-ami -- --s3-bucket my-bucket --image-info $(nix build .#nixosConfigurations.config.system.build.amazonImage)

Indeed, you can't use ImportImage, but ImportSnapshot works fine, and that's what we use for NixOS AMI uploads.
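
For readers unfamiliar with the ImportSnapshot route, the two requests involved (import a disk image from S3 as an EBS snapshot, then register an AMI backed by that snapshot) can be sketched as plain Python dicts shaped like the EC2 API parameters. This is a rough illustration only and is not taken from the NixOS/amis code; the helper names, bucket, key, snapshot ID, and the `/dev/xvda` device name are all assumptions.

```python
def import_snapshot_request(bucket, key, disk_format="VHD"):
    """Keyword arguments shaped like an EC2 ImportSnapshot request."""
    if disk_format not in {"VHD", "VMDK", "RAW"}:
        raise ValueError(f"unsupported disk format: {disk_format}")
    return {
        "Description": f"import of s3://{bucket}/{key}",
        "DiskContainer": {
            "Format": disk_format,
            "UserBucket": {"S3Bucket": bucket, "S3Key": key},
        },
    }


def register_image_request(name, snapshot_id, architecture="arm64"):
    """Keyword arguments shaped like an EC2 RegisterImage request."""
    return {
        "Name": name,
        "Architecture": architecture,
        "VirtualizationType": "hvm",
        "EnaSupport": True,
        # Assumption: root device name chosen for illustration only.
        "RootDeviceName": "/dev/xvda",
        "BlockDeviceMappings": [
            {"DeviceName": "/dev/xvda", "Ebs": {"SnapshotId": snapshot_id}},
        ],
    }


# Placeholder values, not real resources.
snapshot_req = import_snapshot_request("my-bucket", "nixos.vhd")
image_req = register_image_request("nixos-custom", "snap-0123456789abcdef0")
```

In a real pipeline the snapshot ID comes from the completed import task, so the second request can only be built after the first has finished.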

@commiterate

commiterate commented Sep 1, 2024

Oh nice, I didn't realise ImportSnapshot existed. It seems to have far fewer restrictions.

I agree booting up a NixOS instance in EC2 for this is wasteful, but the cost is small compared to the significant quality of life benefits CloudFormation brings in CI/CD environments.

In particular, CloudFormation will auto-delete the old AMI if a new one needs to replace it during a stack update. Image Builder also automates a lot of extra things, like AMI metadata and sharing/distribution, which we would otherwise need to figure out how to do with direct API calls.

When combined with an UpdatePolicy on AWS::AutoScaling::AutoScalingGroup, I can have CloudFormation do this for each stack update:

  1. Create the new AMI.
  2. Create a new EC2 launch template with the new AMI.
  3. Do a rolling instance replacement on the auto-scaling group.
  4. Delete the old EC2 launch template.
  5. Delete the old AMI.
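
The steps above can be sketched as a CloudFormation template generated from Python. This is a minimal sketch under stated assumptions: the logical IDs (`RunnerImage`, `RunnerLaunchTemplate`, `RunnerGroup`), the instance type, the group sizes, and the ARNs and subnet ID in the usage line are hypothetical placeholders, and the recipe and infrastructure configuration are assumed to exist already.

```python
import json


def build_template(recipe_arn, infra_arn, subnet_ids, instance_type="t4g.small"):
    """Return a CloudFormation template dict wiring image -> template -> ASG."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            # Image Builder produces the AMI from an existing recipe.
            "RunnerImage": {
                "Type": "AWS::ImageBuilder::Image",
                "Properties": {
                    "ImageRecipeArn": recipe_arn,
                    "InfrastructureConfigurationArn": infra_arn,
                },
            },
            # The launch template picks up whatever AMI the image resource built.
            "RunnerLaunchTemplate": {
                "Type": "AWS::EC2::LaunchTemplate",
                "Properties": {
                    "LaunchTemplateData": {
                        "ImageId": {"Fn::GetAtt": ["RunnerImage", "ImageId"]},
                        "InstanceType": instance_type,
                    },
                },
            },
            # The UpdatePolicy asks for a rolling instance replacement.
            "RunnerGroup": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "UpdatePolicy": {
                    "AutoScalingRollingUpdate": {
                        "MinInstancesInService": 1,
                        "MaxBatchSize": 1,
                    },
                },
                "Properties": {
                    "MinSize": "1",
                    "MaxSize": "2",
                    "VPCZoneIdentifier": subnet_ids,
                    "LaunchTemplate": {
                        "LaunchTemplateId": {"Ref": "RunnerLaunchTemplate"},
                        "Version": {
                            "Fn::GetAtt": ["RunnerLaunchTemplate", "LatestVersionNumber"]
                        },
                    },
                },
            },
        },
    }


# Placeholder ARNs and subnet ID, not real resources.
template = build_template(
    "arn:aws:imagebuilder:us-east-1:123456789012:image-recipe/example/1.0.0",
    "arn:aws:imagebuilder:us-east-1:123456789012:infrastructure-configuration/example",
    ["subnet-0123456789abcdef0"],
)
print(json.dumps(template, indent=2))
```

Whether CloudFormation replaces the launch template or only adds a new version depends on which properties change between deployments; this sketch does not try to pin that down.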

Doing local NixOS VM image builds with NixOS/amis or nixos-generators also isn't always feasible, since it requires either multiple systems or a single system that has emulation enabled (e.g. binfmt, which NixOS users enable with the boot.binfmt.emulatedSystems option) for emulated native compilation.

To be clear, I don't think the official NixOS AMI build pipeline should swap over to EC2 Image Builder. We still need a starting AMI for Image Builder to work, so some bootstrap NixOS AMI is needed. With today's NixOS AMI setup, that would come from a local build or Hydra. Long term, that might come from installing Nix on, for example, the latest Amazon Linux and building on an EC2 instance.

Rather, it might make sense for the base NixOS AMIs to package AWSTOE out-of-the-box (once it's open sourced and packaged in Nixpkgs) so people can use those in Image Builder.


For context, I'm looking at setting up an auto-scaling GitLab Runner fleet in EC2 that's deployed with GitLab Pipelines. This runner fleet needs to support x86-64 + AArch64 Linux (NixOS or bring-your-own-userspace with OCI containers), AArch64 macOS, and x86-64 + AArch64 Windows.

Most people at my company use Apple Silicon MacBooks (so aarch64-darwin) or x86-64 Windows laptops, so they can't generate both x86-64 and AArch64 NixOS VM images locally (can't cross-compile easily). The aarch64-darwin machines might be able to if they use the NixOS/nix OCI container image to run the x86-64 variant through Rosetta, but it's neither straightforward nor reliable in the long run (what if Apple deprecates Rosetta?).

Essentially, we have to assume that:

  1. The bootstrap machine (either a local machine or a one-off GitLab runner) can only build and upload CloudFormation templates.
    • This bootstrap process has to be well documented as we may need to repeat it if the setup becomes broken or if our company spins up another self-hosted GitLab instance.
  2. There is a base NixOS AMI we can use to build customized NixOS AMIs (e.g. has gitlab-runner installed before boot instead of installing it at boot with amazon-init + EC2 user data).

Even if we could easily generate all the NixOS VM images locally, we still need to handle macOS and Windows. It's much easier to unify everything in Image Builder (it can handle Linux and Windows today, and I imagine macOS is planned) than to have a Nix-specific AMI build process plus Image Builder for the rest.

@arianvp

arianvp commented Sep 1, 2024

Feel free to bounce ideas off me in #aws:nixos.org on Matrix. We're running a setup with pretty similar requirements, just with GitHub Actions instead of GitLab.

@commiterate

Spoke with a Principal Engineer for Image Builder.

Open sourcing AWSTOE is currently not on the roadmap. Image Builder supports workflows that use only the SSM Agent, unlike component-based recipes, which use both the SSM Agent and AWSTOE, so this isn't as necessary for NixOS image build pipelines.
