Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

capi: support Flatcar Container Linux #248

Merged
merged 10 commits into from
Dec 10, 2020

Conversation

dongsupark
Copy link
Member

@dongsupark dongsupark commented Jun 5, 2020

This is a cluster-API image builder for Flatcar Container Linux.

On the Packer side it includes packer/qemu/qemu-flatcar.json and packer/qemu/packer.json. In packer/config/ it also includes some other JSON files with configuration for ansible and packages installed with Ansible.

On the Ansible side there's some preliminary playbook in ansible/.

notes

  • images/capi/ansible/roles/setup/tasks/bootstrap-flatcar.yml installs python under /opt.
  • images/capi/ansible/roles/kubernetes/tasks/url.yml installs artificts of Kubernetes and CNI under /opt and /etc, mainly because /usr is read-only in Flatcar.
  • It unpacks only parts of containerd binaries, since Flatcar already has its own built-in containerd.
  • To make containerd work correctly, we install Flatcar-specific config files for containerd, so it listens on its containerd socket /run/docker/libcontainerd/docker-containerd.sock.
  • Since Flatcar has its own containerd socket, we need to also specify the socket name when running commands like kubeadm config images pull, /opt/bin/ctr images import as well as in /etc/crictl.yaml.
  • We do not delete /etc/kubeadm.yml for Flatcar, because later steps running kubeadm init require the config to be in place.

TODOs

  • Make sure it works for other platforms than qemu & AWS

/cc @vbatts @t-lo @dims @detiber @codenrhoden

@k8s-ci-robot k8s-ci-robot requested review from dims and detiber June 5, 2020 15:16
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 5, 2020
@k8s-ci-robot k8s-ci-robot requested a review from codenrhoden June 5, 2020 15:16
@k8s-ci-robot
Copy link
Contributor

@dongsupark: GitHub didn't allow me to request PR reviews from the following users: vbatts, t-lo.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

This is a cluster-API image builder for Flatcar Container Linux.

On the Packer side it includes packer/qemu/flatcar.json and packer/qemu/packer.json. In packer/config/ it also includes some other JSON files with configuration for ansible and packages installed with Ansible.

On the Ansible side there's some preliminary playbook in ansible/.

notes

  • It needs to disable gather_facts in images/capi/ansible/node.yml, because it is not possible to run python before running setup tasks. Note that python is not installed by default in Flatcar.
  • To make ansible_env.SUDO_USER work with gather_facts disabled, setup: filter=ansible_env is needed in images/capi/ansible/roles/sysprep/tasks/main.yml.
  • images/capi/ansible/roles/setup/tasks/bootstrap-flatcar.yml installs python under /opt.
  • images/capi/ansible/roles/kubernetes/tasks/url-flatcar.yml installs artificts of Kubernetes and CNI under /opt and /etc, mainly because /usr is read-only in Flatcar.
  • images/capi/ansible/roles/sysprep/tasks/main.yml sets hostname with use: systemd in case of Flatcar. It is a workaround to avoid hostname detection errors when running ansible in Flatcar. We fixed the issue in upstream Ansible, but it is not included in any Ansible release yet.
  • It unpacks only parts of containerd binaries, since Flatcar already has its own built-in containerd.
  • To make containerd work correctly, we install Flatcar-specific config files for containerd, so it listens on its containerd socket /run/docker/libcontainerd/docker-containerd.sock.
  • Since Flatcar has its own containerd socket, we need to also specify the socket name when running commands like kubeadm config images pull, /opt/bin/ctr images import as well as in /etc/crictl.yaml.
  • We do not delete /etc/kubeadm.yml for Flatcar, because later steps running kubeadm init require the config to be in place.

TODOs

  • Make sure it works for other platforms than qemu
  • Get Makefile support multiple Flatcar channels and versions

/cc @vbatts @t-lo @dims @detiber @codenrhoden

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Copy link
Contributor

Welcome @dongsupark!

It looks like this is your first PR to kubernetes-sigs/image-builder 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/image-builder has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jun 5, 2020
@dongsupark
Copy link
Member Author

/assign @vincepri

@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please sign in with your organization's credentials at https://identity.linuxfoundation.org/projects/cncf to be authorized.
  • If you have done the above and are still having issues with the CLA being reported as unsigned, please log a ticket with the Linux Foundation Helpdesk: https://support.linuxfoundation.org/
  • Should you encounter any issues with the Linux Foundation Helpdesk, send a message to the backup e-mail support address at: [email protected]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 8, 2020
@t-lo
Copy link
Contributor

t-lo commented Jun 8, 2020

Signed. Can we re-evaluate please?

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Jun 8, 2020
@vincepri vincepri removed their assignment Jun 10, 2020
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 16, 2020
@dongsupark dongsupark force-pushed the dongsu/capi-flatcar branch from 9e35bd9 to 15c3800 Compare June 16, 2020 08:23
@k8s-ci-robot k8s-ci-robot added do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jun 16, 2020
@dongsupark dongsupark force-pushed the dongsu/capi-flatcar branch from 15c3800 to 9737ee3 Compare June 16, 2020 08:25
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Jun 16, 2020
@dongsupark
Copy link
Member Author

Rebased, and added AWS AMI support.

@@ -0,0 +1,27 @@
-----BEGIN RSA PRIVATE KEY-----
MIIEogIBAAKCAQEA6NF8iallvQVp22WDkTkyrtvp9eWW6A8YVr+kz4TjGYe7gHzI
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dongsupark was this checked in by mistake?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the key is exactly the same key as the vagrant insecure ssh key.
It is needed for vagrant tests.

Now I am thinking, it might be better to rename the key to a descriptive name, e.g. vagrant-insecure-key.

make FLATCAR_CHANNEL="$channel" FLATCAR_VERSION="$release" \
build-${CAPI_PROVIDER}-flatcar
run_vagrant
elif [[ ${CAPI_PROVIDER} = "aws" ]] || [[ ${CAPI_PROVIDER} = "ami" ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any plans to add the same support for azure?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can add Azure support in the future.
Before that, I would like to get some feedback from maintainers about the current PR, to make sure we are doing in the right direction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed the ssh key file to vagrant-insecure-key.

Copy link
Contributor

@codenrhoden codenrhoden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, a lot to go over here, so I'm sure it will take a few rounds and some experimentation...

This is exciting to see Flatcar come in, and I'm really happy to see it making progress. I think the main stumbling blocks i see is the stuff around setting gather_facts to False for Ansible, and trying to set some default facts directly. As this PR currently stands, it's going to break the majority of existing builds (it will break all OVAs, All AMIs, and basically everything but Flatcar as far as I can tell). So that's the big thing we'll have to sort out.

On the templating side, there are several that are copied with a flatcar specific version that I think can remain as a single template and just use conditional to avoid having to change things int wo places.

Seeing this work is making me raise the priority of some other changes that are planned, that will make adding additional OS's to the AMI and QEMU builders easier. Thanks for this!

LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mentioned that Flatcar ships with containerd. Is there not already a unit file present for it? I'm concerned about there being confusion or conflicts between this and the additional drop-ins that are put in place today by

- name: Creates unit file directory
file:
path: /etc/systemd/system/containerd.service.d
state: directory
- name: Create containerd boot order drop in file
template:
dest: /etc/systemd/system/containerd.service.d/boot-order.conf
src: etc/systemd/system/containerd.service.d/boot-order.conf
- name: Create containerd memory pressure drop in file
template:
dest: /etc/systemd/system/containerd.service.d/memory-pressure.conf
src: etc/systemd/system/containerd.service.d/memory-pressure.conf
- name: Create containerd max tasks drop in file
template:
dest: /etc/systemd/system/containerd.service.d/max-tasks.conf
src: etc/systemd/system/containerd.service.d/max-tasks.conf

I already see one example where a specific setting is set twice (TasksMax).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mentioned that Flatcar ships with containerd. Is there not already a unit file present for it? I'm concerned about there being confusion or conflicts between this and the additional drop-ins that are put in place today...

Right. Flatcar already has built-in containerd and docker provided by torcx. It basically means that after boots it is not possible to modify any torcx artifacts, including the containerd's systemd unit file as well as the containerd config. However, we need to download containerd binary files, modify its config, so the new containerd points to the built-in gRPC socket provided by Docker. To do that, we need to install a new containerd unit file that points to a new containerd config that has path to the new containerd socket.

I admit it could look ugly, but I was not able to find a better solution.

@dongsupark
Copy link
Member Author

@codenrhoden Thanks a lot for your thorough review.
Will try to address the issues.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 1, 2020
dongsupark pushed a commit to kinvolk/image-builder that referenced this pull request Dec 1, 2020
We should be able to configure path to containerd CRI socket file,
`containerd_cri_socket`, for each distro. Its main use case is to
support distros with different CRI socket path, for example,
`/run/docker/libcontainerd/container.sock` for Flatcar. By default it
is `/var/run/containerd/containerd.sock` as it has been in all other
distros.

This PR is split out from
kubernetes-sigs#248, as suggested
by codenrhoden.
@dongsupark dongsupark force-pushed the dongsu/capi-flatcar branch 2 times, most recently from 037b956 to d56f11b Compare December 3, 2020 16:27
@dongsupark
Copy link
Member Author

@codenrhoden Rebased the PR. Please have a look.

Dongsu Park and others added 10 commits December 7, 2020 13:15
This is a cluster-API image builder for
[Flatcar Container Linux](https://www.flatcar-linux.org).

On the Packer side it includes `packer/qemu/flatcar.json` and
`packer/qemu/packer.json`. In `packer/config/` it also includes some
other JSON files with configuration for ansible and packages installed
with Ansible.

On the Ansible side there's some preliminary playbook in `ansible/`.
This change adds support for building all channels of Flatcar
Container Linux. It also fixes an issue that made the build fail
when make was called directly instead of using the wrapper script
hack/image-build-flatcar.sh.

Define an env variable `CAPI_PRIVIDER`, which is `aws`, `ami` or `qemu`.
Default is `qemu`.

Run vagrant commands only when the provider is qemu.
If it is `aws` or `ami`, run `make build-ami-default`.
Otherwise fail.
We do not need to necessarily install and unpack cri-containerd tarball,
if we have containerd 1.3.
Simply create symlinks to existing binaries from `/usr/bin` to
`/opt/bin`, and set correct paths to them.

Note, it only works for containerd 1.3, current Beta (2605.3.0) and
Alpha (2605.1.0). OTOH stable (2512.4.0) with containerd 1.1 does not
work with this approach yet.
In case of Flatcar, if the given containerd version does not match with
the running containerd version, the image-builder should fail hard.
A version mismatch with the current version.

So it means we need to always set a correct version from Packer side, to
make the containerd installation work for Flatcar.
Also unpack `containerd-shim-runc-v[12]` binaries from the
cri-containerd-cni tarball, because Flatcar by default does not
provide the runc shim binaries. Without the shims, containerd does not
work correctly.

Also We need to create {{sysusr_prefix}}/bin before unpacking containerd
tarballs, to avoid issues happening when unpacking containerd tarballs
into non-existent directories.
We should be able to set Flatcar channels and versions via
`build_name`, `FLATCAR_CHANNEL`, or `FLATCAR_VERSION`, to use
also non-default channels like alpha.
We should use `build_name` as-is, to avoid issues that can happen
in builds for other distros.
Since we started simply using the `build_name` variable, we do not
need to have `only` directives for checking for Flatcar. Simply remove
the `only` directives.
We should run bootstrap-flatcar.sh only in case of Flatcar.

We need to check the if clauses in `exec_command`.
It is not possible to check for the `BUILD_NAME` in `inline` or
`script`.
We should not embed vagrant insecure key in the PR.
Instead we simply download the keypair from the official vagrant github
repo, to use them for ignition config for Flatcar.

Ignition template file `ignition-builder.json` will include a
pre-defined string `BUILDERSSHAUTHKEY`, which will be then replaced with
the ssh public key given by Packer.
@codenrhoden
Copy link
Contributor

/lgtm

I think this is ready to go! I'll leave this for about 24 hours or so and see if anyone has anything else to say, but I'm happy to merge and iterate. Huge effort @dongsupark!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 7, 2020
@codenrhoden
Copy link
Contributor

/approve

🎉 🚀

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: codenrhoden, dongsupark

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 9, 2020
@dongsupark
Copy link
Member Author

/retest

@k8s-ci-robot
Copy link
Contributor

@dongsupark: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jsturtevant
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Dec 10, 2020
@k8s-ci-robot k8s-ci-robot merged commit fda3464 into kubernetes-sigs:master Dec 10, 2020
@dongsupark dongsupark deleted the dongsu/capi-flatcar branch December 11, 2020 08:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.