Switch to rhel-coreos (9) #3596
When we move from RHCOS 8 -> RHCOS 9, the SSH keys are not being written to the new location because:
1. When the upgrade configs are written to the node, it is still running RHCOS 8, so the keys are not being written to the new location.
2. The node reboots into RHCOS 9 to complete the upgrade.
3. The "are we on the latest config" functions detect that we are indeed on the latest config and so it does not attempt to perform an update.
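The three steps above can be modeled with a small Go sketch. It is not the MCO's code: the `node` type, the function names, and both key paths are assumptions made for illustration. It only shows why a check that compares config names cannot notice that the keys sit in the RHCOS 8 location after the reboot.

```go
package main

import "fmt"

// Assumed key locations, used only for illustration; the PR text does not spell them out.
const (
	rhcos8KeyPath = "/home/core/.ssh/authorized_keys"
	rhcos9KeyPath = "/home/core/.ssh/authorized_keys.d/ignition"
)

// node is a hypothetical, simplified view of the state the daemon inspects.
type node struct {
	osMajor       int
	currentConfig string
	desiredConfig string
	keysAt        string
}

// expectedKeyPath returns where the keys should live for the running OS.
func expectedKeyPath(osMajor int) string {
	if osMajor >= 9 {
		return rhcos9KeyPath
	}
	return rhcos8KeyPath
}

// needsUpdate stands in for the "are we on the latest config" check:
// it compares config names only and knows nothing about key locations.
func needsUpdate(n node) bool {
	return n.currentConfig != n.desiredConfig
}

// needsKeyMigration is the extra check that does not depend on config state.
func needsKeyMigration(n node) bool {
	return n.keysAt != expectedKeyPath(n.osMajor)
}

func main() {
	// Step 1: the new config is written while the node still runs RHCOS 8.
	n := node{osMajor: 8, currentConfig: "rendered-old", desiredConfig: "rendered-new"}
	n.keysAt = expectedKeyPath(n.osMajor)
	n.currentConfig = n.desiredConfig

	// Step 2: the node reboots into RHCOS 9.
	n.osMajor = 9

	// Step 3: the config check sees nothing to do, yet the keys are misplaced.
	fmt.Println("config update needed:", needsUpdate(n))
	fmt.Println("key migration needed:", needsKeyMigration(n))
}
```

The point of the sketch is that the key-location check has to run independently of the config comparison, which is the gap the SSH key commits in this PR address.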
/lgtm
@jupierce let's do this
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: cgwalters, sdodson
xref 1ad53a7 and the rename in openshift/machine-config-operator#3596
/cherry-pick release-4.13
@cgwalters: new pull request created: #3603
Forking from #3485
This version of the PR uses `rhel-coreos`, not `rhel-coreos-9`, per discussion.
ensures that RHCOS 9 SSH keys are in the right place
OKD release controller is out-of-date
ensures SSH keys get moved to the correct location
teaches TestIgn3Cfg about the new RHCOS 9 key path
checks perms for SSH key path dirs as well
Switch to rhel-coreos (9)
ref: https://issues.redhat.com/browse/COS-1983
We introduced a new `rhel-coreos` that is RHEL 9 to aid having a switch be an atomic operation. After design discussion we realized it's easier to have an "unversioned" image though, so this drops the `-8`.
daemon: Also override `kernel-modules-core`
Unfortunately rpm-ostree requires this right now; we have an issue and code to provide a better API in coreos/rpm-ostree#2542. But using that will require shipping the updated rpm-ostree in RHEL 8.6.z or at least OCP 4.12.z, which is problematic.
Because we know the new MCD will always be upgrading to RHEL9, for now let's update this hardcoded list. In the future we can detect when the running host has `--remove-installed-kernel` and use it instead.
openshift-azure-routes: Avoid synchronizing too quickly
Rapid file changes triggering the path unit can start the service here frequently, and then this can cause the start limit to be hit, and then systemd will refuse further activations (unless we bumped the limit).
I don't think we need to synchronize the iptables rules more than once every 3 seconds.