Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial pass at populating the os version label #514

Merged

Conversation

rphillips
Copy link
Contributor

@rphillips rphillips commented Mar 1, 2019

- What I did

Enumerates the OS version from /etc/os-release and creates a node.openshift.io/os_version={OS_VERSION} label.

- How to verify it
oc get node [node name] -o yaml

- Description for the changelog
Add an initial OS Version to label (node.openshift.io/os_version={OS_VERSION}) to nodes.

- Some background info
Kubelet --node-labels only get created on kubelet registration, so changes will need to happen with a daemonset. Liggit has a number of comments in upstream saying the kubelet should not update labels, so a custom patch would probably never be merged upstream (IMO). There are differing opinions on the ticket.

This PR is phase 1) create the label with the starting version. The second phase will be to create a daemonset to update the label on RHEL7->RHEL8 upgrades. A final third phase, when RHEL7 is not supported anymore, we can retire the daemonset and delete the code/daemonset.

/cc @jeremyeder @sjenning @smarterclayton @derekwaynecarr

@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 1, 2019
Copy link
Contributor

@kikisdeliveryservice kikisdeliveryservice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments in line.

Also is there an open issue somewhere for this? Also since there is a bug fix is there approval to put this in 4.0 because the MCO is putting all new enchancements/features into our master-4.1 branch at this time.

cc: @runcom

@kikisdeliveryservice
Copy link
Contributor

Until there is confirmation that this should go into 4.0

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 1, 2019
@rphillips rphillips changed the base branch from master to master-4.1 March 1, 2019 02:09
@rphillips rphillips force-pushed the feat/add_version_label branch from ab33c57 to 40755e2 Compare March 1, 2019 02:11
@rphillips
Copy link
Contributor Author

@kikisdeliveryservice changed to the 4.1 branch

@rphillips
Copy link
Contributor Author

There has been some discussion on aos-devel regarding how networking can figure out if the underlying OS is RHEL7 or RHEL8.

@kikisdeliveryservice
Copy link
Contributor

Thanks @rphillips !
/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 1, 2019
@rphillips rphillips force-pushed the feat/add_version_label branch from 40755e2 to e4f2339 Compare March 1, 2019 02:31
@kikisdeliveryservice kikisdeliveryservice self-requested a review March 1, 2019 02:40
@kikisdeliveryservice kikisdeliveryservice dismissed their stale review March 1, 2019 02:42

changes made as requested, will allow others to review.

@derekwaynecarr
Copy link
Member

this is only needed for 4.1. In 4.0, we know a node not labeled is rhel7, and in 4.1, pods that require rhel8 will have nodeSelectors that will hold scheduling until a label is present. In 4.1, rhel7 workers are new nodes, and would have newly registered labels which should be fine with general approach.

@@ -20,7 +20,7 @@ contents: |
--container-runtime=remote \
--container-runtime-endpoint=/var/run/crio/crio.sock \
--allow-privileged \
--node-labels=node-role.kubernetes.io/worker \
--node-labels=node-role.kubernetes.io/worker,node.openshift.io/os_version=${OS_VERSION} \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just add /etc/os-release as an EnvironmentFile and pull it out from there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that does simplify things. fixed!

@rphillips rphillips force-pushed the feat/add_version_label branch from e4f2339 to 44f321c Compare March 1, 2019 14:38
@openshift-ci-robot openshift-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 1, 2019
@sdodson
Copy link
Member

sdodson commented Mar 5, 2019

Should we add similar support so that we can differentiate between RHCOS and RHEL Server or other?

@rphillips
Copy link
Contributor Author

@sdodson which variable would you like to add from os-version?

@ashcrow
Copy link
Member

ashcrow commented Mar 6, 2019

This looks good to me. Just to remind folks, you'll end up getting a version like this on RHCOS:

400.7.20190301.0

Related:

@sdodson
Copy link
Member

sdodson commented Mar 6, 2019

The variable ID from /etc/os-release seems like a good source for the value but we probably want to name the label differently. os_id perhaps?

@rphillips
Copy link
Contributor Author

@sdodson My general thought was to have os_version correspond to ${VERSION} within the os-release. If we want to add more variables from os-release, then they might have a 1 to 1 mapping. os_id would map to ${ID} within os-release.

@sdodson
Copy link
Member

sdodson commented Mar 7, 2019

Yes, that's what I meant, an additional label for os_id.

@rphillips
Copy link
Contributor Author

rphillips commented Mar 7, 2019

I think this PR is ready for approval unless there are any further comments.

Edit: I can add that id in.

@rphillips
Copy link
Contributor Author

@sdodson added the ID

@rphillips
Copy link
Contributor Author

/retest

2 similar comments
@rphillips
Copy link
Contributor Author

/retest

@rphillips
Copy link
Contributor Author

/retest

@rphillips rphillips changed the base branch from master-4.1 to master March 7, 2019 20:04
@rphillips
Copy link
Contributor Author

@kikisdeliveryservice after talking to @sjenning, we need this in 4.0.

I've switched the target branch back to master.

@kikisdeliveryservice
Copy link
Contributor

thanks for the update @rphillips

Could you amend the body of the commit messages to add a little more info? For ex, os version using: "Enumerates the OS version from /etc/os-release and creates a node.openshift.io/os_version={OS_VERSION} label." and something substantive for the id label commit.

@kikisdeliveryservice
Copy link
Contributor

@runcom @ashcrow @cgwalters

Any comments on this PR?

Some operators will need to know the underlying operating system version
to work (ie: networking, etc). This information is especially needed if
nodes are upgraded from RHEL7 to RHEL8.

The kubelet adds labels when it registers for the first time. There
will be changes in the machine-config-daemon or NFD to support
relabelling the node when the OS has upgraded.
@rphillips rphillips force-pushed the feat/add_version_label branch from 5795ac9 to e781673 Compare March 7, 2019 21:04
@rphillips
Copy link
Contributor Author

@kikisdeliveryservice updated the commit message and comment.

@kikisdeliveryservice
Copy link
Contributor

kikisdeliveryservice commented Mar 7, 2019

thanks @rphillips !

This looks good. Can verify that labels were created for each node. Checking e2e-aws/artifacts/nodes.json. For ex:

                "labels": {
                    "beta.kubernetes.io/arch": "amd64",
                    "beta.kubernetes.io/instance-type": "m4.xlarge",
                    "beta.kubernetes.io/os": "linux",
                    "failure-domain.beta.kubernetes.io/region": "us-east-1",
                    "failure-domain.beta.kubernetes.io/zone": "us-east-1b",
                    "kubernetes.io/hostname": "ip-10-0-154-8",
                    "node-role.kubernetes.io/master": "",
                    "node.openshift.io/os_id": "rhcos",
                    "node.openshift.io/os_version": "400.7.20190306.0"

@runcom runcom added the 4.0 label Mar 7, 2019
@runcom
Copy link
Member

runcom commented Mar 7, 2019

code lgtm, I'd love @sjenning to approve

@sjenning
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 12, 2019
@kikisdeliveryservice
Copy link
Contributor

/approve

@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kikisdeliveryservice, rphillips, sjenning

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [kikisdeliveryservice]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 12, 2019
@openshift-merge-robot openshift-merge-robot merged commit 2c07ed4 into openshift:master Mar 12, 2019
@vrutkovs
Copy link
Member

/test e2e-rhel-scaleup

@@ -19,7 +20,7 @@ contents: |
--container-runtime=remote \
--container-runtime-endpoint=/var/run/crio/crio.sock \
--allow-privileged \
--node-labels=node-role.kubernetes.io/master \
--node-labels=node-role.kubernetes.io/master,node.openshift.io/os_version=${VERSION},node.openshift.io/os_id=${ID} \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks RHEL scaleup:

kubelet_node_status.go:92] Unable to register node "ip-10-0-156-42.ec2.internal" with API server: Node "ip-10-0-156-42.ec2.internal" is invalid: metadata.labels: Invalid value: "7 (Core)": a valid label must be an empty string or consist of alphanumeric characters, ''-'', ''_'' or ''.'', and must start and end with an alphanumeric character (e.g. ''MyValue'',  or ''my_value'',  or ''12345'', regex used for validation is ''(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?'')'

cgwalters added a commit to cgwalters/machine-config-operator that referenced this pull request Apr 23, 2019
This removes the functionality added in
openshift#514
which I don't think really worked for what people needed; a node
selector can't parse the value.

All we care about is currently 7 or 8; this signals things like
`iptables` vs `nftables`.

And the presence of this label also signals "RHEL" too;
though we keep the `os_id` label in case e.g. a daemonset should only
run on RHCOS (or traditional RHEL).

Closes: openshift#582
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.