Nodeup can't find container-selinux-2.68-1.el7.noarch.rpm when trying to bootstrap a new node to a cluster #7608
I'm seeing this as well.
We are seeing this issue as well. It looks like this package was removed from the CentOS repo, returning a 404:
This causes a major issue for autoscaling (cluster-autoscaler), which takes down nodes, and new ones never join the cluster. Ideally, for resiliency, kops should not resolve artifacts required for nodeup/bootstrapping from public repos at node runtime. Not sure if this is the way to go, but possibly consider placing such critical rpms/binaries in the state store during init and fetching them from there at runtime?
I noticed that the recent container-selinux issue on CentOS was reporting a hash mismatch rather than a 404. See the error message here: kubernetes#7608. The "actual" sha1 response is that of the 404 page:
```
curl http://mirror.centos.org/centos/7/extras/x86_64/Packages/container-selinux-2.68-1.el7.noarch.rpm
curl http://mirror.centos.org/centos/7/extras/x86_64/Packages/container-selinux-2.68-1.el7.noarch.rpm | shasum -a 1
```
Experiencing this in a production cluster as well. Is there any way to fast-track this? Added a PR.
A manual workaround is downloading the missing RPM from a working node. Some CentOS mirror sites might still have the old RPM file; see: https://mirror-status.centos.org/
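A minimal sketch of that manual approach, run over ssh on the affected node. The vault.centos.org URL and the /var/cache/nodeup path are assumptions, not confirmed by this thread — check a working node for the exact cache location and filename nodeup expects:
```
# Assumption: nodeup keeps its download cache under /var/cache/nodeup
# and skips the download when the expected file is already present.
sudo mkdir -p /var/cache/nodeup
sudo curl -fLo /var/cache/nodeup/container-selinux-2.68-1.el7.noarch.rpm \
  http://vault.centos.org/7.6.1810/extras/x86_64/Packages/container-selinux-2.68-1.el7.noarch.rpm

# Alternatively, copy the file straight from a node that still has it cached:
# scp <working-node>:/var/cache/nodeup/container-selinux-2.68-1.el7.noarch.rpm /tmp/ \
#   && sudo mv /tmp/container-selinux-2.68-1.el7.noarch.rpm /var/cache/nodeup/
```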
This has just bitten us as well; #7609 should resolve it, however.
@rdjy Thanks for the answer, it did the trick for us.
Now that #7609 is merged, how can I leverage this change? Do I have to wait for a new kops release, or how is nodeup released?
We're working on getting a 1.13/1.14 cut with these fixes asap. You'll either need to build and deploy your own version of kops (including protokube and nodeup), use a workaround as suggested above (you can probably utilize a hook to automate it: https://github.com/kubernetes/kops/blob/master/docs/cluster_spec.md#hooks), or wait for a release, which we're actively working on getting out asap!
Hi, I had no luck using a hook to curl the correct file, as hooks seem to run AFTER nodeup. All I can think of is to build a custom AMI instead of vanilla Amazon Linux 2.
Indeed, hooks won't work. We figured that out at the exact same time as @alexinthesky 😂 Then we switched to the Debian AMI to avoid further damage from dying spot instances.
+1, seeing the same.
It's a bit involved, but we found a workaround until a new release is cut (especially for people hitting this issue in production).
This way the cache is there before nodeup is run.
Below is an improved workaround, inspired by previous comments and pull requests. kops supports arbitrary user data; a snippet along the lines of the sketch below needs to be added to each instance group spec.
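The snippet itself was lost from this thread, so what follows is only a hypothetical reconstruction of its shape: a cloud-config bootcmd that pre-seeds nodeup's cache before nodeup runs (bootcmd runs early in boot, unlike hooks). The vault URL, cache path, and filename are assumptions — verify them against a working node:
```
# Hypothetical reconstruction -- not the original snippet.
# Added via: kops edit ig <instance-group-name>
spec:
  additionalUserData:
    - name: prefetch-container-selinux.txt
      type: text/cloud-config
      content: |
        #cloud-config
        bootcmd:
          # Assumption: nodeup checks this cache before downloading.
          - mkdir -p /var/cache/nodeup
          - curl -fLo /var/cache/nodeup/container-selinux-2.68-1.el7.noarch.rpm http://vault.centos.org/7.6.1810/extras/x86_64/Packages/container-selinux-2.68-1.el7.noarch.rpm
```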
Hi, I connected to the node via ssh and downloaded the package from another URL.
Was able to work around the issue by running the below commands on both masters and nodes.
This workaround no longer works as of today. As a workaround you can use another source that still serves the RPM. But really, container-selinux needs to be updated to a newer version.
OK, so it looks like we'll be doing 1.13.2 this morning. I'd also really prefer to get away from the OS packaging (towards "tar.gz" installation), as it seems to be introducing more problems than it solves. For 2.68.1 -> 2.107.3: we try not to make potentially breaking changes once we have released the 1.x.0 of kops, but we do so for security fixes etc. So we can look at getting it into 1.14.0 (which hasn't quite released yet). But is it a security fix (in which case we would get it into 1.13.0)?
Here's the changelog; it looks like there's no strict security-fix vs. feature distinction, so we probably shouldn't introduce the new version in kops 1.13:
Can the packages be externalised into a yaml/json file that nodeup reads, instead of being compiled into the binary? That would enable people to source the rpm and store it locally (S3, cloud storage, etc.). I've opted to save the rpm in S3 and then add it into kops with this in the instance groups:
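(The actual snippet didn't survive in this thread; below is a hedged sketch of what such an instance-group addition might look like, assuming an additionalUserData shell script that copies the RPM from a private S3 bucket into nodeup's cache. The bucket name, key, and cache path are placeholders:)
```
# Hypothetical sketch -- bucket name, key, and cache path are placeholders.
spec:
  additionalUserData:
    - name: fetch-container-selinux.sh
      type: text/x-shellscript
      content: |
        #!/bin/sh
        # Assumption: nodeup skips the download when the file is already cached.
        mkdir -p /var/cache/nodeup
        aws s3 cp s3://my-kops-assets/container-selinux-2.68-1.el7.noarch.rpm \
          /var/cache/nodeup/container-selinux-2.68-1.el7.noarch.rpm
```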
Then you just need to sort out the bucket policy and IAM privileges for kops to read from the bucket. This is in an AWS environment, obviously; I'm sure there are similar approaches for the other cloud platforms.
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale
Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close
@fejta-bot: Closing this issue.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Hi everyone, our team encountered this issue yesterday on a kops 1.14.8 cluster, related to this vault.centos.org issue. We had previously used this fix successfully on an older cluster, but we had a problem with the bootcmd approach detailed in that comment. We ended up using the following approach in an additionalUserData stanza:
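(The stanza itself was lost from this thread; this is only a plausible shape, assuming an x-shellscript part rather than bootcmd. The mirror URL is a placeholder — the commenter's real fix pointed at an internal repo, as the next paragraph notes:)
```
# Hypothetical shape of the stanza -- the mirror URL is a placeholder.
spec:
  additionalUserData:
    - name: prefetch-rpm.sh
      type: text/x-shellscript
      content: |
        #!/bin/sh
        mkdir -p /var/cache/nodeup
        curl -fLo /var/cache/nodeup/container-selinux-2.68-1.el7.noarch.rpm \
          "http://<mirror-that-still-serves-the-rpm>/container-selinux-2.68-1.el7.noarch.rpm"
```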
For full disclosure, our actual fix pointed to our company's internal yum repo, so if you have the ability to do that, it's probably a better solution than relying on a public mirror. Hope this helps save everyone else some pain!
1. What kops version are you running? The command kops version will display this information.
Version 1.13.0

2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.
Version 1.13.0

3. What cloud provider are you using?
AWS

4. What commands did you run? What is the simplest way to reproduce this issue?
Adding a node to a cluster results in nodeup looking for
Downloading "http://mirror.centos.org/centos/7/extras/x86_64/Packages/container-selinux-2.68-1.el7.noarch.rpm"
which does not exist anymore due to the CentOS 7.7 release.

5. What happened after the commands executed?
kops tries to bootstrap the node, but nodeup fails because it points to a nonexistent package.

6. What did you expect to happen?
New node bootstrapped and joined to the cluster.

8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.