Hard-coded server type in Hetzner provider will break on 2024-09-06 #7210

apricote · 2024-08-27T07:00:30Z

Which component are you using?:

cluster-autoscaler Hetzner provider

/area provider/hetzner
/area cluster-autoscaler

What version of the component are you using?:

Component version: All current versions

What k8s version are you using (kubectl version)?:

Does not matter

What environment is this in?:

Hetzner Cloud

What did you expect to happen?:

The Hetzner Cloud provider should continue to work after 2024-09-06.

What happened instead?:

The Hetzner Cloud provider will stop working on 2024-09-06.

How to reproduce it (as minimally and precisely as possible):

1. Replace the hardcoded server type (code) with a server type that does not exist (e.g. xyz123)
2. Start the provider
3. Observe error messages:

mixed_nodeinfos_processor.go:160] Unable to build proper template node for draining-node-pool: failed to create resource list for node group draining-node-pool error: failed to get machine type xyz123 info error: server type not found
static_autoscaler.go:387] Failed to get node infos for groups: failed to create resource list for node group draining-node-pool error: failed to get machine type xyz123 info error: server type not found

Anything else we need to know?:

The server type cx11 was deprecated on 2024-06-06. It will be removed from the API on 2024-09-06: https://docs.hetzner.cloud/changelog#2024-06-06-old-server-types-with-shared-intel-vcpus-are-deprecated

The server type is hardcoded for a draining-node-pool, which is not actually used anywhere in the provider. It is only added to the list of known node pools.

Two options:

1. Replace cx11 with its replacement type cx22

   This is minimally invasive, but has the same problem: we would still be hardcoding a value that might change or be deprecated.

2. Remove draining-node-pool completely from the code

   This feels like the clean choice, as this node pool is not used internally. However, this is a user-visible change (the node pool will disappear from the status config map), so I am not sure if we can backport this to previous releases.

apricote · 2024-08-27T07:08:02Z

/assign

…type

The `cx11` server type was deprecated on 2024-06-06 and will be removed from the API on 2024-09-06. Once it is removed, the cluster-autoscaler provider hetzner will not start anymore with the following error message:

Failed to get node infos for groups: failed to create resource list for node group draining-node-pool error: failed to get machine type cx11 info error: server type not found

As the node pool `draining-node-pool` is not being used anywhere, this commit removes it and the hard coded reference to the deprecated server type.

Fixes kubernetes#7210

Shubham82 · 2024-08-27T13:21:22Z

/triage accepted

apricote · 2024-08-27T13:25:17Z

Update from internal conversations:

We plan to implement a workaround for known current users of cluster-autoscaler. This will still cause issues for new users unless a new version of cluster-autoscaler is released.

The workaround will only be available for ~2 weeks after new releases are cut.

We will inform impacted customers about this so they can update before it starts breaking.

apricote · 2024-08-30T10:47:03Z

Information for customers

The Hetzner provider in current versions of cluster-autoscaler has a bug and relies on the CX11 server type, which we will remove from our ordering options on 6 September 2024.

If you try to use the cluster-autoscaler provider after that date, you will see the following error messages:

mixed_nodeinfos_processor.go:160] Unable to build proper template node for draining-node-pool: failed to create resource list for node group draining-node-pool error: failed to get machine type cx11 info error: server type not found
static_autoscaler.go:387] Failed to get node infos for groups: failed to create resource list for node group draining-node-pool error: failed to get machine type cx11 info error: server type not found

The following versions of cluster-autoscaler are affected:

≤1.28.6 (including 1.27 and older)
≤1.29.4
≤1.30.2
≤1.31.0

We depend on the Kubernetes community and the maintainers of cluster-autoscaler to release new versions. We expect new official versions to be released at the end of September.

To bridge the gap until the Kubernetes community releases the new versions, we published alternative container images of cluster-autoscaler that include a patch for the bug. You can use these in your deployment, but we will remove them one month after new official cluster-autoscaler versions become available. We will not provide any other patch releases on this container image repository. Please switch back to the official images as soon as possible.

docker.io/hetznercloud/cluster-autoscaler:v1.28.6-hcloud1 (Build Commit)
docker.io/hetznercloud/cluster-autoscaler:v1.29.4-hcloud1 (Build Commit)
docker.io/hetznercloud/cluster-autoscaler:v1.30.2-hcloud1 (Build Commit)
docker.io/hetznercloud/cluster-autoscaler:v1.31.0-hcloud1 (Build Commit)

Existing Users

To prevent disruptions for existing users of the provider, we will keep the CX11 server type available for these accounts. We will remove that prolonged access to the CX11 server type two weeks after the Kubernetes community releases new versions of cluster-autoscaler.

Links

apricote · 2024-09-23T07:47:49Z

Backports to current release branches:

apricote · 2024-10-21T09:22:52Z

Information for customers

The fix has now been released in new versions of cluster-autoscaler. Please upgrade to these versions before 4 November 2024.

The following versions include the patches:

≥ 1.28.7
≥ 1.29.5
≥ 1.30.3 (Edit: Previous version mentioned 1.30.2, but this was wrong and does not contain the fix.)
≥ 1.31.1

We will remove all access to the CX11 server type after 4 November 2024. If you are still using unpatched versions of cluster-autoscaler, they will stop working.
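
As a rough aid for checking a deployment against the list above, here is a minimal Go sketch. `isAffected` and `firstFixedPatch` are hypothetical names, not part of cluster-autoscaler. The table is taken from the versions listed in this comment; treating every 1.27-or-older release as affected follows the earlier notice, and treating minors newer than 1.31 as fixed is an assumption of the sketch. It only accepts plain `MAJOR.MINOR.PATCH` strings (with an optional `v` prefix), so suffixed tags such as `-hcloud1` are rejected.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// firstFixedPatch maps a 1.x minor version to the first patch release that
// includes the fix, as listed in this comment.
var firstFixedPatch = map[int]int{28: 7, 29: 5, 30: 3, 31: 1}

// isAffected reports whether a cluster-autoscaler version still contains the
// hard-coded cx11 reference, according to the table above.
func isAffected(version string) (bool, error) {
	parts := strings.Split(strings.TrimPrefix(version, "v"), ".")
	if len(parts) != 3 {
		return false, fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", version)
	}
	nums := make([]int, 3)
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return false, fmt.Errorf("invalid version %q: %w", version, err)
		}
		nums[i] = n
	}
	if nums[0] != 1 {
		return false, fmt.Errorf("unsupported major version in %q", version)
	}
	minor, patch := nums[1], nums[2]
	if minor < 28 {
		// 1.27 and older: listed as affected, no fixed release listed here.
		return true, nil
	}
	fixed, ok := firstFixedPatch[minor]
	if !ok {
		// Assumption: minors newer than 1.31 were released with the fix.
		return false, nil
	}
	return patch < fixed, nil
}

func main() {
	for _, v := range []string{"1.27.8", "1.30.2", "1.30.3", "v1.31.0", "v1.31.1"} {
		affected, err := isAffected(v)
		if err != nil {
			fmt.Println(v, "error:", err)
			continue
		}
		fmt.Printf("%s affected: %t\n", v, affected)
	}
}
```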

If you were using the alternative container images we provided (docker.io/hetznercloud/cluster-autoscaler), please switch back to the images built by the Kubernetes community (registry.k8s.io/autoscaling/cluster-autoscaler). We will remove the alternative container images on 19 November 2024.

We will be happy to help you with any questions. Please write us a support request by logging onto your account: https://console.hetzner.cloud/support

Shubham82 · 2024-10-21T09:32:19Z

Thanks @apricote for this information.

Can we mention this information in the Hetzner README.md as a note, so that users are informed about it?

WDYT?

apricote · 2024-10-21T10:11:07Z

We also sent the above as an email to all active users of cluster-autoscaler on Hetzner Cloud. I think that should be enough. Anyone newly installing cluster-autoscaler should hopefully use the latest patch releases.

Shubham82 · 2024-10-21T10:18:11Z

> We also sent the above as an email to all active users of cluster-autoscaler on Hetzner Cloud. I think that should be enough. Anyone newly installing cluster-autoscaler should hopefully use the latest patch releases.

OK @apricote, I didn't know about the email. It seems good to me then.

Thanks!

btribit · 2024-11-06T21:38:44Z

Is there any chance we can get this tagged and released? As of 1.31.0 this change is not in place yet.

Shubham82 · 2024-11-07T08:37:00Z

It should be in the CA 1.31 patch release (1.31.1), which is not yet released.
It will be released soon.

For information: #7315

apricote · 2024-11-11T15:51:43Z

The container image was already built and published/promoted; only the tag on the GitHub repository is still missing. You can use the tag as is.

Shubham82 · 2024-11-19T06:42:58Z

Hi @btribit, FYI: The patch release for CA 1.31 (CA 1.31.1) has been released, PTAL!

btribit · 2024-11-19T15:06:18Z

@Shubham82, works like a champ! Found it in the registry too. Thank you!

apricote added the kind/bug Categorizes issue or PR as related to a bug. label Aug 27, 2024

k8s-ci-robot added area/provider/hetzner Issues or PRs related to Hetzner provider area/cluster-autoscaler labels Aug 27, 2024

k8s-ci-robot assigned apricote Aug 27, 2024

apricote mentioned this issue Aug 27, 2024

fix(hetzner): deprecated server type will break on 2024-09-06 #7211

Merged

k8s-ci-robot added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Aug 27, 2024

emrys90 mentioned this issue Aug 30, 2024

Warning from Hetzner about cluster-autoscaler vitobotta/hetzner-k3s#429

Closed

WebSpider mentioned this issue Aug 30, 2024

[Bug]: Cluster autoscaler will break after 6/9 kube-hetzner/terraform-hcloud-kube-hetzner#1465

Closed

k8s-ci-robot closed this as completed in #7211 Sep 23, 2024

jampy mentioned this issue Oct 21, 2024

Autoscaler needs to be upgraded to newer version until November 6th vitobotta/hetzner-k3s#470

Closed

M4t7e mentioned this issue Nov 6, 2024

Support for Auto scaling hcloud-k8s/terraform-hcloud-kubernetes#4

Closed
