OCPCLOUD-2060 Merge https://github.com/kubernetes/autoscaler:master (d3ec0c4) into master #256

cloud-team-rebase-bot · 2023-05-09T21:26:27Z

No description provided.


JoelSpeed · 2023-06-05T11:28:16Z

/hold

@elmiko Has HyperShift been notified that their tests are failing on this? Is a discussion open there to make sure we don't break them?

openshift-ci · 2023-06-05T11:29:59Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: elmiko, JoelSpeed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [JoelSpeed,elmiko]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

elmiko · 2023-06-08T13:03:52Z

@JoelSpeed ack, let them know

enxebre · 2023-06-08T13:47:28Z

/test e2e-hypershift

enxebre · 2023-06-08T13:54:55Z

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_kubernetes-autoscaler/256/pull-ci-openshift-kubernetes-autoscaler-master-e2e-hypershift/1664597395830214656/artifacts/e2e-hypershift/run-e2e/artifacts/TestAutoscaling_PreTeardownClusterDump/namespaces/e2e-clusters-f6m5l-example-jz5rp/core/pods/logs/cluster-autoscaler-5bd4b658b5-nfj67-cluster-autoscaler.log

W0602 12:48:11.795940       1 reflector.go:533] k8s.io/client-go/dynamic/dynamicinformer/informer.go:108: failed to list cluster.x-k8s.io/v1beta1, Resource=machinepools: machinepools.cluster.x-k8s.io is forbidden: User "system:serviceaccount:e2e-clusters-f6m5l-example-jz5rp:cluster-autoscaler" cannot list resource "machinepools" in API group "cluster.x-k8s.io" in the namespace "e2e-clusters-f6m5l-example-jz5rp"

I'll update hypershift rbac.

This is needed to let the autoscaler operate openshift/kubernetes-autoscaler#256 (comment) kubernetes/autoscaler#4676
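The forbidden error in the log comes down to an RBAC rule lookup: the autoscaler's service account has no rule granting `list` on `machinepools` in the `cluster.x-k8s.io` API group. A minimal Python sketch of that check follows. The rule dictionaries mirror the `apiGroups`/`resources`/`verbs` shape of a Kubernetes Role, but the helper `is_allowed`, the pre-existing rule, and the verbs in the added rule are assumptions for illustration, not the actual HyperShift change.

```python
from typing import Dict, List

Rule = Dict[str, List[str]]


def is_allowed(rules: List[Rule], api_group: str, resource: str, verb: str) -> bool:
    """Return True if any rule grants `verb` on `resource` in `api_group`.

    Simplified on purpose: "*" is treated as a wildcard, while
    resourceNames and subresources are ignored.
    """
    def matches(values: List[str], wanted: str) -> bool:
        return "*" in values or wanted in values

    return any(
        matches(rule.get("apiGroups", []), api_group)
        and matches(rule.get("resources", []), resource)
        and matches(rule.get("verbs", []), verb)
        for rule in rules
    )


# Hypothetical role before the fix: no rule mentions machinepools.
rules_before: List[Rule] = [
    {
        "apiGroups": ["cluster.x-k8s.io"],
        "resources": ["machinedeployments", "machinesets", "machines"],
        "verbs": ["get", "list", "watch", "update"],
    },
]

# Hypothetical added rule; the exact verbs are an assumption.
rules_after: List[Rule] = rules_before + [
    {
        "apiGroups": ["cluster.x-k8s.io"],
        "resources": ["machinepools"],
        "verbs": ["get", "list", "watch"],
    },
]
```

With these sample rules, `is_allowed(rules_before, "cluster.x-k8s.io", "machinepools", "list")` is false, which corresponds to the reflector warning, and the same call against `rules_after` is true.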

enxebre · 2023-06-15T06:58:35Z

/test e2e-hypershift

elmiko · 2023-06-15T12:01:57Z

it seems like some of our carry commits got dropped, and i'm not sure why. looking into re-adding them
/hold

thanks to @aleskandro for catching it =)

elmiko · 2023-06-15T12:59:44Z

i think i've fixed the missing commit, see 834cebd

i'll wait for tests to start passing before removing the hold

muraee · 2023-06-15T13:53:42Z

/test e2e-hypershift

elmiko · 2023-06-15T15:27:35Z

gonna keep the hold here while we work out a question with the scale from zero annotations

the upstream annotations for the scale from zero capacity resources are slightly different from the openshift implementation. the largest difference is the addition of a gpu type annotation. openshift does not yet utilize this annotation, and thus this patch should be carried until the machineset controllers for the various providers on openshift have been modified to use the new annotations.

another important change is the modification of the memory annotation. previously in openshift we expected this value to be a count of memory in mebibytes. the conversion function and tests have been modified to allow continued openshift operation. this change can be dropped when the annotations in openshift have been updated; the progress for this effort can be followed at https://issues.redhat.com/browse/OCPCLOUD-944
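To make the memory point concrete, here is a minimal Python sketch of turning an annotation value that holds a count of mebibytes into bytes. The function name `memory_mib_to_bytes` is hypothetical and this is not the carried conversion function itself (the autoscaler is written in Go); it only illustrates the unit handling described above.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def memory_mib_to_bytes(value: str) -> int:
    """Convert an annotation value holding a count of MiB into bytes.

    Raises ValueError for non-integer or negative input.
    """
    count = int(value.strip())
    if count < 0:
        raise ValueError(f"memory annotation must be non-negative, got {value!r}")
    return count * MIB
```

For example, an annotation value of `"16384"` (16 GiB expressed in MiB) converts to `17179869184` bytes.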

elmiko · 2023-06-27T12:26:39Z

/retest

openshift-ci · 2023-06-27T15:00:40Z

@cloud-team-rebase-bot[bot]: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/git-history | `c74af56` | link | false | `/test git-history` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

dtobolik · 2023-06-29T14:46:21Z

/label qe-approved

elmiko · 2023-06-29T18:23:35Z

/unhold
/lgtm

k8s-ci-robot and others added 30 commits February 14, 2023 05:31

Merge pull request kubernetes#5502 from yaroslava-serdiuk/min-size-fix

b9bbed2

Check min size of node group and resource limits for set of nodes

Add GetNodeGpuConfig to cloud provider

1f646e4

* Added GetNodeGpuConfig to cloud provider which returns a GpuConfig struct containing the gpu label, type and resource name if the node has a GPU.
* Added initial implementation of the GetNodeGpuConfig to all cloud providers.
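As a rough picture of what this commit describes, the Python sketch below models a `GpuConfig` with the three fields named in the commit message and a lookup that returns nothing for non-GPU nodes. Everything else is an assumption: the real code is Go and lives in each cloud provider, and detecting a GPU purely from a node label, as well as the parameter names used here, are simplifications for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class GpuConfig:
    label: str          # node label key that marks the node as a GPU node
    type: str           # value of that label, e.g. the accelerator model
    resource_name: str  # resource name under which the GPU is exposed


def get_node_gpu_config(
    node_labels: Dict[str, str],
    gpu_label: str,
    resource_name: str,
) -> Optional[GpuConfig]:
    """Return a GpuConfig if the node carries `gpu_label`, else None.

    `gpu_label` and `resource_name` stand in for provider-specific
    constants; both are hypothetical parameters in this sketch.
    """
    gpu_type = node_labels.get(gpu_label)
    if gpu_type is None:
        return None
    return GpuConfig(label=gpu_label, type=gpu_type, resource_name=resource_name)
```

A caller can then branch on whether the result is `None` instead of checking for a provider-specific label itself.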

regenerate the ec2 instance types using latest metadata to fetch m7g/…

d3d52af

…r7g instances

Use GpuConfig in utilization calculations for scale-down

2b602fc

* Changed the `utilization.Calculate()` function to use GpuConfig instead of GPU label.
* Started using GpuConfig in utilization threshold calculations.
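One way to read this change is sketched below in Python, under the assumption that a GPU node's utilization is judged only on its GPU resource while other nodes use the larger of CPU and memory utilization. The function name, argument shapes, and that exact rule are illustrative assumptions, not a copy of `utilization.Calculate()`.

```python
from typing import Dict, Optional, Tuple


def calculate_utilization(
    requested: Dict[str, float],
    allocatable: Dict[str, float],
    gpu_resource_name: Optional[str] = None,
) -> Tuple[str, float]:
    """Return (resource_name, utilization) to compare against a threshold.

    `gpu_resource_name` plays the role of the resource name carried by a
    GPU config; pass None for nodes without a GPU.
    """
    def ratio(name: str) -> float:
        total = allocatable.get(name, 0.0)
        if total <= 0:
            return 0.0
        return requested.get(name, 0.0) / total

    if gpu_resource_name is not None:
        # GPU node (assumption): only the GPU resource decides utilization.
        return gpu_resource_name, ratio(gpu_resource_name)

    # Non-GPU node: the more heavily used of cpu and memory decides.
    cpu, mem = ratio("cpu"), ratio("memory")
    return ("cpu", cpu) if cpu >= mem else ("memory", mem)
```

Passing the resource name rather than a label lets the same function serve GPUs that are exposed under different resource names.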

Merge pull request kubernetes#5459 from hbostan/master

7cba0a0

Add GpuConfig to cloud provider. Use GpuConfig in utilization calculations.

Merge pull request kubernetes#5508 from a7i/amir/aws-7g

19487b0

regenerate the ec2 instance types using latest metadata to fetch m7g/r7g instances

Add support for VMSS Flex

84f748f

Added RBAC Permission to cherryservers.

dc23b9a

fix(*): refresh node instance cache when nodegroup not found in delet…

0d3a642

…eCreatedNodesWithErrors

Decrease node group size only if the node pass resource check for sca…

6b9d55b

…le down candidate

Merge pull request kubernetes#5514 from yaroslava-serdiuk/min-size-fix

2f1c895

Fix RemovableAt()

Added Uniform orchestrationMode in test cases

5119b42

remove dead code in clusterapi provider tests

5bbfcd3

This change removes an `if` statement that was left behind after a refactor. The test in question has the same logic embedded into a previous conditional and the removed statement has no effect on the tests.

Merge pull request kubernetes#5519 from elmiko/capi-remove-deadcode

3141165

remove dead code in clusterapi provider tests

update FQA to add version in the pause container image due the latest…

665af54

… that is not valid Signed-off-by: cpanato <[email protected]>

bump CA chart image to 1.24

655e6f4

Fix a minor typo

ace98cc

Signed-off-by: Guangwen Feng <[email protected]>

Merge pull request kubernetes#5517 from ism-k/bumpup/cluster-autoscal…

274026b

…er-chart Bump CA chart to 1.24

Merge pull request kubernetes#5482 from jbartosik/update-dep

fb9e55b

Update VPA dependency github.com/emicklei/go-restful/v3

Add "resource_name" to scaled_up_gpu_nodes_total and scaled_down_gpu_…

2ea2fb6

…nodes_total metrics
* Added the new resource_name field to scaled_up/down_gpu_nodes_total, representing the resource name for the gpu.
* Changed metrics registrations to use GpuConfig
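The effect of adding a resource name label to a counter can be sketched without any metrics library. The Python class below is a hypothetical stand-in for a labelled counter (it is not the Prometheus client and not the autoscaler's metrics package); it only shows that counts are now kept per combination of resource name and GPU type.

```python
from collections import Counter
from typing import Tuple


class GpuNodeCounter:
    """Toy labelled counter keyed by (resource_name, gpu_type)."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def inc(self, resource_name: str, gpu_type: str, nodes: int = 1) -> None:
        """Add `nodes` to the series identified by the two label values."""
        key: Tuple[str, str] = (resource_name, gpu_type)
        self._counts[key] += nodes

    def value(self, resource_name: str, gpu_type: str) -> int:
        """Return the current count for one label combination (0 if unseen)."""
        return self._counts[(resource_name, gpu_type)]
```

Two node groups with the same GPU type but different resource names would therefore be counted in separate series.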

Merge pull request kubernetes#5522 from cpanato/update-doc

03861a8

update FAQ.md to add version in the pause container image due the latest that is not valid

Merge pull request kubernetes#5523 from fenggw-fnst/typo

3922f49

Fix a minor typo

Merge pull request kubernetes#5518 from kawych/metrics

c611acd

Add "resource_name" to scaled_up_gpu_nodes_total and scaled_down_gpu_nodes_total metrics

Added support for in2 instance types

63161aa

Added support for the AWS Inferentia 2 instance types based on the NeuronCore v2 chip architecture

added inf2 instance types to ec2 api.go and api-2.json files

ee86ce4

Merge pull request kubernetes#5382 from cnmcavoy/cmcavoy/scale-from-z…

7128367

…ero-with-labels-taints Use annotations to set labels and taints for clusterapi nodegroups
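When a node group is scaled from zero there is no live node to inspect, so labels and taints have to come from annotations. The Python sketch below parses such values under an assumed format: comma-separated `key=value` pairs for labels and `key=value:Effect` entries for taints. That format, the function names, and the `Taint` tuple are assumptions for illustration; the annotation keys themselves are not shown on this page and are left out.

```python
from typing import Dict, List, NamedTuple


class Taint(NamedTuple):
    key: str
    value: str
    effect: str


def _items(annotation: str) -> List[str]:
    """Split on commas, dropping surrounding whitespace and empty entries."""
    return [part.strip() for part in annotation.split(",") if part.strip()]


def parse_labels(annotation: str) -> Dict[str, str]:
    """Parse "k1=v1,k2=v2" into a dict; raises ValueError on malformed items."""
    labels: Dict[str, str] = {}
    for item in _items(annotation):
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed label {item!r}")
        labels[key] = value
    return labels


def parse_taints(annotation: str) -> List[Taint]:
    """Parse "k=v:Effect,k2:Effect" into Taint tuples; the value may be empty."""
    taints: List[Taint] = []
    for item in _items(annotation):
        key_value, sep, effect = item.rpartition(":")
        if not sep or not effect:
            raise ValueError(f"taint {item!r} is missing an effect")
        key, _, value = key_value.partition("=")
        if not key:
            raise ValueError(f"taint {item!r} is missing a key")
        taints.append(Taint(key, value, effect))
    return taints
```

The parsed labels and taints could then be applied to a template node that stands in for the first node of the group.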

Merge pull request kubernetes#5477 from BigDarkClown/taint

b516e80

Merge taint utils into one package, make taint modifying methods public

Move PDBRemainingDisruptions to interface and rename it

bdf2dbe

Track PDBRemainingDisruptions in AutoscalingContext

43b459b

Merge pull request kubernetes#5497 from BigDarkClown/pdb

e8ba4bf

Track PDBRemainingDisruptions in AutoscalingContext

openshift-ci bot added the do-not-merge/hold label Jun 5, 2023

enxebre added a commit to enxebre/hypershift that referenced this pull request Jun 8, 2023

Update autocaler RBAC to accomodate machinepools support added upstream

23c072b

This is needed to let the autoscaler operate openshift/kubernetes-autoscaler#256 (comment) kubernetes/autoscaler#4676

enxebre mentioned this pull request Jun 8, 2023

Update autocaler RBAC to accomodate machinepools support added upstream openshift/hypershift#2663

Merged

4 tasks

enxebre added a commit to enxebre/hypershift that referenced this pull request Jun 14, 2023

Update autocaler RBAC to accomodate machinepools support added upstream

b864dee

This is needed to let the autoscaler operate openshift/kubernetes-autoscaler#256 (comment) kubernetes/autoscaler#4676

orenc1 pushed a commit to orenc1/hypershift that referenced this pull request Jun 14, 2023

Update autocaler RBAC to accomodate machinepools support added upstream

e9df021

This is needed to let the autoscaler operate openshift/kubernetes-autoscaler#256 (comment) kubernetes/autoscaler#4676

openshift-ci bot removed the lgtm label Jun 15, 2023

elmiko force-pushed the rebase-bot-master branch from 834cebd to 175bd5e Compare June 15, 2023 13:04

elmiko force-pushed the rebase-bot-master branch from 175bd5e to c74af56 Compare June 26, 2023 21:18

This was referenced Jun 28, 2023

Fixes for the labels capacity annotation openshift/machine-api-provider-aws#76

Merged

Set upstream labels and fix capability for the arch-aware scale from 0 in Azure openshift/machine-api-provider-azure#66

Merged

openshift-ci bot added the qe-approved label Jun 29, 2023

aleskandro mentioned this pull request Jun 29, 2023

MIXEDARCH-280: Arch-aware autoscale from/to zero openshift/cluster-api-actuator-pkg#284

Merged

openshift-ci bot added lgtm and removed do-not-merge/hold labels Jun 29, 2023

openshift-merge-robot merged commit b597b81 into openshift:master Jun 29, 2023
