Fix imports in cluster autoscaler after migrating it from contrib #1

Merged
merged 1 commit into from
Apr 18, 2017

Conversation

mwielgus
Contributor

@mwielgus mwielgus requested a review from MaciekPytel April 18, 2017 13:48
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 18, 2017
@MaciekPytel
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 18, 2017
@mwielgus
Contributor Author

All tests passed:

?   	k8s.io/autoscaler/cluster-autoscaler	[no test files]
?   	k8s.io/autoscaler/cluster-autoscaler/cloudprovider	[no test files]
ok  	k8s.io/autoscaler/cluster-autoscaler/cloudprovider/aws	0.967s
ok  	k8s.io/autoscaler/cluster-autoscaler/cloudprovider/azure	0.164s
?   	k8s.io/autoscaler/cluster-autoscaler/cloudprovider/builder	[no test files]
ok  	k8s.io/autoscaler/cluster-autoscaler/cloudprovider/gce	0.828s
?   	k8s.io/autoscaler/cluster-autoscaler/cloudprovider/test	[no test files]
ok  	k8s.io/autoscaler/cluster-autoscaler/clusterstate	0.735s
ok  	k8s.io/autoscaler/cluster-autoscaler/clusterstate/api	0.091s
ok  	k8s.io/autoscaler/cluster-autoscaler/clusterstate/utils	1.147s
?   	k8s.io/autoscaler/cluster-autoscaler/config	[no test files]
?   	k8s.io/autoscaler/cluster-autoscaler/config/dynamic	[no test files]
ok  	k8s.io/autoscaler/cluster-autoscaler/core	0.248s
ok  	k8s.io/autoscaler/cluster-autoscaler/estimator	0.387s
?   	k8s.io/autoscaler/cluster-autoscaler/expander	[no test files]
?   	k8s.io/autoscaler/cluster-autoscaler/expander/factory	[no test files]
ok  	k8s.io/autoscaler/cluster-autoscaler/expander/mostpods	0.223s
ok  	k8s.io/autoscaler/cluster-autoscaler/expander/random	0.241s
ok  	k8s.io/autoscaler/cluster-autoscaler/expander/waste	1.145s
?   	k8s.io/autoscaler/cluster-autoscaler/metrics	[no test files]
ok  	k8s.io/autoscaler/cluster-autoscaler/simulator	0.281s
ok  	k8s.io/autoscaler/cluster-autoscaler/utils/deletetaint	1.031s
ok  	k8s.io/autoscaler/cluster-autoscaler/utils/drain	0.345s
?   	k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes	[no test files]
?   	k8s.io/autoscaler/cluster-autoscaler/utils/test	[no test files]

Merging manually.

@mwielgus mwielgus merged commit 55db148 into kubernetes:master Apr 18, 2017
smarterclayton pushed a commit to smarterclayton/autoscaler that referenced this pull request Jun 7, 2018
…file

Add rpm spec file and dockerfile for cluster autoscaler.
k8s-ci-robot pushed a commit that referenced this pull request Nov 14, 2019
pierre-emmanuelJ referenced this pull request in exoscale/autoscaler-1 Aug 28, 2020
Signed-off-by: Pierre-Emmanuel Jacquier <[email protected]>
lrouquette referenced this pull request in adobe-platform/autoscaler Oct 20, 2021
…1-1.1.0-adobe-with-perf-fix

Changes for scaling time improvement
k8s-ci-robot pushed a commit that referenced this pull request Apr 25, 2022
k8s-ci-robot pushed a commit that referenced this pull request Apr 25, 2022
k8s-ci-robot pushed a commit that referenced this pull request Dec 16, 2022
* Adding isNodeDeleted method to CloudProvider interface. Supports detecting whether nodes are fully deleted or are not-autoscaled. Updated cloud providers to provide initial implementation of new method that will return an ErrNotImplemented to maintain existing taint-based deletion clusterstate calculation.
navinjoy pushed a commit to navinjoy/autoscaler that referenced this pull request Jan 23, 2023
…ubernetes#1)

* Adding isNodeDeleted method to CloudProvider interface. Supports detecting whether nodes are fully deleted or are not-autoscaled. Updated cloud providers to provide initial implementation of new method that will return an ErrNotImplemented to maintain existing taint-based deletion clusterstate calculation.
k8s-ci-robot pushed a commit that referenced this pull request Apr 24, 2023
k8s-ci-robot pushed a commit that referenced this pull request Oct 3, 2023
Update CA_with_AWS_IAM_OIDC.md
voelzmo added a commit to voelzmo/autoscaler that referenced this pull request Nov 22, 2023
* Drop redundant parameter in utilization calculation

* Extract checks for scale down eligibility

* Limit amount of node utilization logging

* Increase timeout for VPA E2E

After kubernetes#5151, e2e tests are still failing because we're still hitting the ginkgo timeout.

* Add podScaleUpDelay annotation support

* Corrected the links for Priority in k8s API and Pod Preemption in k8s.

* Restrict Updater PodLister to namespace

* Update controller-gen to latest and use go install

* Run hack/generate-crd-yaml.sh

* update owners list for cluster autoscaler azure

* Change VPA default version to 0.12.0

* Pin controller-gen to 0.9.2

* AWS ReadMe update

* Move resource limits checking to a separate package

* Allow simulator to persist changes in cluster snapshot

* Don't depend on IsNodeBeingDeleted implementation

The fact that it considers nodes as deleted only until a certain
timeout is of no concern to the eligibility.Checker.

* Stop treating masters differently in scale down

This filtering was used for two purposes:
- Excluding masters from destination candidates
- Excluding masters from calculating cluster resources

Excluding from destination candidates isn't useful: if pods can schedule
there, they will, so removing them from CA simulation doesn't change
anything.
Excluding from calculating cluster resources actually matches scale up
behavior, where master nodes are treated the same way as regular nodes.

* CA - AWS - Instance List Update 2022-09-16

* fix typo

* Modifying taint removal logic on startup to consider all nodes instead of ready nodes.

* fix typo

* Update VPA compatibility for 0.12 release

* Updated the golang version for GitHub workflow.

* Create GCE CloudProvider Owners file

* Fix error formatting in GCE client

%v results in a list of numbers when a byte array is passed.
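
To illustrate the formatting issue this commit describes: with Go's `fmt` package, `%v` renders a `[]byte` as a bracketed list of decimal byte values, while `%s` renders the same bytes as text. The following is a minimal sketch; `formatWithV` and `formatWithS` are hypothetical helper names and are not taken from the GCE client.

```go
package main

import "fmt"

// formatWithV formats a byte slice with %v, which prints each byte as a number.
// Hypothetical helper for illustration only.
func formatWithV(body []byte) string {
	return fmt.Sprintf("%v", body)
}

// formatWithS formats a byte slice with %s, which prints the bytes as text.
// Hypothetical helper for illustration only.
func formatWithS(body []byte) string {
	return fmt.Sprintf("%s", body)
}

func main() {
	body := []byte("ok")
	fmt.Println(formatWithV(body)) // prints "[111 107]"
	fmt.Println(formatWithS(body)) // prints "ok"
}
```

Converting with `string(body)` before formatting would likely work equally well when the payload is known to be text.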

* Introduce NodeDeleterBatcher to ScaleDown actuator

* handle directx nodes the same as gpu nodes

* magnum: add an option to create insecure TLS connections

We use self-signed certificates in the openstack for test purposes.
It is not always easy to bring a CA certificate. And so we ran into
the problem that there is no option to not check the validity of the
certificate in the autoscaler.

This patch adds a new option for the magnum plugin: tls-insecure

Signed-off-by: Anton Kurbatov <[email protected]>

* Drop unused maps

* Extract criteria for removing unneeded nodes to a separate package

* skip instances on validation error

if an instance is already being deleted/abandoned/not a member just continue

* cleanup unused constants in clusterapi provider

this change removes some unused values and adjusts the names in the unit
tests to better reflect usage.

* Update the example spec of civo cloudprovider

Signed-off-by: Vishal Anarse <[email protected]>

* Fix race condition in scale down test

* Clean up stale OWNERS

* add example for multiple recommenders

* Balancer KEP

* Add VPA E2E for recommendation not exactly matching pod

Containers in a recommendation can differ from the containers in a pod:

- A new container can be added to a pod. At first there will be no
  recommendation for the container
- A container can be removed from pod. For some time recommendation will contain
  recommendation for the old container
- Container can be renamed. Then there will be recommendation for container
  under its old name.

Add tests for what VPA does in those situations.

* Add VPA E2E for recommendation not exactly matching pod with limit range

Containers in a recommendation can differ from the containers in a pod:

- A new container can be added to a pod. At first there will be no
  recommendation for the container
- A container can be removed from pod. For some time recommendation will contain
  recommendation for the old container
- Container can be renamed. Then there will be recommendation for container
  under its old name.

Add tests for what VPA does in those situations, when limit range exists.

* Remove units for default boot disk size

* Fix accessing index out of bounds

The function should match containers to their recommendations directly instead
of hoping their order will match.

See [this comment](kubernetes#3966 (comment))
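
The idea behind this fix, matching by container name rather than by slice position, can be sketched as follows. This is an illustrative example only: `containerRecommendation` and `matchByName` are hypothetical, simplified names and do not reproduce the VPA types or the actual patched function.

```go
package main

import "fmt"

// containerRecommendation is a hypothetical, simplified stand-in for a
// per-container recommendation.
type containerRecommendation struct {
	ContainerName string
	CPUMillis     int64
}

// matchByName returns the recommendation for each pod container that has one,
// keyed by container name. Containers without a recommendation are omitted,
// and recommendations for containers that are not in the pod are ignored.
func matchByName(podContainers []string, recs []containerRecommendation) map[string]containerRecommendation {
	byName := make(map[string]containerRecommendation, len(recs))
	for _, r := range recs {
		byName[r.ContainerName] = r
	}
	matched := make(map[string]containerRecommendation)
	for _, name := range podContainers {
		if r, ok := byName[name]; ok {
			matched[name] = r
		}
	}
	return matched
}

func main() {
	// The pod gained "sidecar" and lost "old", so indexing both slices by
	// position would pair the wrong entries or run past the end.
	pod := []string{"sidecar", "app"}
	recs := []containerRecommendation{
		{ContainerName: "old", CPUMillis: 50},
		{ContainerName: "app", CPUMillis: 200},
	}
	for name, r := range matchByName(pod, recs) {
		fmt.Printf("%s: %dm CPU\n", name, r.CPUMillis)
	}
}
```

A name-keyed lookup also covers the added, removed, and renamed container cases listed in the E2E commits above, since a missing name simply yields no match.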

* [vpa] introduce recommendation post processor

* Fixed gofmt error.

* Don't break scale up with priority expander config

* added replicas count for daemonsets to prevent massive pod eviction

Signed-off-by: Denis Romanenko <[email protected]>

* code review, move flag to boolean for post processor

* Add support for extended resource definition in GCE MIG template

This commit adds the possibility to define extended resources for a node group on GCE,
so that the cluster-autoscaler can account for them when taking scaling decisions.

This is done through the `extended_resources` key inside the AUTOSCALER_ENV_VARS variable set on a MIG template.

Signed-off-by: Mayeul Blanzat <[email protected]>

* Make expander factory logic more pluggable

* Add option to wait for a period of time after node tainting/cordoning

Node state is refreshed and checked again before deleting the node.
This gives kube-scheduler time to acknowledge that the nodes' state has
changed and to stop scheduling pods on them.

* remove the flag for Capping post-processor

* remove unsupported functionality from cluster-api provider

this change removes the code for the `Labels` and `Taints` interface
functions of the clusterapi provider when scaling from zero. The body
of these functions was added erroneously and the Cluster API community
is still deciding on how these values will be exposed to the autoscaler.

also updates the tests and readme to be more clear about the usage of
labels and taints when scaling from zero.

* Remove ScaleDown dependency on clusterStateRegistry

* Adding support for identifying nodes that have been deleted from cloud provider that are still registered within Kubernetes. Avoids misidentifying not autoscaled nodes as deleted. Simplified implementation to use apiv1.Node instead of new struct. Expanded test cases to include not autoscaled nodes and tracking deleted nodes over multiple updates.

Adding check to backfill loop to confirm cloud provider node no longer exists before flagging the node as deleted. Modifying some comments to be more accurate. Replacing erroneous line deletion.

* Implementing new cloud provider method for node deletion detection (kubernetes#1)

* Adding isNodeDeleted method to CloudProvider interface. Supports detecting whether nodes are fully deleted or are not-autoscaled. Updated cloud providers to provide initial implementation of new method that will return an ErrNotImplemented to maintain existing taint-based deletion clusterstate calculation.

* Fixing go formatting issues with clusterstate_test

* Fixing errors due to merge on branches.

* Adjusting initial implementation of NodeExists to be consistent among cloud providers to return true and ErrNotImplemented.
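
The fallback behavior described in these commits, where a provider stub returns `true` together with a not-implemented error so that the existing taint-based calculation stays in effect, could be sketched like this. The interface, the sentinel error, and `isDeleted` are simplified, hypothetical stand-ins and not the actual `CloudProvider` interface or clusterstate code.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotImplemented is a local stand-in for the provider's
// "not implemented" sentinel error.
var errNotImplemented = errors.New("not implemented")

// instanceChecker is a hypothetical, reduced view of a cloud provider.
type instanceChecker interface {
	NodeExists(nodeName string) (bool, error)
}

// stubProvider mirrors the initial implementation described above:
// it reports true and signals that the check is not implemented.
type stubProvider struct{}

func (stubProvider) NodeExists(string) (bool, error) { return true, errNotImplemented }

// fakeProvider is a provider that can actually answer the question.
type fakeProvider struct{ instances map[string]bool }

func (p fakeProvider) NodeExists(name string) (bool, error) { return p.instances[name], nil }

// isDeleted reports whether a node should be counted as deleted. When the
// provider does not implement the check, it falls back to the taint signal.
func isDeleted(p instanceChecker, nodeName string, hasDeletionTaint bool) (bool, error) {
	exists, err := p.NodeExists(nodeName)
	if errors.Is(err, errNotImplemented) {
		return hasDeletionTaint, nil
	}
	if err != nil {
		return false, err
	}
	return !exists, nil
}

func main() {
	viaTaint, _ := isDeleted(stubProvider{}, "node-1", true)
	viaProvider, _ := isDeleted(fakeProvider{instances: map[string]bool{"node-1": true}}, "node-2", false)
	fmt.Println(viaTaint, viaProvider) // prints "true true"
}
```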

* Fix list scaling group instance pages bug

Signed-off-by: jwcesign <[email protected]>

* Format log output

Signed-off-by: jwcesign <[email protected]>

* Split out code from simulator package

* Code Review: Do not return an error on malformed extended_resource + add more tests

* Malformed extended resource definition should not fail the template building function. Instead, log the error and ignore extended resources
* Remove useless existence check
* Add tests around the extractExtendedResourcesFromKubeEnv function
* Add a test case to verify that malformed extended resource definition does not fail the template build function

Signed-off-by: Mayeul Blanzat <[email protected]>
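
The review change above (log a malformed definition and carry on without extended resources, rather than failing the template build) can be sketched as follows. The `name=count,name=count` input format, the integer quantities, and the names `parseExtendedResources`, `nodeTemplate`, and `buildTemplate` are all assumptions made for illustration; they are not the GCE provider's actual format or functions.

```go
package main

import (
	"fmt"
	"log"
	"strconv"
	"strings"
)

// parseExtendedResources parses a spec such as "example.com/gpu=2,example.com/fpga=1".
// The format is an assumption for this sketch.
func parseExtendedResources(spec string) (map[string]int64, error) {
	resources := make(map[string]int64)
	if strings.TrimSpace(spec) == "" {
		return resources, nil
	}
	for _, item := range strings.Split(spec, ",") {
		parts := strings.SplitN(item, "=", 2)
		name := strings.TrimSpace(parts[0])
		if len(parts) != 2 || name == "" {
			return nil, fmt.Errorf("malformed extended resource %q", item)
		}
		count, err := strconv.ParseInt(strings.TrimSpace(parts[1]), 10, 64)
		if err != nil {
			return nil, fmt.Errorf("malformed quantity in %q: %w", item, err)
		}
		resources[name] = count
	}
	return resources, nil
}

// nodeTemplate is a hypothetical, reduced node template.
type nodeTemplate struct {
	CPU               int64
	ExtendedResources map[string]int64
}

// buildTemplate never fails because of extended resources: a malformed spec
// is logged and the template is built without them.
func buildTemplate(cpu int64, extendedSpec string) nodeTemplate {
	t := nodeTemplate{CPU: cpu, ExtendedResources: map[string]int64{}}
	resources, err := parseExtendedResources(extendedSpec)
	if err != nil {
		log.Printf("ignoring extended resources: %v", err)
		return t
	}
	t.ExtendedResources = resources
	return t
}

func main() {
	fmt.Println(buildTemplate(4, "example.com/gpu=2"))
	fmt.Println(buildTemplate(4, "example.com/gpu=two")) // logged, then built without extras
}
```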

* huawei-cloudprovider:enable tags resolve for as

Signed-off-by: jwcesign <[email protected]>

* Magnum provider: switch UUID dependency from satori to gofrs

Addresses issue kubernetes#5218, that the satori UUID package
is unmaintained and has security vulnerabilities
affecting generating random UUIDs.

In the magnum cloud provider, this package was only
used to check whether a string matches a UUIDv4 or
not, so the vulnerability with generating UUIDs could
not have been exploited. (Generating UUIDs is only
done in the unit tests).

The gofrs/uuid package is currently at version 4.0.0
in go.mod, well past the point at which it was forked
and the vulnerability was fixed. It is a drop in
replacement for verifying a UUID, and only a small
change was needed in the testing code to handle
a new returned error when generating a random UUID.
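
For readers unfamiliar with the check involved, "does this string look like a UUIDv4" can be approximated with the standard library alone. The sketch below deliberately swaps in a `regexp`-based test so that it has no external dependency; it is not the gofrs/uuid API and not the magnum provider's code, and `looksLikeUUIDv4` is a hypothetical name.

```go
package main

import (
	"fmt"
	"regexp"
)

// uuidV4Pattern matches the canonical 8-4-4-4-12 hex form with the version
// nibble set to 4 and the variant nibble set to 8, 9, a, or b.
var uuidV4Pattern = regexp.MustCompile(
	`(?i)^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$`)

// looksLikeUUIDv4 reports whether s has the textual shape of a version 4 UUID.
func looksLikeUUIDv4(s string) bool {
	return uuidV4Pattern.MatchString(s)
}

func main() {
	fmt.Println(looksLikeUUIDv4("123e4567-e89b-42d3-a456-426614174000")) // prints "true"
	fmt.Println(looksLikeUUIDv4("my-node-group"))                        // prints "false"
}
```

Because only validation is involved, nothing in this kind of check depends on the quality of random UUID generation, which is the point the commit message makes about the vulnerability.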

* change uuid dependency in cluster autoscaler kamatera provider

* Extract scheduling hints to a dedicated object

This removes the need for passing maps back and forth when doing
scheduling simulations.

* Remove dead code for handling simulation errors

* Fix typo, move service accounts to RBAC

* VPA: Add missing --- to CRD manifests

* Base parallel scale down implementation

* Stop applying the beta.kubernetes.io/os and arch

* [CA] Register recently evicted pods in NodeDeletionTracker.

* Add KEP to introduce UpdateMode: UpscaleOnly

* Clarify prometheus use-case

* Adapt to review comments

* Adapt KEP according to review

* Add newline after header

* Rename proposal directory to fit KEP title

* Make KEP and implementation proposal consistent

* remove post-processor factory

* update test for MapToListOfRecommendedContainerResources

* Update aws OWNERS

Set all aws cloudprovider approvers as reviewers, so that aws-specific PRs can be handled without involving global CA reviewers.

* Add ScaleDown.Actuator to AutoscalingContext

* update the hyperlink of api-conventions.md file in comments

* Support scaling up node groups to the configured min size if needed

* Fix: add missing RBAC permissions to magnum examples

Adding permissions to the ClusterRole in the example to avoid the error
messages.

* make spellchecker happy

* Changing deletion logic to rely on a new helper method in ClusterStateRegistry, and remove old complicated logic. Adjust the naming of the method for cloud instance deletion from NodeExists to HasInstance.

* Fix VPA deployment

Use `kube-system` namespace for ServiceAccounts like it did before kubernetes#5268

* Don't say that `Recreate` and `Auto` VPA modes are experimental

* Fixing go formatting issue in cloudstack cloud provider code.

* Add missing cloud providers to readme and sort alphabetically

Signed-off-by: Marcus Noble <[email protected]>

* huawei-cloudprovider: enable taints resolve for as, modify the example yaml to accelerate node scale-down

Signed-off-by: jwcesign <[email protected]>

* Update cluster-autoscaler/README.md

Co-authored-by: Guy Templeton <[email protected]>

* cluster-autoscaler: refactor BalanceScaleUpBetweenGroups

* Allow forking snapshot more than 1 time

* Fork ClusterSnapshot in UpdateClusterState

* add logging information to FAQ

this change adds a section about how to increase the logging verbosity
and why you might want to do that.

* fix(cluster-autoscaler/hetzner): pre-existing volumes break scheduling

The `hcloud-csi-driver` v1.x uses the label `csi.hetzner.cloud/location`
for topology. This label was not added in the response to
`n.TemplateNodeInfo()`, causing cluster-autoscaler to not consider any
node group for scaling when a pre-existing volume was attached to the
pending pod.

This is fixed by adding the appropriately named label to the `NodeInfo`.
In practice this label is added by the `hcloud-csi-driver`.

In the upcoming v2 of the driver we migrated to using
`apiv1.LabelZoneRegionStable` for topology constraints, but this fix is
still required so customers do not have to re-create all `PersistentVolumes`.

Further details on the bug are available in the original issue:
hetznercloud/csi-driver#302

* Added RBAC Permission to Azure.

* Log node group min and current size when skipping scale down

* Use scheduling package in filterOutSchedulable processor

* Check owner reference in scale down planner to avoid double-counting
already deleted pods.

* Add note regarding GPU label for the CAPI provider

cluster-autoscaler takes into consideration the time that a node takes
to initialise a GPU resource on a node, as long as a particular label is
in place.  This label differs from provider to provider, and is
documented in some cases but not for CAPI.

This commit adds a note with the specific label that should be applied
when a node is instantiated.

* chore(cluster-autoscaler/hetzner): add myself to OWNERS file

* Use ScaleDownSetProcessor.GetNodesToRemove in scale down planner to
filter  NodesToDelete.

* Handle pagination when looking through supported shapes.

* Add OCI API files to handle OCI work-request operations.

* Fail fast if OCI instance pool is out of capacity/quota.

* update vendor to v1.26.0-rc.1

* fix issue 5332

* Deprecate v1beta1 API

v1beta2 API was introduced in kubernetes#1668, it's present in VPA
[0.4.0](https://github.com/kubernetes/autoscaler/tree/vertical-pod-autoscaler-0.4.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1beta2)
but not in
[0.3.1](https://github.com/kubernetes/autoscaler/tree/vertical-pod-autoscaler-0.3.1/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1beta2).

I added comments to vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1beta2/types.go

I generated changes to
`vertical-pod-autoscaler/deploy/vpa-v1-crd-gen.yaml` with
`vertical-pod-autoscaler/hack/generate-crd-yaml.sh`

* Add note about `v1beta2` deprecation to README

* fix issue 5332 - adding suggested change

* Break node categorization in scale down planner on timeout.

* Automatically label cluster-autoscaler PRs

* Add missing dot

* fix generate ec2 instance types

* Introduce a formal policy for maintaining cloudproviders

The policy largely codifies what we've already been doing for years
(including the requirements we've already imposed on new providers).

* Introduce Cloudprovider Maintenance Request to policy

* feat(helm): add rancher cloud config support

Autoscaler 1.25.0 adds "rancher" cloud provider support, it requires setting cloudConfigPath. If the user mounts this as a secret and sets this value appropriately, this change sets the argument required to point to the mounted secret. Previously, this was only set if cloud provider was magnum or aws.

* Updating error messaging and fallback behavior of hasCloudProviderInstance. Changing deletedNodes to store empty struct instead of node values, and modifying the helper function to utilize that information for tests.

* Fixing helper function to simplify for loop to retrieve deleted node names.

* Use PdbRemainingDisruptions in Planner

* Put risky NodeToRemove in the end of needDrain list

* Auto Label Helm Chart PRs

* psp_api

* Create a Planner object if --parallelDrain=true

* Export execution_latency_seconds metric from VPA admission controller

Sometimes I see admissions that are slower than the rest. Logs indicate that
`AdmissionServer.admit` doesn't get slow (it's the only part with logging). I'd like
to have a metric which will tell us what's slow so that we can maybe improve
that.

* aws: add nodegroup name to default labels

* Fix int formatting in threshold_based_limiter logs

* rancher-cloudprovider: Improve node group discovery

Previously the rancher provider tried to parse the node `spec.providerID`
to extract the node group name. Instead, we now get the machines by the
node name and then use a rancher specific label that should always be
on the machine. This should work more reliably for all the different
node drivers that rancher supports.

Signed-off-by: Cyrill Troxler <[email protected]>

* Don't add pods from drained nodes in scale-down

* Add default PodListProcessor wrapper

* Add currently drained pods before scale-up

* set cluster_autoscaler_max_nodes_count dynamically

Signed-off-by: yasin.lachiny <[email protected]>

* fix(helm): bump chart ver -> 9.21.1

* CA - AWS - Update Hardcoded Instance Details List to 11-12-2022

* Add x13n to cluster autoscaler approvers

* update prometheus metric min maxNodesCount and a.MaxNodesTotal

Signed-off-by: yasin.lachiny <[email protected]>

* CA - AWS - Update Docs all actions IAM policy

* Cluster Autoscaler: update vendor to k8s v1.26.0

* removed dotimports from framework.go

* fixed another dotimport

* add missing vpa vendor,e2e/vendor to sync branch

* removed old files from vpa vendor to fix test

---------

Signed-off-by: Anton Kurbatov <[email protected]>
Signed-off-by: Vishal Anarse <[email protected]>
Signed-off-by: Denis Romanenko <[email protected]>
Signed-off-by: Mayeul Blanzat <[email protected]>
Signed-off-by: jwcesign <[email protected]>
Signed-off-by: Marcus Noble <[email protected]>
Signed-off-by: Cyrill Troxler <[email protected]>
Signed-off-by: yasin.lachiny <[email protected]>
Co-authored-by: Daniel Kłobuszewski <[email protected]>
Co-authored-by: Kubernetes Prow Robot <[email protected]>
Co-authored-by: Joachim Bartosik <[email protected]>
Co-authored-by: Damir Markovic <[email protected]>
Co-authored-by: Shubham Kuchhal <[email protected]>
Co-authored-by: Marco Voelz <[email protected]>
Co-authored-by: Prachi Gandhi <[email protected]>
Co-authored-by: bdobay <[email protected]>
Co-authored-by: Juan Borda <[email protected]>
Co-authored-by: Fabio Berchtold <[email protected]>
Co-authored-by: Clint Fooken <[email protected]>
Co-authored-by: Jayant Jain <[email protected]>
Co-authored-by: Yaroslava Serdiuk <[email protected]>
Co-authored-by: Flavian <[email protected]>
Co-authored-by: Anton Kurbatov <[email protected]>
Co-authored-by: Fulton Byrne <[email protected]>
Co-authored-by: Michael McCune <[email protected]>
Co-authored-by: Vishal Anarse <[email protected]>
Co-authored-by: Matthias Bertschy <[email protected]>
Co-authored-by: Marcin Wielgus <[email protected]>
Co-authored-by: David Benque <[email protected]>
Co-authored-by: Denis Romanenko <[email protected]>
Co-authored-by: Mayeul Blanzat <[email protected]>
Co-authored-by: Alexandru Matei <[email protected]>
Co-authored-by: Clint <[email protected]>
Co-authored-by: jwcesign <[email protected]>
Co-authored-by: Thomas Hartland <[email protected]>
Co-authored-by: Ori Hoch <[email protected]>
Co-authored-by: Joel Smith <[email protected]>
Co-authored-by: Paco Xu <[email protected]>
Co-authored-by: Aleksandra Gacek <[email protected]>
Co-authored-by: Marco Voelz <[email protected]>
Co-authored-by: Bartłomiej Wróblewski <[email protected]>
Co-authored-by: hangcui <[email protected]>
Co-authored-by: Xintong Liu <[email protected]>
Co-authored-by: GanjMonk <[email protected]>
Co-authored-by: Marcus Noble <[email protected]>
Co-authored-by: Marcus Noble <[email protected]>
Co-authored-by: Guy Templeton <[email protected]>
Co-authored-by: Michael Grosser <[email protected]>
Co-authored-by: Julian Tölle <[email protected]>
Co-authored-by: Nick Jones <[email protected]>
Co-authored-by: jesse.millan <[email protected]>
Co-authored-by: Jordan Liggitt <[email protected]>
Co-authored-by: McGonigle, Neil <[email protected]>
Co-authored-by: Anton Khizunov <[email protected]>
Co-authored-by: Maciek Pytel <[email protected]>
Co-authored-by: Basit Mustafa <[email protected]>
Co-authored-by: xval2307 <[email protected]>
Co-authored-by: yznima <[email protected]>
Co-authored-by: Cyrill Troxler <[email protected]>
Co-authored-by: yasin.lachiny <[email protected]>
Co-authored-by: Kuba Tużnik <[email protected]>
yaroslava-serdiuk pushed a commit to yaroslava-serdiuk/autoscaler that referenced this pull request Feb 22, 2024
Add more details and links related to the project
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged.
3 participants