Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New expander: priority expander #1801

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions cluster-autoscaler/.gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -8,3 +8,4 @@ cluster_autoscaler
*.un~
Session.vim
.netrwhist
.vscode
7 changes: 5 additions & 2 deletions cluster-autoscaler/FAQ.md
Original file line number Diff line number Diff line change
Expand Up @@ -590,7 +590,7 @@ new nodes will be added.
Expanders can be selected by passing the name to the `--expander` flag, i.e.
`./cluster-autoscaler --expander=random`.

Currently Cluster Autoscaler has 4 expanders:
Currently Cluster Autoscaler has 5 expanders:

* `random` - this is the default expander, and should be used when you don't have a particular
need for the node groups to scale differently.
Expand All @@ -607,12 +607,15 @@ after scale-up. This is useful when you have different classes of nodes, for exa
would match the cluster size. This expander is described in more details
[HERE](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/pricing.md). Currently it works only for GCE and GKE (patches welcome.)

* `priority` - selects the node group that has the highest priority assigned by the user. It's configuration is described in more details [here](expander/priority/readme.md)

### Does CA respect node affinity when selecting node groups to scale up?

CA respects `nodeSelector` and `requiredDuringSchedulingIgnoredDuringExecution` in nodeAffinity given that you have labelled your node groups accordingly. If there is a pod that cannot be scheduled with either `nodeSelector` or `requiredDuringSchedulingIgnoredDuringExecution` specified, CA will only consider node groups that satisfy those requirements for expansion.

However, CA does not consider "soft" constraints like `preferredDuringSchedulingIgnoredDuringExecution` when selecting node groups. That means that if CA has two or more node groups available for expansion, it will not use soft constraints to pick one node group over another.

************
****************

### What are the parameters to CA?

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -63,8 +63,8 @@ rules:
verbs: ["create"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["delete","get","update"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verbs: ["delete","get","update","watch"]

---
apiVersion: rbac.authorization.k8s.io/v1beta1
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -63,8 +63,8 @@ rules:
verbs: ["create"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["delete","get","update"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
verbs: ["delete","get","update","watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -63,8 +63,8 @@ rules:
verbs: ["create"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["delete","get","update"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
verbs: ["delete","get","update","watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -63,8 +63,8 @@ rules:
verbs: ["create"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["delete","get","update"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
verbs: ["delete","get","update","watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -63,8 +63,8 @@ rules:
verbs: ["create"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["delete","get","update"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
verbs: ["delete","get","update","watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -66,8 +66,8 @@ rules:
verbs: ["create"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["delete","get","update"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
verbs: ["delete","get","update","watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -66,8 +66,8 @@ rules:
verbs: ["create"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["delete","get","update"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
verbs: ["delete","get","update","watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -66,8 +66,8 @@ rules:
verbs: ["create"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["delete","get","update"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
verbs: ["delete","get","update","watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -66,8 +66,8 @@ rules:
verbs: ["create"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["delete","get","update"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
verbs: ["delete","get","update","watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -66,8 +66,8 @@ rules:
verbs: ["create"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["delete","get","update"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
verbs: ["delete","get","update","watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -66,8 +66,8 @@ rules:
verbs: ["create"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["delete","get","update"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
verbs: ["delete","get","update","watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -66,8 +66,8 @@ rules:
verbs: ["create"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["delete","get","update"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
verbs: ["delete","get","update","watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -63,8 +63,8 @@ rules:
verbs: ["create"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["delete","get","update"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
verbs: ["delete","get","update","watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
Expand Down
2 changes: 1 addition & 1 deletion cluster-autoscaler/core/autoscaler.go
Original file line number Diff line number Diff line change
Expand Up @@ -94,7 +94,7 @@ func initializeDefaultOptions(opts *AutoscalerOptions) error {
}
if opts.ExpanderStrategy == nil {
expanderStrategy, err := factory.ExpanderStrategyFromString(opts.ExpanderName,
opts.CloudProvider, opts.AutoscalingKubeClients.AllNodeLister())
opts.CloudProvider, opts.AutoscalingKubeClients, opts.KubeClient, opts.ConfigNamespace)
if err != nil {
return err
}
Expand Down
4 changes: 3 additions & 1 deletion cluster-autoscaler/expander/expander.go
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ import (

var (
// AvailableExpanders is a list of available expander options
AvailableExpanders = []string{RandomExpanderName, MostPodsExpanderName, LeastWasteExpanderName, PriceBasedExpanderName}
AvailableExpanders = []string{RandomExpanderName, MostPodsExpanderName, LeastWasteExpanderName, PriceBasedExpanderName, PriorityBasedExpanderName}
// RandomExpanderName selects a node group at random
RandomExpanderName = "random"
// MostPodsExpanderName selects a node group that fits the most pods
Expand All @@ -34,6 +34,8 @@ var (
// PriceBasedExpanderName selects a node group that is the most cost-effective and consistent with
// the preferred node size for the cluster
PriceBasedExpanderName = "price"
// PriorityBasedExpanderName selects a node group based on a user-configured priorities assigned to group names
PriorityBasedExpanderName = "priority"
)

// Option describes an option to expand the cluster.
Expand Down
17 changes: 13 additions & 4 deletions cluster-autoscaler/expander/factory/expander_factory.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,19 +18,21 @@ package factory

import (
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
"k8s.io/autoscaler/cluster-autoscaler/context"
"k8s.io/autoscaler/cluster-autoscaler/expander"
"k8s.io/autoscaler/cluster-autoscaler/expander/mostpods"
"k8s.io/autoscaler/cluster-autoscaler/expander/price"
"k8s.io/autoscaler/cluster-autoscaler/expander/priority"
"k8s.io/autoscaler/cluster-autoscaler/expander/random"
"k8s.io/autoscaler/cluster-autoscaler/expander/waste"
"k8s.io/autoscaler/cluster-autoscaler/utils/errors"

kube_util "k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes"
kube_client "k8s.io/client-go/kubernetes"
)

// ExpanderStrategyFromString creates an expander.Strategy according to its name
func ExpanderStrategyFromString(expanderFlag string, cloudProvider cloudprovider.CloudProvider,
nodeLister kube_util.NodeLister) (expander.Strategy, errors.AutoscalerError) {
autoscalingKubeClients *context.AutoscalingKubeClients, kubeClient kube_client.Interface,
configNamespace string) (expander.Strategy, errors.AutoscalerError) {
switch expanderFlag {
case expander.RandomExpanderName:
return random.NewStrategy(), nil
Expand All @@ -44,8 +46,15 @@ func ExpanderStrategyFromString(expanderFlag string, cloudProvider cloudprovider
return nil, err
}
return price.NewStrategy(pricing,
price.NewSimplePreferredNodeProvider(nodeLister),
price.NewSimplePreferredNodeProvider(autoscalingKubeClients.AllNodeLister()),
price.SimpleNodeUnfitness), nil
case expander.PriorityBasedExpanderName:
maps := kubeClient.CoreV1().ConfigMaps(configNamespace)
initialPriorities, priorityChangesChan, err := priority.InitPriorityConfigMap(maps, configNamespace)
if err != nil {
return nil, errors.ToAutoscalerError(errors.InternalError, err)
}
return priority.NewStrategy(initialPriorities, priorityChangesChan, autoscalingKubeClients.LogRecorder)
}
return nil, errors.NewAutoscalerError(errors.InternalError, "Expander %s not supported", expanderFlag)
}
73 changes: 73 additions & 0 deletions cluster-autoscaler/expander/priority/configmap_handler.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
/*
Copyright 2016 The Kubernetes Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package priority

import (
"errors"
"fmt"

kube_errors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/fields"
"k8s.io/apimachinery/pkg/watch"
v1 "k8s.io/client-go/kubernetes/typed/core/v1"
"k8s.io/klog"
api "k8s.io/kubernetes/pkg/apis/core"
)

const (
// PriorityConfigMapName defines a name of the ConfigMap used to store priority expander configuration
PriorityConfigMapName = "cluster-autoscaler-priority-expander"
// ConfigMapKey defines the key used in the ConfigMap to configure priorities
ConfigMapKey = "priorities"
)

// InitPriorityConfigMap initializes ConfigMap with priority expander configurations. It checks if the map exists
// and has the correct top level key. If it doesn't, it returns error or Exits. If the value is found,
// the current value is fetched and a Watcher is started to watch for changes. It returns the current value of
// the config map, the channel with value updates and an error.
func InitPriorityConfigMap(maps v1.ConfigMapInterface, namespace string) (string, <-chan watch.Event, error) {
errMsg := ""
priorities := ""

configMap, getStatusError := maps.Get(PriorityConfigMapName, metav1.GetOptions{})
if getStatusError == nil {
priorities = configMap.Data[ConfigMapKey]
} else if kube_errors.IsNotFound(getStatusError) {
errMsg = fmt.Sprintf("Priority expander config map %s/%s not found. You have to create it before starting cluster-autoscaler "+
"with priority expander.", namespace, PriorityConfigMapName)
} else {
errMsg = fmt.Sprintf("Failed to retrieve priority expander configmap %s/%s: %v", namespace, PriorityConfigMapName,
getStatusError)
}
if errMsg != "" {
return "", nil, errors.New(errMsg)
}

watcher, err := maps.Watch(metav1.ListOptions{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Listers already handle watch cache and all necessary error handling and we use those for all other objects. I don't think we should re-implement this functionality. In particular if you want to use watch directly you need to handle channel closing - in both HPA and VPA we've already run into issues caused by watch channel abruptly closing and we needed to implement a retry logic for this case. I don't see such retry logic in this PR.
That being said - unless you have a very strong reason I strongly recommend using a lister or an informer, rather than directly using watch API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know about any problems using watch API - when is the channel closed? Can you point me to some docs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if there is any documentation on the details. IIRC @jbartosik once run into similar issue, so he may provide more details. The general recommendation is to always use higher-level abstractions provided by client-go library, rather than raw client.

FieldSelector: fields.OneTermEqualSelector(api.ObjectNameField, PriorityConfigMapName).String(),
Watch: true,
})
if err != nil {
errMsg = fmt.Sprintf("Error when starting a watcher for changes of the priority expander configmap %s/%s: %v",
namespace, PriorityConfigMapName, err)
klog.Errorf(errMsg)
return "", nil, errors.New(errMsg)
}

return priorities, watcher.ResultChan(), nil
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-autoscaler-priority-expander
data:
priorities: |-
10:
- .*t2\.large.*
- .*t3\.large.*
50:
- .*m4\.4xlarge.*
Loading