GT-509 Remove scale_down_candidate annotation
jwierzbo committed Nov 3, 2023
1 parent 70d22ed commit 2fcdfee
Showing 10 changed files with 20 additions and 139 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -15,6 +15,7 @@
- (Feature) Add ArangoMember Message and extend ArangoMember CRD
- (Documentation) Use OpenAPI-compatible type names in docs
- (Improvement) Use agency cache lock in metrics exporter
- (Maintenance) Remove `scale_down_candidate` annotation

## [1.2.34](https://github.com/arangodb/kube-arangodb/tree/1.2.34) (2023-10-16)
- (Bugfix) Fix make manifests-crd-file command
26 changes: 15 additions & 11 deletions docs/scaling.md
@@ -1,17 +1,15 @@
# Scaling your ArangoDB deployment

The ArangoDB Kubernetes Operator supports up and down scaling of
the number of DB-Servers & Coordinators.
The ArangoDB Kubernetes Operator allows you to easily scale the number of DB-Servers and Coordinators up or down as needed.

The scale up or down, change the number of servers in the custom
resource.
To scale up or down, change the number of servers in the custom resource.

E.g. change `spec.dbservers.count` from `3` to `4`.
E.g., change `spec.dbservers.count` from `3` to `4`.
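For illustration, the relevant excerpt of such a custom resource might look like this (a minimal sketch; the deployment name and surrounding fields are placeholders):

```yaml
apiVersion: database.arangodb.com/v1
kind: ArangoDeployment
metadata:
  name: my-deployment        # placeholder name
spec:
  mode: Cluster
  dbservers:
    count: 4                 # raised from 3; the operator creates the missing member
```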

Then apply the updated resource using:

```bash
kubectl apply -f yourCustomResourceFile.yaml
kubectl apply -f {your-arango-deployment}.yaml
```

Inspect the status of the custom resource to monitor the progress of the scaling operation.
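One way to do this is to watch the resource itself (a sketch; `my-deployment` is a placeholder):

```bash
# Watch the deployment resource while the operator reconciles the new count.
kubectl get arangodeployment my-deployment -o yaml -w
```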
@@ -25,18 +25,23 @@ Make sure to specify the desired number when creating CR first time.
### Scale-up

When increasing the `count`, the operator will try to create the missing pods.
When scaling up, make sure that you have enough computational resources / nodes, otherwise pod will stuck in Pending state.
When scaling up, make sure that you have enough computational resources/nodes; otherwise, the pod will be stuck in the Pending state.
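A quick way to spot that situation, sketched under the assumption that the operator labels its pods with `arango_deployment=<deployment-name>`:

```bash
# List this deployment's pods that are stuck in Pending.
kubectl get pods -l arango_deployment=my-deployment \
  --field-selector=status.phase=Pending
```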


### Scale-down

Scaling down is always done 1 server at a time.
Scaling down is always done one server at a time.

Scaling down is possible only when all other actions on the ArangoDeployment are finished.

The internal process followed by the ArangoDB operator when scaling up is as follows:
- It chooses a member to be evicted. First it will try to remove unhealthy members or fall-back to the member with highest deletion_priority.
The internal process followed by the ArangoDB operator when scaling down is as follows:
- It chooses a member to be evicted. First, it will try to remove unhealthy members, or fall back to the member with
the highest `deletion_priority` (see [Use deletion_priority to control scale-down order](#use-deletion_priority-to-control-scale-down-order)).
- Using internal calls, it forces the server to resign leadership.
For DB-Servers, this means that all shard leaders are switched to other servers.
- Wait until server is cleaned out from cluster.
- It waits until the server is cleaned out from the cluster.
- The Pod is finalized.
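While these steps run, the member phases can be followed in the deployment status (a hedged sketch; the field path assumes the usual `status.members` layout):

```bash
# Follow the DB-Server member phases while a member is drained and removed.
kubectl get arangodeployment my-deployment \
  -o jsonpath='{.status.members.dbservers[*].phase}{"\n"}' --watch
```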

#### Use deletion_priority to control scale-down order

You can use the `.spec.deletion_priority` field in the `ArangoMember` CR to control the order in which servers are scaled down.
Refer to the [ArangoMember API Reference](/docs/api/ArangoMember.V1.md#specdeletionpriority-integer) for more details.
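For example, a member can be preferred for the next scale-down by giving it the highest priority (a sketch: the member name is hypothetical, since `ArangoMember` resources are generated by the operator, and the exact field spelling should be verified against the API reference linked above):

```bash
# Prefer this member for the next scale-down by giving it the highest
# deletion priority (hypothetical member name; verify the field spelling
# against the ArangoMember API reference).
kubectl patch arangomember my-deployment-dbserver-abc123 --type=merge \
  -p '{"spec":{"deletionPriority":10}}'
```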
3 changes: 0 additions & 3 deletions pkg/apis/deployment/annotations.go
@@ -28,7 +28,4 @@ const (
ArangoDeploymentPodReplaceAnnotation = ArangoDeploymentAnnotationPrefix + "/replace"
ArangoDeploymentPodDeleteNow = ArangoDeploymentAnnotationPrefix + "/delete_now"
ArangoDeploymentPlanCleanAnnotation = "plan." + ArangoDeploymentAnnotationPrefix + "/clean"

// Deprecated: use ArangoMemberSpec.DeletionPriority instead
ArangoDeploymentPodScaleDownCandidateAnnotation = ArangoDeploymentAnnotationPrefix + "/scale_down_candidate"
)
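For context, the removed annotation was applied to a member's pod roughly like this (a hedged sketch, assuming `ArangoDeploymentAnnotationPrefix` resolves to `deployment.arangodb.com`; the pod name is hypothetical):

```bash
# Previously (now removed): hint that a member should be picked first on
# scale-down. Only the presence of the key mattered, not its value.
kubectl annotate pod my-deployment-dbserver-abc123 \
  deployment.arangodb.com/scale_down_candidate=""
```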
3 changes: 2 additions & 1 deletion pkg/apis/deployment/v1/conditions.go
@@ -76,7 +76,8 @@ const (
ConditionTypeMemberVolumeUnschedulable ConditionType = "MemberVolumeUnschedulable"
// ConditionTypeMarkedToRemove indicates that the member is marked to be removed.
ConditionTypeMarkedToRemove ConditionType = "MarkedToRemove"
// ConditionTypeScaleDownCandidate indicates that the member will be picked in ScaleDown operaion.
// ConditionTypeScaleDownCandidate indicates that the member will be picked in a ScaleDown operation (currently not used).
// @deprecated will be removed in 1.3.0
ConditionTypeScaleDownCandidate ConditionType = "ScaleDownCandidate"
// ConditionTypeUpgradeFailed indicates that upgrade failed
ConditionTypeUpgradeFailed ConditionType = "UpgradeFailed"
3 changes: 2 additions & 1 deletion pkg/apis/deployment/v1/member_status_list.go
@@ -139,8 +139,9 @@ func (l *MemberStatusList) removeByID(id string) error {
type MemberToRemoveSelector func(m MemberStatusList) (string, error)

// SelectMemberToRemove selects a member from the given list that should
// be removed in a scale down action.
// be removed in a ScaleDown action.
// Returns an error if the list is empty.
// Deprecated: will be removed in 1.3.0, since the ScaleDown annotation has already been removed
func (l MemberStatusList) SelectMemberToRemove(selectors ...MemberToRemoveSelector) (MemberStatus, error) {
if len(l) > 0 {
// Try to find member with phase to be removed
2 changes: 0 additions & 2 deletions pkg/apis/deployment/v2alpha1/conditions.go
@@ -76,8 +76,6 @@ const (
ConditionTypeMemberVolumeUnschedulable ConditionType = "MemberVolumeUnschedulable"
// ConditionTypeMarkedToRemove indicates that the member is marked to be removed.
ConditionTypeMarkedToRemove ConditionType = "MarkedToRemove"
// ConditionTypeScaleDownCandidate indicates that the member will be picked in ScaleDown operaion.
ConditionTypeScaleDownCandidate ConditionType = "ScaleDownCandidate"
// ConditionTypeUpgradeFailed indicates that upgrade failed
ConditionTypeUpgradeFailed ConditionType = "UpgradeFailed"
// ConditionTypeArchitectureMismatch indicates that the member has a different architecture than the deployment.
62 changes: 0 additions & 62 deletions pkg/apis/deployment/v2alpha1/member_status_list.go
@@ -26,7 +26,6 @@ import (

core "k8s.io/api/core/v1"

"github.com/arangodb/kube-arangodb/pkg/util"
"github.com/arangodb/kube-arangodb/pkg/util/errors"
)

@@ -138,67 +137,6 @@ func (l *MemberStatusList) removeByID(id string) error {

type MemberToRemoveSelector func(m MemberStatusList) (string, error)

// SelectMemberToRemove selects a member from the given list that should
// be removed in a scale down action.
// Returns an error if the list is empty.
func (l MemberStatusList) SelectMemberToRemove(selectors ...MemberToRemoveSelector) (MemberStatus, error) {
if len(l) > 0 {
// Try to find member with phase to be removed
for _, m := range l {
if m.Conditions.IsTrue(ConditionTypeMarkedToRemove) {
return m, nil
}
}
for _, m := range l {
if m.Conditions.IsTrue(ConditionTypeScaleDownCandidate) {
return m, nil
}
}
// Try to find a not ready member
for _, m := range l {
if m.Phase.IsPending() {
return m, nil
}
}
for _, m := range l {
if !m.Conditions.IsTrue(ConditionTypeReady) {
return m, nil
}
}
for _, m := range l {
if m.Conditions.IsTrue(ConditionTypeCleanedOut) {
return m, nil
}
}

// Run conditional picker
for _, selector := range selectors {
if selector == nil {
continue
}
if m, err := selector(l); err != nil {
return MemberStatus{}, err
} else if m != "" {
if member, ok := l.ElementByID(m); ok {
return member, nil
} else {
return MemberStatus{}, errors.Newf("Unable to find member with id %s", m)
}
}
}

// Pick a random member that is in created state
perm := util.Rand().Perm(len(l))
for _, idx := range perm {
m := l[idx]
if m.Phase == MemberPhaseCreated {
return m, nil
}
}
}
return MemberStatus{}, errors.WithStack(errors.Wrap(NotFoundError, "No member available for removal"))
}

// MembersReady returns the number of members that are in the Ready state.
func (l MemberStatusList) MembersReady() int {
readyCount := 0
1 change: 0 additions & 1 deletion pkg/deployment/reconcile/plan_builder_high.go
@@ -64,7 +64,6 @@ func (r *Reconciler) createHighPlan(ctx context.Context, apiObject k8sutil.APIOb
ApplyIfEmpty(r.updateMemberConditionTypeMemberVolumeUnschedulableCondition).
ApplyIfEmpty(r.createRebalancerCheckPlanCore).
ApplyIfEmpty(r.createMemberFailedRestoreHighPlan).
ApplyIfEmpty(r.scaleDownCandidate).
ApplyIfEmpty(r.volumeMemberReplacement).
ApplyWithBackOff(BackOffCheck, time.Minute, r.emptyPlanBuilder)).
ApplyIfEmptyWithBackOff(TimezoneCheck, time.Minute, r.createTimezoneUpdatePlan).
45 changes: 0 additions & 45 deletions pkg/deployment/reconcile/plan_builder_scale.go
@@ -23,7 +23,6 @@ package reconcile
import (
"context"

"github.com/arangodb/kube-arangodb/pkg/apis/deployment"
api "github.com/arangodb/kube-arangodb/pkg/apis/deployment/v1"
"github.com/arangodb/kube-arangodb/pkg/deployment/actions"
"github.com/arangodb/kube-arangodb/pkg/deployment/reconcile/shared"
@@ -166,47 +165,3 @@ func (r *Reconciler) createReplaceMemberPlan(ctx context.Context, apiObject k8su
func filterScaleUP(a api.Action) bool {
return a.Type == api.ActionTypeAddMember
}

func (r *Reconciler) scaleDownCandidate(ctx context.Context, apiObject k8sutil.APIObject,
spec api.DeploymentSpec, status api.DeploymentStatus,
context PlanBuilderContext) api.Plan {
var plan api.Plan

for _, m := range status.Members.AsList() {
cache, ok := context.ACS().ClusterCache(m.Member.ClusterID)
if !ok {
continue
}

annotationExists := false

am, ok := cache.ArangoMember().V1().GetSimple(m.Member.ArangoMemberName(context.GetName(), m.Group))
if !ok {
continue
}

//nolint:staticcheck
if _, ok := am.Annotations[deployment.ArangoDeploymentPodScaleDownCandidateAnnotation]; ok {
annotationExists = true
}

if pod, ok := cache.Pod().V1().GetSimple(m.Member.Pod.GetName()); ok {
//nolint:staticcheck
if _, ok := pod.Annotations[deployment.ArangoDeploymentPodScaleDownCandidateAnnotation]; ok {
annotationExists = true
}
}

conditionExists := m.Member.Conditions.IsTrue(api.ConditionTypeScaleDownCandidate)

if annotationExists != conditionExists {
if annotationExists {
plan = append(plan, shared.UpdateMemberConditionActionV2("Marked as ScaleDownCandidate", api.ConditionTypeScaleDownCandidate, m.Group, m.Member.ID, true, "Marked as ScaleDownCandidate", "", ""))
} else {
plan = append(plan, shared.RemoveMemberConditionActionV2("Unmarked as ScaleDownCandidate", api.ConditionTypeScaleDownCandidate, m.Group, m.Member.ID))
}
}
}

return plan
}
13 changes: 0 additions & 13 deletions pkg/deployment/reconcile/plan_builder_scale_funcs.go
@@ -28,7 +28,6 @@ import (
func planBuilderScaleDownFilter(context PlanBuilderContext, status api.DeploymentStatus, group api.ServerGroup, in api.MemberStatusList) (api.MemberStatus, error) {
return NewScaleFilter(context, status, group, in).
Filter(planBuilderScaleDownSelectMarkedToRemove).
Filter(planBuilderScaleDownSelectScaleDownCandidateCondition).
Filter(planBuilderScaleDownSelectCleanedOutCondition).
Filter(planBuilderScaleDownCleanedServers).
Filter(planBuilderScaleDownToBeCleanedServers).
@@ -107,18 +106,6 @@ func planBuilderScaleDownSelectCleanedOutCondition(context PlanBuilderContext, s
return r, len(r) > 0, nil
}

func planBuilderScaleDownSelectScaleDownCandidateCondition(context PlanBuilderContext, status api.DeploymentStatus, group api.ServerGroup, in api.MemberStatusList) (api.MemberStatusList, bool, error) {
r := make(api.MemberStatusList, 0, len(in))

for _, el := range in {
if el.Conditions.IsTrue(api.ConditionTypeScaleDownCandidate) {
r = append(r, el)
}
}

return r, len(r) > 0, nil
}

func planBuilderScaleDownCleanedServers(context PlanBuilderContext, status api.DeploymentStatus, group api.ServerGroup, in api.MemberStatusList) (api.MemberStatusList, bool, error) {
if group != api.ServerGroupDBServers {
return nil, false, nil
