Swap related work and requirements

timebertt committed Jan 2, 2024
1 parent fa8b12b commit 86d5cd8

Showing 3 changed files with 51 additions and 114 deletions.
7 changes: 3 additions & 4 deletions content/35-related-work.md → content/30-related-work.md
# Related Work

- describe related work for scaling controllers horizontally
- present existing mechanisms in other community projects
- analyze advantages and drawbacks of these mechanisms

## Study Project

Summary:

- implementation on controller-side
- implementation in controller-runtime, can be reused in other controllers based on controller-runtime
- cannot be reused for controllers not based on controller-runtime, or written in other programming languages
- watches are restricted to shard
- CPU and memory usage are distributed
- sharder controller required
29 changes: 4 additions & 25 deletions content/30-requirements.md → content/35-requirements.md
- distribute work across multiple instances
- use sharding mechanisms to resolve limitations and fulfill requirements of horizontally scalable controllers
- refer to scalability definition ([@sec:kubernetes-scalability]): once work can be distributed, adding a new instance yields higher load capacity

## Requirements
- thesis augments the requirements of the study project and enhances the existing design and implementation accordingly
- basic requirements are already fulfilled in the study project
- the study project still has scalability limitations
- based on these limitations, new requirements are added to resolve them

Refer to requirements from study project:
[@studyproject]
Extended requirements:
- e.g., introduce external dependencies/infrastructure like event queue or message broker
- brings additional operational complexity, decreases comprehensibility, makes it harder to reason about
- conflicts with req. 6: external dependencies make it harder to reuse in arbitrary controllers

129 changes: 44 additions & 85 deletions content/40-design.md
- design based on study project
- evolve design to address extended requirements

## Overview
## Sharding Events

Analyze the design of the study project: which events are handled by the sharding mechanism,
and which actions need to be performed for them:

- evt. 1: new object is created or object is drained (drain and shard label are not present)
  - object is unassigned, assign directly
  - if no shard is available, no assignment is performed (handled later on by evt. 2)

- evt. 2: new shard becomes available
  - determine objects that should be assigned to new shard
  - if object is not assigned yet, assign directly
  - if object is assigned to unavailable shard, assign directly
  - if object is assigned to available shard, drain object

- evt. 3: existing shard becomes unavailable
  - determine objects that are assigned to shard
  - assign all objects to another shard directly
  - if no shard is available, unassign objects OR no assignment is performed? (handled by evt. 2)
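All three events reduce to the same question: which available shard should own a given object? A minimal consistent-hashing sketch of that decision follows; names, the hash function, and the number of virtual points per shard are illustrative assumptions, not the study project's actual implementation.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// hashRing is a minimal consistent-hash ring: each shard is placed on the
// ring at several virtual points; an object is owned by the first shard
// point at or after the object's own hash (wrapping around).
type hashRing struct {
	points []uint32          // sorted hashes of virtual shard points
	shards map[uint32]string // point hash -> shard name
}

func hashKey(key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32()
}

func newHashRing(shards []string, virtualPerShard int) *hashRing {
	r := &hashRing{shards: map[uint32]string{}}
	for _, s := range shards {
		for i := 0; i < virtualPerShard; i++ {
			h := hashKey(fmt.Sprintf("%s-%d", s, i))
			r.points = append(r.points, h)
			r.shards[h] = s
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// assign returns the shard responsible for the given object key, or "" if no
// shard is available (evt. 1: the assignment is then deferred to evt. 2).
func (r *hashRing) assign(objectKey string) string {
	if len(r.points) == 0 {
		return ""
	}
	h := hashKey(objectKey)
	// binary search for the first virtual point >= h, wrapping to the start
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.shards[r.points[i]]
}

func main() {
	ring := newHashRing([]string{"shard-a", "shard-b", "shard-c"}, 100)
	fmt.Println(ring.assign("namespace/object-1"))
}
```

Consistent hashing matters for evt. 2 and evt. 3: when a shard joins or leaves, only the objects mapped to its ring segments move, instead of reshuffling all assignments.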

## Overview

How to address extended requirements:

- generalization (req. 6): independent from controller framework and programming language
  - addressed in step 1 ([@sec:design-external])
  - move partitioning, assignment, coordination logic to external sharder
  - design how to configure which objects should be sharded

- constant overhead (req. 7): required design/implementation enhancements:
  - addressed in step 2 ([@sec:design-admission])
  - reduce memory overhead by sharder
    - eliminate cache for sharded objects (grows with the number of sharded objects)
    - consider required actions again
    - object cache was only needed to detect evt. 1
    - find different mechanism to trigger assignments
  - reduce API request volume caused by assignments and coordination
    - during creation: two requests are used for creation and initial assignment
    - during drain: three requests are used for starting drain, acknowledging drain, and reassignment
  - non-goal: reduce API request volume of membership and failure detection
    - keep lease-based membership

## External Sharder {#sec:design-external}

Goals:

- address req. 6: generalization
- will not reduce CPU/mem overhead, only move it to an external component
- will not reduce API request volume

Problems:
- controller-side still has to comply with drain label
  - must only be implemented once in the controller framework, is acceptable
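The controller-side drain contract could look roughly as follows; the label keys and function name are assumed for illustration. The point is that when the sharder requests a drain, the controller stops reconciling the object and removes both labels in one update, which hands the object back to the sharder for reassignment.

```go
package main

import "fmt"

// Label keys used by the sharding mechanism; the concrete names are assumed
// here for illustration.
const (
	labelShard = "shard"
	labelDrain = "drain"
)

// acknowledgeDrain sketches the controller-side drain contract: if the sharder
// has set the drain label on an object assigned to this shard, the controller
// must stop reconciling it and remove both labels so the sharder can reassign
// the object. It returns the mutated label set and whether a drain happened.
func acknowledgeDrain(labels map[string]string, shard string) (map[string]string, bool) {
	if labels[labelShard] != shard {
		return labels, false // object is not assigned to this shard
	}
	if _, draining := labels[labelDrain]; !draining {
		return labels, false // no drain requested
	}
	out := make(map[string]string, len(labels))
	for k, v := range labels {
		if k != labelShard && k != labelDrain {
			out[k] = v
		}
	}
	return out, true
}

func main() {
	labels := map[string]string{labelShard: "shard-a", labelDrain: "true", "app": "example"}
	drained, ok := acknowledgeDrain(labels, "shard-a")
	fmt.Println(drained, ok) // prints "map[app:example] true"
}
```

Since this logic lives in the controller framework (e.g. controller-runtime), it only has to be implemented once per framework, which is why the remaining controller-side responsibility is acceptable.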

## Assignments in Admission {#sec:design-admission}

Goals:

- address req. 7: constant overhead
- reduce CPU/mem overhead
- reduce API request volume

Ideas:

- move sharder controllers to controller manager or dedicated components
- shard labels are added to objects during admission: either in admission plugin or webhook
- when ring state changes, controller triggers reassignment or drain on all relevant objects
- admission handles event 1 (new object or object drained)
  - handles object-related events that can be detected solely from mutating API requests to objects
  - currently, watch events (~cache) for the sharded objects are used for this
  - with assignments in admission, watches and caches can be dropped
  - webhook adds significant latency to mutating API requests
    - only needs to act on unassigned objects -> add object selector
- controller handles event 2 and 3 (ring state changes)
  - handles ring-related events that can be detected solely from watch events for leases
  - sharder controller doesn't need to watch objects, only needs a watch for leases
- event 2 (new shard)
  - list all objects and determine desired shard
  - add drain label to all objects that are not assigned to desired shard
- event 3 (dead shard)
  - list all objects assigned to dead shard
  - reassign all objects immediately
- controller might interfere with itself (might act on a single object concurrently) -> use optimistic locking for all object mutations
- controller and admission view on ring state could be slightly out of date
  - objects might end up on "wrong" shards
    - event 1: new shard just got ready, not observed by admission, new object incorrectly assigned to another shard
    - event 2: sharder drains object, controller removes drain/shard label, admission assigns to the same shard again
  - might be acceptable
    - objects are only assigned to available shards
    - single responsible shard is guaranteed
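The optimistic-locking rule from the list above can be sketched against a toy in-memory store standing in for the API server (types and names are invented for illustration): every label mutation carries the resourceVersion it was computed from and is rejected with a conflict if the object changed in the meantime, so concurrent sharder actions on the same object cannot overwrite each other.

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("conflict: object was modified")

// object and store form a tiny in-memory stand-in for the API server, just
// enough to illustrate optimistic locking via resourceVersion preconditions.
type object struct {
	resourceVersion int
	labels          map[string]string
}

type store struct {
	objects map[string]*object
}

// patchLabels applies the label changes only if the caller's observed
// resourceVersion still matches; on conflict the caller must re-read the
// object, recompute the assignment, and retry.
func (s *store) patchLabels(name string, observedVersion int, set map[string]string) error {
	obj, ok := s.objects[name]
	if !ok {
		return errors.New("not found")
	}
	if obj.resourceVersion != observedVersion {
		return errConflict
	}
	for k, v := range set {
		obj.labels[k] = v
	}
	obj.resourceVersion++
	return nil
}

func main() {
	s := &store{objects: map[string]*object{
		"obj-1": {resourceVersion: 1, labels: map[string]string{}},
	}}
	fmt.Println(s.patchLabels("obj-1", 1, map[string]string{"shard": "shard-a"})) // succeeds
	fmt.Println(s.patchLabels("obj-1", 1, map[string]string{"shard": "shard-b"})) // stale version: conflict
}
```

The real Kubernetes API server enforces the same semantics when a patch or update carries `metadata.resourceVersion`, returning an HTTP 409 Conflict on mismatch.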
Summary:
- trades resource overhead (object cache) for a few API requests (lists) and latency (webhook)
- latency can be reduced with object selector and/or by moving to admission plugin
- reduces API request volume a bit because drain and new assignment are now combined into a single API request

<!--
## Approach 1: Transparent Assignments
Goals:
- reduce CPU/mem overhead
- reduce API request volume
Ideas:
- move lease controller to controller manager as in step 1
- teach API server to calculate assignments in watch cache, piggy-back on caches -> reduce resource overhead
- make assignments transparent, don't persist in etcd -> reduce API request volume
- no resource version bumps, no watch events!?
- how are watch events triggered on assignment changes?
- investigate how CR of CRDs handle this
- custom resource watch terminates when CRD spec/schema changes
- terminating the watch connection would cause a re-list
- terminating watches on assignment changes is not enough
- controller will restart the watch with the last observed resource version
- without bumps to resource version, there will be no new watch events
- we still can't be sure if the controller observed the change
- preventing concurrency: how is drain handled?
- reassignment could send a `DELETE` event just like a label change on watches with label selector
- API servers need to ensure that the client observed the change
- client sends assignment label back to API server in patch/update request as prerequisite
- request is rejected with a conflict error if assignment doesn't match (similar to optimistic locking)
- doesn't work on owned objects
- assignments and coordination must be consistent across API server instances
-->

<!--
## Approach 3: Slot-based Assignments
Goals:
- reduce API request volume
- reduce CPU/mem overhead?
Ideas:
- move all sharder controllers to controller manager
- assign objects in bulk like in redis
- don't persist assignments on object labels
- persist slot assignments on leases?
- preventing concurrency requires controllers to look up slot assignments in object reconciliation
- hard to achieve consistency with this?
- watch per slot? -> significant implementation effort in every language/framework
- lease per slot
- too high request volume for coordination (ref knative)
- shard needs to acknowledge slot movement
- include some kind of observed generation number of assignments in regular lease updates
-->
