From 86d5cd8938f56000f9937997474d8453e96e8fb3 Mon Sep 17 00:00:00 2001
From: Tim Ebert <timebertt@gmail.com>
Date: Tue, 2 Jan 2024 09:34:39 +0100
Subject: [PATCH] Swap related work and requirements

---
 ...{35-related-work.md => 30-related-work.md} |   7 +-
 ...{30-requirements.md => 35-requirements.md} |  29 +---
 content/40-design.md                          | 129 ++++++------------
 3 files changed, 51 insertions(+), 114 deletions(-)
 rename content/{35-related-work.md => 30-related-work.md} (96%)
 rename content/{30-requirements.md => 35-requirements.md} (69%)

diff --git a/content/35-related-work.md b/content/30-related-work.md
similarity index 96%
rename from content/35-related-work.md
rename to content/30-related-work.md
index 65c32cf..df64b6d 100644
--- a/content/35-related-work.md
+++ b/content/30-related-work.md
@@ -1,10 +1,8 @@
 # Related Work
 
-- describe previous work for scaling controllers horizontally
-- thesis will enhance design and implementation of study project
-  - many requirements are already fulfilled
+- describe related work for scaling controllers horizontally
 - present existing mechanisms in other community projects
-- analyze which mechanisms are
+- analyze advantages and drawbacks of mechanisms
 
 ## Study Project
 
@@ -15,6 +13,7 @@ Summary:
 
 - implementation on controller-side
 - implementation in controller-runtime, can be reused in other controllers based on controller-runtime
+  - cannot be reused for controllers that are not based on controller-runtime or are written in other programming languages
 - watches are restricted to shard
   - CPU and memory usage are distributed
 - sharder controller required
diff --git a/content/30-requirements.md b/content/35-requirements.md
similarity index 69%
rename from content/30-requirements.md
rename to content/35-requirements.md
index 9fb14f4..abfcee2 100644
--- a/content/30-requirements.md
+++ b/content/35-requirements.md
@@ -8,8 +8,10 @@
   - distribute work across multiple instances
 - use sharding mechanisms to resolve limitations and fulfill requirements of horizontally scalable controllers
 - refer to scalability definition ([@sec:kubernetes-scalability]): once work can distributed, adding a new instance yields higher load capacity
-
-## Requirements
+- thesis augments requirements of study project and enhances existing design and implementation accordingly
+  - basic requirements are already fulfilled in study project
+  - study project still has scalability limitations
+  - based on these limitations, new requirements are added to resolve them
 
 Refer to requirements from study project:
 [@studyproject]
@@ -38,26 +40,3 @@ Extended requirements:
   - e.g., introduce external dependencies/infrastructure like event queue or message broker
   - brings additional operational complexity, decreases comprehensibility, makes it harder to reason about
   - conflicts with req. 6: external dependencies make it harder to reuse in arbitrary controllers
-
-## Required Actions/Events
-
-\todo[inline]{find a good name for this}
-
-Precisely define the actions that the sharding mechanism needs to perform on which events:
-
-- evt. 1: new object is created or object is drained (drain and shard label are not present)
-  - object is unassigned, assign directly
-  - if no shard is available, no assignment is performed (handled later on by action 2)
-
-- evt. 2: new shard becomes available
-  - determine objects that should be assigned to new shard
-  - if object is not assigned yet, assign directly
-  - if object is assigned to unavailable shard, assign directly
-  - if object is assigned to available shard, drain object
-
-- evt. 3: existing shard becomes unavailable
-  - determine objects that are assigned to shard
-  - assign all objects to another shard directly
-  - if no shard is available, unassign objects OR no assignment is performed? (handled by action 2)
-
-\todo[inline]{make event descriptions generic, eliminate implementation-specifics}
diff --git a/content/40-design.md b/content/40-design.md
index f1e17c7..07b33cc 100644
--- a/content/40-design.md
+++ b/content/40-design.md
@@ -3,38 +3,52 @@
 - design based on study project
 - evolve design to address extended requirements
 
-## Overview
+## Sharding Events
+
+Analyze design of study project: which events are handled by the sharding mechanism.
+Which actions need to be performed for them:
+
+- evt. 1: new object is created or object is drained (drain and shard label are not present)
+  - object is unassigned, assign directly
+  - if no shard is available, no assignment is performed (handled later on by evt. 2)
 
-Secondary requirements (design/implementation-oriented) specific sub-requirements to fulfill req. 7:
-\todo[inline]{find a good spot for this}
+- evt. 2: new shard becomes available
+  - determine objects that should be assigned to new shard
+  - if object is not assigned yet, assign directly
+  - if object is assigned to unavailable shard, assign directly
+  - if object is assigned to available shard, drain object
 
-- reduce memory overhead by sharder
-  - eliminate cache for sharded objects (grows with the number of sharded objects)
-  - object cache was only needed to detect evt. 1
-  - find different mechanism to trigger assignments
-- reduce API request volume caused by assignments and coordination
-  - during creation: two requests are used for creation and initial assignment
-  - during drain: three requests are used for starting drain, acknowledging drain, and reassignment
-- non-goal: reduce API request volume of membership and failure detection
-  - keep lease-based membership
+- evt. 3: existing shard becomes unavailable
+  - determine objects that are assigned to shard
+  - assign all objects to another shard directly
+  - if no shard is available, unassign objects OR no assignment is performed? (handled by evt. 2)
+
+## Overview
 
 How to address extended requirements:
 
-- generalization: independent from controller framework and programming language
+- generalization (req. 6): independent from controller framework and programming language
+  - addressed in step 1 ([@sec:design-external])
   - move partitioning, assignment, coordination logic to external sharder
   - design how to configure which objects should be sharded
-- reduce memory overhead by sharder
-  - consider required actions again
-  - find different triggers for action 1 than using watch events
-- reduce API request volume caused by assignments and coordination
-  - how to persist assignments efficiently? -> make assignments transparent without persistence?
-  - is also reduced with slot-based assignments
-
-## Step 1: External Sharder
+- constant overhead (req. 7): required design/implementation enhancements:
+  - addressed in step 2 ([@sec:design-admission])
+  - reduce memory overhead by sharder
+    - eliminate cache for sharded objects (grows with the number of sharded objects)
+    - consider required actions again
+    - object cache was only needed to detect evt. 1
+    - find different mechanism to trigger assignments
+  - reduce API request volume caused by assignments and coordination
+    - during creation: two requests are used for creation and initial assignment
+    - during drain: three requests are used for starting drain, acknowledging drain, and reassignment
+  - non-goal: reduce API request volume of membership and failure detection
+    - keep lease-based membership
+
+## External Sharder {#sec:design-external}
 
 Goals:
 
-- generalization (address req. 6)
+- address req. 6: generalization
 - will not reduce CPU/mem overhead, only move it to an external component
 - will not reduce API request volume
 
@@ -52,39 +66,38 @@ Problems:
 - controller-side still has to comply with drain label
   - must only be implemented once in the controller framework, is acceptable
 
-## Step 2: Assignments in Admission
+## Assignments in Admission {#sec:design-admission}
 
 Goals:
 
-- address req. 7
+- address req. 7: constant overhead
 - reduce CPU/mem overhead
 - reduce API request volume
 
 Ideas:
 
-- move sharder controllers to controller manager or dedicated components
 - shard labels are added to objects during admission: either in admission plugin or webhook
 - when ring state changes, controller triggers reassignment or drain on all relevant objects
-- admission handles action 1 (new object or object drained)
+- admission handles event 1 (new object or object drained)
   - handles object-related events, that can be detected solely by mutating API requests to objects
   - currently, watch events (~cache) for the sharded objects are used for this
   - with assignments in admission, watches and caches can be dropped
   - webhook adds significant latency to mutating API requests
     - only needs to act on unassigned objects -> add object selector
-- controller handles action 2 and 3 (ring state changes)
+- controller handles events 2 and 3 (ring state changes)
   - handles ring-related events, that can be detected solely by watch events for leases
   - sharder controller doesn't need to watch objects, only needs watch for leases
-  - action 2 (new shard)
+  - event 2 (new shard)
     - list all objects and determine desired shard
     - add drain label to all objects that are not assigned to desired shard
-  - action 3 (dead shard)
+  - event 3 (dead shard)
     - list all objects assigned to dead shard
     - reassign all objects immediately
   - controller might interfere with itself (might act on a single object concurrently) -> use optimistic locking for all object mutations
 - controller and admission view on ring state could be slightly out of date
   - objects might end up on "wrong" shards
-    - action 1: new shard just got ready, not observed by admission, new object incorrectly assigned to another shard
-    - action 2: sharder drains object, controller removes drain/shard label, admission assigns to the same shard again
+    - event 1: new shard just got ready, not observed by admission, new object incorrectly assigned to another shard
+    - event 2: sharder drains object, controller removes drain/shard label, admission assigns to the same shard again
   - might be acceptable
     - objects are only assigned to available shards
     - single responsible shard is guaranteed
@@ -100,57 +113,3 @@ Summary:
 - trades resource overhead (object cache) for a few API requests (lists) and latency (webhook)
 - latency can be reduced with object selector and/or by moving to admission plugin
 - reduces API request volume a bit because drain and new assignment are now combined into a single API request
-
-<!--
-## Approach 1: Transparent Assignments
-
-Goals:
-
-- reduce CPU/mem overhead
-- reduce API request volume
-
-Ideas:
-
-- move lease controller to controller manager as in step 1
-- teach API server to calculate assignments in watch cache, piggy-back on caches -> reduce resource overhead
-- make assignments transparent, don't persist in etcd -> reduce API request volume
-  - no resource version bumps, no watch events!?
-  - how are are watch events triggered on assignment changes?
-    - investigate how CR of CRDs handle this
-    - custom resource watch terminates when CRD spec/schema changes
-    - terminating the watch connection would cause a re-list
-    - terminating watches on assignment changes is not enough
-      - controller will restart the watch with the last observed resource version
-      - without bumps to resource version, there will be no new watch events
-      - we still can't be sure if the controller observed the change
-- preventing concurrency: how is drain handled?
-  - reassignment could send a `DELETE` event just like a label change on watches with label selector
-  - API servers need to ensure that the client observed the change
-  - client sends assignment label back to API server in patch/update request as prerequisite
-    - request is rejected with a conflict error if assignment doesn't match (similar to optimistic locking)
-    - doesn't work on owned objects
-- assignments and coordination must be consistent across API server instances
--->
-
-<!--
-## Approach 3: Slot-based Assignments
-
-Goals:
-
-- reduce API request volume
-- reduce CPU/mem overhead?
-
-Ideas:
-
-- move all sharder controllers to controller manager
-- assign objects in bulk like in redis
-- don't persist assignments on object labels
-- persist slot assignments on leases?
-- preventing concurrency requires controllers to look up slot assignments in object reconciliation
-- hard to achieve consistency with this?
-- watch per slot? -> significant implementation effort in every language/framework
-- lease per slot
-  - too high request volume for coordination (ref knative)
-- shard needs to acknowledge slot movement
-  - include some kind of observed generation number of assignments in regular lease updates
--->