From 3a2c17b4956e91d7ec08faeaf6413890ac0e1254 Mon Sep 17 00:00:00 2001 From: Evan Anderson Date: Fri, 14 Feb 2020 14:34:52 -0800 Subject: [PATCH 1/7] Update Condition guidance --- .../devel/sig-architecture/api-conventions.md | 62 ++++++++++++------- 1 file changed, 41 insertions(+), 21 deletions(-) diff --git a/contributors/devel/sig-architecture/api-conventions.md b/contributors/devel/sig-architecture/api-conventions.md index fff28097694..578046b5c63 100644 --- a/contributors/devel/sig-architecture/api-conventions.md +++ b/contributors/devel/sig-architecture/api-conventions.md @@ -308,17 +308,47 @@ response reduces the complexity of these clients. ##### Typical status properties -**Conditions** represent the latest available observations of an object's -state. They are an extension mechanism intended to be used when the details of -an observation are not a priori known or would not apply to all instances of a -given Kind. For observations that are well known and apply to all instances, a -regular field is preferred. An example of a Condition that probably should -have been a regular field is Pod's "Ready" condition - it is managed by core -controllers, it is well understood, and it applies to all Pods. +**Conditions** provide a standard mechanism for higher-level status reporting +from a controller. They are an extension mechanism which allows tools and other +controllers to collect summary information about resources without needing to +understand resource-specific status details. Conditions should complement more +detailed information about the observed status of an object written by a +controller, rather than replace it. For example, the "Available" condition of a +Deployment can be determined by examining `readyReplicas`, `replicas`, and +other properties of the Deployment. However, the "Available" condition allows +other components to avoid duplicating the availability logic in the Deployment +controller. Objects may report multiple conditions, and new types of conditions may be added in the future or by 3rd party controllers. Therefore, conditions are -represented using a list/slice, where all have similar structure. +represented using a list/slice of objects, where each condition has a similar +structure. Conditions are most useful when they follow some consistent +conventions: + +* Conditions should be added to explicitly convey properties that users and + components care about rather than requiring those properties to be inferred + from other observations. Once defined, the meaning of a Condition can not be + changed arbitrarily - it becomes part of the API, and has the same backwards- + and forwards-compatibility concerns of any other part of the API. + +* Conditions should have a consistent polarity. **A "positive" polarity where + "True" indicates normal operation and "False" indicates a failure is + recommended.** (Note that this is an explicit reversal from earlier + recommendations of "abnormal-true" polarity. Experience indicates that humans + are better at interpreting the "positive" polarity.) + + * Condition names should clearly indicate what "True" and "False" mean. This + may require adjusting condition type names, for example changing the type + name from "QuotaExhausted" to "ResourcesAvailable". + + * The `Unknown` status is the default, and typically indicates that + reconciliation has not yet finished (or that the resource state may not + yet be observable). + +* It's helpful to have a common top-level condition which summarizes more + detailed conditions. Simple consumers may simply query the top-level + condition. The `Ready` and `Succeeded` condition types may be used for + long-running and bounded-execution objects, respectively. The `FooCondition` type for some resource type `Foo` may include a subset of the following fields, but must contain at least `type` and `status` fields: @@ -347,20 +377,10 @@ Use of the `Reason` field is encouraged. Use the `LastHeartbeatTime` with great caution - frequent changes to this field can cause a large fan-out effect for some resources. -Conditions should be added to explicitly convey properties that users and -components care about rather than requiring those properties to be inferred from -other observations. Once defined, the meaning of a Condition can not be -changed arbitrarily - it becomes part of the API, and has the same backwards- -and forwards-compatibility concerns of any other part of the API. - Condition status values may be `True`, `False`, or `Unknown`. The absence of a condition should be interpreted the same as `Unknown`. How controllers handle `Unknown` depends on the Condition in question. -Condition types should indicate state in the "abnormal-true" polarity. For -example, if the condition indicates when a policy is invalid, the "is valid" -case is probably the norm, so the condition should be called "Invalid". - The thinking around conditions has evolved over time, so there are several non-normative examples in wide use. @@ -374,9 +394,9 @@ and should assume an Open World. An example of an oscillating condition type is `Ready` (despite it running afoul of current guidance), which indicates the object was believed to be fully operational at the time it was last probed. A possible monotonic condition -could be `Failed`. A `True` status for `Failed` would imply failure with no -retry. An object that was still active would generally not have a `Failed` -condition. +could be `Succeeded`. A `True` status for `Succeeded` would imply completion +and that the resource was no longer active. An object that was still active +would generally not have a `Succeeded` condition. Some resources in the v1 API contain fields called **`phase`**, and associated `message`, `reason`, and other status fields. The pattern of using `phase` is From 5cab966cc53280e68213b428d1a146dc10e2b470 Mon Sep 17 00:00:00 2001 From: Evan Anderson Date: Fri, 27 Mar 2020 14:41:41 -0700 Subject: [PATCH 2/7] Drop positive-polarity exception for Ready --- .../devel/sig-architecture/api-conventions.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/contributors/devel/sig-architecture/api-conventions.md b/contributors/devel/sig-architecture/api-conventions.md index 578046b5c63..986dfd74575 100644 --- a/contributors/devel/sig-architecture/api-conventions.md +++ b/contributors/devel/sig-architecture/api-conventions.md @@ -391,12 +391,12 @@ we define comprehensive state machines for objects, nor behaviors associated with state transitions. The system is level-based rather than edge-triggered, and should assume an Open World. -An example of an oscillating condition type is `Ready` (despite it running -afoul of current guidance), which indicates the object was believed to be fully -operational at the time it was last probed. A possible monotonic condition -could be `Succeeded`. A `True` status for `Succeeded` would imply completion -and that the resource was no longer active. An object that was still active -would generally not have a `Succeeded` condition. +An example of an oscillating condition type is `Ready`, which indicates the +object was believed to be fully operational at the time it was last probed. A +possible monotonic condition could be `Succeeded`. A `True` status for +`Succeeded` would imply completion and that the resource was no longer +active. An object that was still active would generally not have a `Succeeded` +condition. Some resources in the v1 API contain fields called **`phase`**, and associated `message`, `reason`, and other status fields. The pattern of using `phase` is From 49d3a5aad7d2d98ecf5272301f7c2021c0cfec56 Mon Sep 17 00:00:00 2001 From: Evan Anderson Date: Fri, 17 Apr 2020 15:14:26 -0700 Subject: [PATCH 3/7] Clarify language thanks to liggitt --- .../devel/sig-architecture/api-conventions.md | 26 ++++++++++--------- 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/contributors/devel/sig-architecture/api-conventions.md b/contributors/devel/sig-architecture/api-conventions.md index 986dfd74575..535edac6663 100644 --- a/contributors/devel/sig-architecture/api-conventions.md +++ b/contributors/devel/sig-architecture/api-conventions.md @@ -331,24 +331,26 @@ conventions: changed arbitrarily - it becomes part of the API, and has the same backwards- and forwards-compatibility concerns of any other part of the API. -* Conditions should have a consistent polarity. **A "positive" polarity where - "True" indicates normal operation and "False" indicates a failure is - recommended.** (Note that this is an explicit reversal from earlier - recommendations of "abnormal-true" polarity. Experience indicates that humans - are better at interpreting the "positive" polarity.) +* Within a resource, Conditions should have a consistent polarity. **A + "positive" polarity where "True" indicates normal operation and "False" + indicates a failure is recommended if there is not an established polarity.** + (Note that this is an explicit reversal from earlier recommendations of + "abnormal-true" polarity. Experience indicates that humans are better at + interpreting the "positive" polarity.) * Condition names should clearly indicate what "True" and "False" mean. This may require adjusting condition type names, for example changing the type name from "QuotaExhausted" to "ResourcesAvailable". - * The `Unknown` status is the default, and typically indicates that - reconciliation has not yet finished (or that the resource state may not - yet be observable). + * The absence of a condition status should be interpreted the same as + `Unknown`, and typically indicates that reconciliation has not yet finished + (or that the resource state may not yet be observable). -* It's helpful to have a common top-level condition which summarizes more - detailed conditions. Simple consumers may simply query the top-level - condition. The `Ready` and `Succeeded` condition types may be used for - long-running and bounded-execution objects, respectively. +* When designing Conditions for a resource, it's helpful to have a common + top-level condition which summarizes more detailed conditions. Simple + consumers may simply query the top-level condition. Although they are not a + consistent standard, the `Ready` and `Succeeded` condition types may be used + by API designers for long-running and bounded-execution objects, respectively. The `FooCondition` type for some resource type `Foo` may include a subset of the following fields, but must contain at least `type` and `status` fields: From 5f972cff41c67e4c60528ab409ae22510ceebb52 Mon Sep 17 00:00:00 2001 From: Evan Anderson Date: Thu, 25 Jun 2020 23:48:12 -0700 Subject: [PATCH 4/7] Updated with general consensus around resource naming, removed positive-polarity preference --- .../devel/sig-architecture/api-conventions.md | 57 +++++++++++++------ 1 file changed, 39 insertions(+), 18 deletions(-) diff --git a/contributors/devel/sig-architecture/api-conventions.md b/contributors/devel/sig-architecture/api-conventions.md index 535edac6663..119746589df 100644 --- a/contributors/devel/sig-architecture/api-conventions.md +++ b/contributors/devel/sig-architecture/api-conventions.md @@ -322,8 +322,9 @@ controller. Objects may report multiple conditions, and new types of conditions may be added in the future or by 3rd party controllers. Therefore, conditions are represented using a list/slice of objects, where each condition has a similar -structure. Conditions are most useful when they follow some consistent -conventions: +structure. This collected should be treated as a map with a key of `type`. + +Conditions are most useful when they follow some consistent conventions: * Conditions should be added to explicitly convey properties that users and components care about rather than requiring those properties to be inferred @@ -331,20 +332,37 @@ conventions: changed arbitrarily - it becomes part of the API, and has the same backwards- and forwards-compatibility concerns of any other part of the API. -* Within a resource, Conditions should have a consistent polarity. **A - "positive" polarity where "True" indicates normal operation and "False" - indicates a failure is recommended if there is not an established polarity.** - (Note that this is an explicit reversal from earlier recommendations of - "abnormal-true" polarity. Experience indicates that humans are better at - interpreting the "positive" polarity.) - - * Condition names should clearly indicate what "True" and "False" mean. This - may require adjusting condition type names, for example changing the type - name from "QuotaExhausted" to "ResourcesAvailable". - - * The absence of a condition status should be interpreted the same as - `Unknown`, and typically indicates that reconciliation has not yet finished - (or that the resource state may not yet be observable). +* Controllers should apply their conditions to a resource the first time they + visit the resource, even if the `status` is Unknown. This allows other + components in the system to know that the condition exists and the controller + is making progress on reconciling that resource. + + * Not all controllers will observe the previous advice about reporting + "Unknown" or "False" values. For known conditions, the absence of a + condition status should be interpreted the same as `Unknown`, and + typically indicates that reconciliation has not yet finished (or that the + resource state may not yet be observable). + +* For some conditions, `True` represents normal operation, and for some + conditions, `False` represents normal operation. ("Normal-true" conditions + are sometimes said to have "positive polarity", and "normal-false" conditions + are said to have "negative polarity".) Without further knowledge of the + conditions, it is not possible to compute a generic summary of the conditions + on a resource. + +* Condition type names should make sense for humans; neither positive nor + negative polarity can be recommended as a general rule. A negative condition + like "MemoryExhausted" may be easier for humans to understand than + "SufficientMemory". Conversely, "Ready" or "Succeeded" may be easier to + understand than "Failed", because "Failed=Unknown" or "Failed=False" may + cause double-negative confusion. + +* Condition type names should describe the current observed state of the + resource, rather than describing the current state transitions. This + typically means that the name should be an adjective ("Ready", "OutOfDisk") + or a past-tense verb ("Succeeded", "Failed") rather than a present-tense verb + ("Deploying"). Intermediate states may be indicated by setting the status of + the condition to `Unknown`. * When designing Conditions for a resource, it's helpful to have a common top-level condition which summarizes more detailed conditions. Simple @@ -379,6 +397,9 @@ Use of the `Reason` field is encouraged. Use the `LastHeartbeatTime` with great caution - frequent changes to this field can cause a large fan-out effect for some resources. +Condition types should be named in PascalCase. Short condition names are +preferred (e.g. "Ready" over "MyResourceReady"). + Condition status values may be `True`, `False`, or `Unknown`. The absence of a condition should be interpreted the same as `Unknown`. How controllers handle `Unknown` depends on the Condition in question. @@ -397,8 +418,8 @@ An example of an oscillating condition type is `Ready`, which indicates the object was believed to be fully operational at the time it was last probed. A possible monotonic condition could be `Succeeded`. A `True` status for `Succeeded` would imply completion and that the resource was no longer -active. An object that was still active would generally not have a `Succeeded` -condition. +active. An object that was still active would generally have a `Succeeded` +condition with status `Unknown`. Some resources in the v1 API contain fields called **`phase`**, and associated `message`, `reason`, and other status fields. The pattern of using `phase` is From 305d8d8ac0e5f4785539170a012a63b290aea7f7 Mon Sep 17 00:00:00 2001 From: Evan Anderson Date: Fri, 3 Jul 2020 11:18:37 -0700 Subject: [PATCH 5/7] Fix typo --- contributors/devel/sig-architecture/api-conventions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contributors/devel/sig-architecture/api-conventions.md b/contributors/devel/sig-architecture/api-conventions.md index 119746589df..d8d4a195001 100644 --- a/contributors/devel/sig-architecture/api-conventions.md +++ b/contributors/devel/sig-architecture/api-conventions.md @@ -322,7 +322,7 @@ controller. Objects may report multiple conditions, and new types of conditions may be added in the future or by 3rd party controllers. Therefore, conditions are represented using a list/slice of objects, where each condition has a similar -structure. This collected should be treated as a map with a key of `type`. +structure. This collection should be treated as a map with a key of `type`. Conditions are most useful when they follow some consistent conventions: From dd082f4d2c418c748aad6a622671b2faed2707ad Mon Sep 17 00:00:00 2001 From: Evan Anderson Date: Thu, 17 Sep 2020 13:51:49 -0700 Subject: [PATCH 6/7] Add guidance on conditions for long-running reconciliations --- contributors/devel/sig-architecture/api-conventions.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/contributors/devel/sig-architecture/api-conventions.md b/contributors/devel/sig-architecture/api-conventions.md index d8d4a195001..0d42ff5edc1 100644 --- a/contributors/devel/sig-architecture/api-conventions.md +++ b/contributors/devel/sig-architecture/api-conventions.md @@ -364,6 +364,14 @@ Conditions are most useful when they follow some consistent conventions: ("Deploying"). Intermediate states may be indicated by setting the status of the condition to `Unknown`. + * For state transitions which take a long period of time, use a condition + which indicates that the transition succeeded ("Provisioned" or + "SizeAllocated") with an intermediate state of `Unknown`. Note that this + pattern makes an explicit distinction between "condition not set" (which + means the reconciler hasn't seen the latest state) and "condition set to + Unknown" (which means that the reconciler has seen the state but hasn't + finished the reconciliation). + * When designing Conditions for a resource, it's helpful to have a common top-level condition which summarizes more detailed conditions. Simple consumers may simply query the top-level condition. Although they are not a From 0a9d1ebe84b23e04367733066319097da45b3dab Mon Sep 17 00:00:00 2001 From: Evan Anderson Date: Wed, 23 Sep 2020 22:16:45 -0700 Subject: [PATCH 7/7] Adjust with suggested language from lavalamp --- .../devel/sig-architecture/api-conventions.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/contributors/devel/sig-architecture/api-conventions.md b/contributors/devel/sig-architecture/api-conventions.md index 0d42ff5edc1..93f16b62cd6 100644 --- a/contributors/devel/sig-architecture/api-conventions.md +++ b/contributors/devel/sig-architecture/api-conventions.md @@ -364,13 +364,15 @@ Conditions are most useful when they follow some consistent conventions: ("Deploying"). Intermediate states may be indicated by setting the status of the condition to `Unknown`. - * For state transitions which take a long period of time, use a condition - which indicates that the transition succeeded ("Provisioned" or - "SizeAllocated") with an intermediate state of `Unknown`. Note that this - pattern makes an explicit distinction between "condition not set" (which - means the reconciler hasn't seen the latest state) and "condition set to - Unknown" (which means that the reconciler has seen the state but hasn't - finished the reconciliation). + * For state transitions which take a long period of time (rule of thumb: > 1 + minute), it is reasonable to treat the transition itself as an observed + state. In these cases, the Condition (such as "Resizing") itself should not + be transient, and should instead be signalled using the + `True`/`False`/`Unknown` pattern. This allows other observers to determine + the last update from the controller, whether successful or failed. In cases + where the state transition is unable to complete and continued + reconciliation is not feasible, the Reason and Message should be used to + indicate that the transition failed. * When designing Conditions for a resource, it's helpful to have a common top-level condition which summarizes more detailed conditions. Simple