Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

States, transitions, and conditions #1658

Closed
vincepri opened this issue Oct 25, 2019 · 10 comments
Closed

States, transitions, and conditions #1658

vincepri opened this issue Oct 25, 2019 · 10 comments
Assignees
Labels
area/api Issues or PRs related to the APIs kind/proposal Issues or PRs related to proposals. lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Milestone

Comments

@vincepri
Copy link
Member

vincepri commented Oct 25, 2019

Opening this issue as a placeholder, to be filled as time goes by. We know that the current model has gaps in terms of Status.Phase field, its use and shortcomings. We'll be tracking progress towards a proposal and better model within this issue. Feel free to add more information, comments, concerns.

/kind proposal
/milestone Next
/priority important-longterm
/assign @timothysc

@k8s-ci-robot k8s-ci-robot added this to the Next milestone Oct 25, 2019
@k8s-ci-robot k8s-ci-robot added kind/proposal Issues or PRs related to proposals. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Oct 25, 2019
@detiber
Copy link
Member

detiber commented Nov 7, 2019

Some controversy over lists of named subobjects vs use of maps

@jayunit100
Copy link
Contributor

jayunit100 commented Jan 23, 2020

Just getting into CAPI internals so could be off - but - as an interim solution with long term dashboard value --- Maybe for now just putting the right metrics for API calls in place will be a good enough workaround ---

  • labels for the providers based on infrastructureRef will give you provider specific states
  • labels such as still_awaiting_cloud_controller
  • update docs to say 'if your cluster isnt converging - curl localhost:9090 | grep cloud_controller and look at the numbers'

If this issue is adequetely resourced, then the status/phase implementation is also great... just wanted to make sure i offered up a lightweight workaround in case this might take a while.

@evankanderson
Copy link

evankanderson commented Feb 11, 2020

Some drive-by comments (feel free to ping me at evana at vmware.com if I don't notice updates here):

We spent a long time (about 4 weeks) working on Condition design with UX designers and tooling engineers. The end result is specified here, but I'll also give some useful summaries and a link to the library we wrote:

https://knative.dev/docs/serving/spec/knative-api-specification-1.0/#error-signalling

Some of the lessons we learned:

  • Having a consistent polarity for Conditions (i.e. "True = good" or "False = good") is really helpful for tooling, particularly if you want the server side to be able to add or change conditions without needing to go through and coordinate with every single tool that works with the API.
    • It's important to name the conditions so that it's clear what "True" and "False" means. Sometimes this means you need to be creative, i.e. "ResourcesAvailable" even if you're really thinking of it as "QuotaExhausted" (and definitely don't call the condition "Quota" without indicating what True and False mean).
  • It's really useful to have an agreed-on top-level condition that summarizes all the other conditions (again, it simplifies work for tool authors). In Knative, we defined two top-level conditions. Each resource should have one or the other of these conditions (and not both):
    • Ready, for resources which represent an ongoing status, like a Deployment
    • Succeeded, for resources which run-to-completion, like a Job
  • Every resource except data-only objects which aren't reconciled (like ConfigMap) must have conditions. This avoids situations like NetworkPolicy, where you can't tell if the cluster is actually able to support the NetworkPolicy based on the CNI plugin used.
  • Our experience with actual users suggested that the "True = good" polarity was read more naturally than "False = good". In particular, "FooFailed = False" tended to be read as a problem or a negative situation, rather than a successful state. This is particularly an issue for users in stressful situations (new users, live demos, handling production outages, etc).
  • Knative also added a "severity" field to conditions to handle providing warnings. Note that the default severity was "Error", which we ended up spelling as "" (empty string). See the previous point for the rationale. (Multiple senior PMs and engineers were thrown off by:
    conditions:
    - type=Ready
      severity=Error
      status=True
    Severity may be overkill for your needs; we found some cases where it was helpful to be able to pass back information in conditions that didn't indicate an error.

Library implementing consistent ("True = good") polarity, condition aggregation to common top-level: https://godoc.org/github.com/knative/pkg/apis#ConditionSet.Manage

@vincepri
Copy link
Member Author

Thank you so much @evankanderson for sharing your experience with conditions! After v1alpha3 is wrapped up, we'll start working on a proposal, the context you provided is going to be super helpful.

@vincepri
Copy link
Member Author

vincepri commented Apr 1, 2020

/milestone v0.3.x

@k8s-ci-robot k8s-ci-robot modified the milestones: Next, v0.3.x Apr 1, 2020
@vincepri
Copy link
Member Author

/assign @fabriziopandini
/lifecycle active
/area api

@k8s-ci-robot k8s-ci-robot added lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. area/api Issues or PRs related to the APIs labels Apr 20, 2020
@vincepri
Copy link
Member Author

As per April 27th 2020 community guidelines, this project follows the process outlined in https://github.com/kubernetes-sigs/cluster-api/blob/master/CONTRIBUTING.md#proposal-process-caep for large features or changes.

Following those guidelines, I'll go ahead and close this issue for now and defer to contributors interested in pushing the proposal forward to open a collaborative document proposal instead, and follow the process as described.

/close

@k8s-ci-robot
Copy link
Contributor

@vincepri: Closing this issue.

In response to this:

As per April 27th 2020 community guidelines, this project follows the process outlined in https://github.com/kubernetes-sigs/cluster-api/blob/master/CONTRIBUTING.md#proposal-process-caep for large features or changes.

Following those guidelines, I'll go ahead and close this issue for now and defer to contributors interested in pushing the proposal forward to open a collaborative document proposal instead, and follow the process as described.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@enxebre
Copy link
Member

enxebre commented May 5, 2020

@vincepri @ncdc This is linked in the roadmap under "Improved status conditions" https://github.com/kubernetes-sigs/cluster-api/blob/master/docs/book/src/roadmap.md

Based on #1658 (comment)
is there a doc related to this? if so is there a place where I can find the link to that doc? I'd expect this ticket to be that place so the roadmap does not lead us to a dead end.

My intention was to add a relevant ref kubernetes/enhancements#1624 to this topic.

@fabriziopandini
Copy link
Member

new tracking issue for the CAEP is #3005
@enxebre the Kuberenetes KEP is already in the radar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api Issues or PRs related to the APIs kind/proposal Issues or PRs related to proposals. lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Projects
None yet
Development

No branches or pull requests

8 participants