-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Readiness doesn't account for metadata.observedGeneration #3474
Comments
@tombentley and others, I don't quite follow what is needed here. The current Readiness check is just a point in time check - there's no expectation that it will wait for a controller to act. Can you provide additional context about where you are using the readiness - and if you need that to act over time could that be satisfied instead with waitUntilReady or waitUntilCondition? |
@shawkins sorry, your reply got lost in my inbox... The problem is a resource might:
Kubernetes uses the status.observedGeneration to provide a way of telling if the status is out of date with respect to the spec (not all built-in resources have an Fabric8's |
Sorry, I think most of my confusion is in terms of marking this as a bug. It sounds more like you are looking for an enhancement or should be implemented with your own condition. To make sure we're on the same page there are three states you are interested in:
However this additional state may not be meaningful depending on your design. If you are instead reacting to the events from the resource in question, you will get an event once it updates the status based upon the spec modification. That new event will cause you to re-evaluate the readiness. The difficulty with putting something like this into the Readiness utility is:
This issue could be highly related to #2718 - in that making the Readiness check pluggable might give you the dsl support you want. |
@shawkins I guess I'm wondering why, for those resource kinds where But then I don't really think of readiness as a three-valued quality. To me, a resource is only ready when The fact that |
It's not a false positive - readiness in general is independent of spec generations. For example when a resource provides a status condition for readiness, there is no guideline that a user should also update that status condition when they modify the spec. The resource remains in its given readiness state until the controller determines a change is needed - and just because there is a spec modification doesn't mean that there will be a transition. I'm also trying to convey that from an eventually consistent perspective it shouldn't matter if there are possible pending transitions in readiness or other checks - you know that it will be re-evaluated once the appropriate event is processed or the next timed reconciliation loop occurs. @manusa @rohanKanojia can you weigh in here in case I'm off base? Can you elaborate on how your logic needs to handle this situation? What does your logic do if something is not ready? And if it would be sufficient to support #2718 to allow for extensions / customizations to the Readiness checks?
I agree that observedGeneration should be handled in fabric8 - we need to get with the operator sdk folks on a shared solution. I'm just not sure about how or whether it belongs in the readiness check. |
In our use case we've just updated the resource and we want to know that the change has taken effect, and the resource is still ready after the change has been applied, before proceeding with our operator logic. We initially just checked for the ready status, but that was racy for precisely the reasons discussed here, so we added our own check against the observed generation (e.g. strimzi/strimzi-kafka-operator#1814), and it was when we were looking at refactoring that in strimzi/strimzi-kafka-operator#5525 that I started with wonder why we were having to do this ourselves. I take your point about whether a change which doesn't actually change the readiness should be counted as a transition. For our purposes we don't care -- it's the eventual state that we're waiting for, but other use cases might differ. |
From my point of view I agree with this statement:
I'm not sure about this, but it's my understanding that there's no standard defining the "readiness" of a specific resource. This is a subjective concept. So IMHO a resource is ready when the subjective condition is matched and the status where that condition is checked is consistent with the current generation/version/lock of the resource (reconciled). (again I'm assuming there's no spec defining this). |
After a meeting with @shawkins the following point was clarified:
A spec change will imply a status change once the controller processes the resource. As stated, the fact that the spec has changed doesn't imply that the resource is unready (maybe the change doesn't affect the readiness condition for that given resource). It's only after the controller has acted that we should then verify the readiness (stale status !== unready state). As a future step, in order to provide better support for the observedGeneration field, we could add the interface to our Status classes. The Golang generators could analyze the status structs and mark the status with the new interface in case the field exists. This would be helpful for downstream projects (e.g. JOSDK #563 #628) (@csviri @metacosm) |
Readiness is a tricky issue and I have been trying to figure out what the best practices are around the topic at some point and promptly gave up because the information I could find was contradictory or outdated… I have read https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#typical-status-properties but these have changed over time so it's difficult to get an accurate picture of what people are supposed to do… then you see this: kubernetes/community#4521 and comments such as this one: https://github.com/kubernetes/community/pull/4521/files#r413087880… in essence, if you're not following the issue closely, good luck figuring things out :) |
This issue has been automatically marked as stale because it has not had any activity since 90 days. It will be closed if no further activity occurs within 7 days. Thank you for your contributions! |
Readiness
andOpenshiftReadiness
don't check, when using thestatus
of resources, whether thestatus.observedGeneration==metadata.generation
. Therefore there is the risk that, if the resource is updated and then readiness-checked before a controller is able to act on the update, the readiness will be evaluated incorrectly.The text was updated successfully, but these errors were encountered: