Reconsider xDS API lifecycle clock #10852
Comments
+1 to the sentiment. Major version bumps in APIs are costly for the entire ecosystem. If there is a capability to use "experimental" tags to allow new features to be exempted from backward compatibility guarantees, then you should almost never need to release a new major version. You call Envoy "fast moving", yet it has been GA for 3.5 years and is used by many major companies for critical services - at what point will it be considered "mature", and plan for no major version bumps?
I'm also in favor of not imposing the overhead of a major version bump without compelling justification. It's become clear that this change is extremely expensive. I agree with @dfawley that in principle, we should never need to bump the major version number to add new features, because new features can instead be triggered by client capabilities. So it seems like the only reason we would ever have to bump the major version would be to eliminate deprecated fields, so that management servers can stop supporting old fields. And for that, we could just wait until enough deprecated fields have built up that it makes sense to get rid of a whole bunch of them at a time. @ejona86 may also have thoughts here.
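To make the client-capabilities idea concrete, here is a minimal sketch of a management server that gates a new feature on what the client advertises, instead of on a major version. Everything in it is hypothetical: the capability string, the config field names, and the `build_config` helper are illustrative and not taken from the xDS API.

```python
from typing import Dict, Iterable

# Hypothetical capability name; real clients would advertise their own strings.
NEW_RETRY_POLICY = "example.supports_new_retry_policy"


def build_config(client_capabilities: Iterable[str]) -> Dict[str, object]:
    """Return a config that only uses features the client says it supports."""
    capabilities = set(client_capabilities)

    # Baseline fields that every client understands.
    config: Dict[str, object] = {"timeout_ms": 500}

    if NEW_RETRY_POLICY in capabilities:
        # New feature: sent only to clients that opted in.
        config["retry_policy"] = {"max_attempts": 3}
    else:
        # Older clients keep receiving the legacy (deprecated) field.
        config["legacy_retries"] = 3

    return config


old_client = build_config([])
new_client = build_config([NEW_RETRY_POLICY])
```

With this shape, adding a feature never requires a new major version; a major version is only needed once the server wants to stop emitting `legacy_retries` altogether.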
Agreed, though I think at this point we should agree as a community to push through with v2 -> v3 since we don't know what we don't know until we do it once, and arguably the longer we wait to try this the more painful it will get.
+1 this is my thinking as well.
I'm not convinced that it actually makes sense to go through with the v2 -> v3 migration right now. I agree that we would learn something from the exercise, but until we actually need to bump the version for some reason, it seems like this is an awful lot of work just to gain theoretical knowledge. It seems better to wait until we actually need to make the change, because then it will be much easier to devote the resources to make it happen.
If this is an opinion held by many, I think we need to urgently discuss this, as a lot of plans would need to change and be communicated.
(Also note that if we abandon the v3 force upgrade we need to actually go back and backfill recent API changes that have been v3 only, so again this needs urgent attention if there is going to be a change here in the POR)
@markdroth are you talking about for gRPC or Envoy? I think it would be a major disruption to back out from v3 now in Envoy, since:
This should be tempered with the real cost of the v3 migration, but I think this is the wrong point in an API major version shift to be debating this. Ideally this happens at the start.
I agree that there are a lot of implications here and that we need to resolve this quickly, and I'm not dead-set against continuing the migration to v3 if that's still the right course of action. We'll have some offline discussions and try to come to consensus on this.
After extended discussion in envoyproxy#10852, Slack and offline, this patch proposes a revision to the API major versioning policy where we will:
* Not mechanically cut a new major version at EOY, instead wait for enough tech debt.
* Encourage the use of client feature capabilities as an alternative to manage client feature support.
Fixes envoyproxy#10852.
Signed-off-by: Harvey Tuch <[email protected]>

After extended discussion in #10852, Slack and offline, this patch proposes a revision to the API major versioning policy where we will:
* Not mechanically cut a new major version at EOY, instead wait for enough tech debt.
* Point to future minor versioning and client capabilities to help deal with tech debt.
Fixes #10852.
Signed-off-by: Harvey Tuch <[email protected]>
The move to the v3 APIs has exposed some pain points that control plane authors are facing. It's not cheap to switch major versions, particularly without some of the generic API migration tooling that we have developed inside of Envoy.
The existing API lifecycle is described at https://github.com/envoyproxy/envoy/blob/master/api/API_VERSIONING.md. We cut a new major version every year, turn down an old one each year and any major version lives at most 2 years.
This means that any technical debt in API and associated code can live up to 2 years. If we change the clock cycle to every 2 years for a new major version, we might end up having technical debt live 3 or 4 years. This seems pretty significant for a fast moving project like Envoy.
@mattklein123 has proposed that we postpone cutting a new version until we have enough technical debt built up. Arguably we could keep the existing lifecycle with that approach, but instead of cutting a new API exactly every year, we have >= 1 year between major versions and have Envoy maintainers vote on whether to cut a new major version at every quarter after the year mark. This gives us flexibility in cutting new major versions, potentially longer periods before inflicting cost on control plane authors, stability guarantees to other xDS clients, and a predictable deprecation cycle, while leaving technical debt management under the control of maintainers.
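As a rough illustration of that clock, the sketch below computes the earliest date a new major version could be cut and the quarterly vote dates that follow. It assumes the first vote falls on the one-year mark itself and that a "quarter" means three calendar months; both are assumptions, and the function names are hypothetical rather than part of any Envoy tooling.

```python
import calendar
from datetime import date
from typing import List


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the month's length."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def earliest_cut(last_cut: date) -> date:
    """A new major version may be cut no sooner than one year after the last."""
    return add_months(last_cut, 12)


def vote_dates(last_cut: date, count: int) -> List[date]:
    """Quarterly maintainer votes, starting at the one-year mark (assumed)."""
    return [add_months(last_cut, 12 + 3 * i) for i in range(count)]


def may_cut(last_cut: date, today: date) -> bool:
    """True once the minimum one-year gap between major versions has passed."""
    return today >= earliest_cut(last_cut)


# Example with a made-up cut date for the previous major version.
votes = vote_dates(date(2020, 1, 1), 3)
```

Each vote can decline to cut a version, so the gap between major versions is at least one year but has no fixed upper bound.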
This issue can track the discussion; I will add it to the upcoming community call agenda.
CC @mattklein123 @envoyproxy/api-shepherds @alyssawilk @markdroth @dfawley