Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support mixed v2/v3 xDS request/responses #10776

Closed
htuch opened this issue Apr 14, 2020 · 18 comments
Closed

Support mixed v2/v3 xDS request/responses #10776

htuch opened this issue Apr 14, 2020 · 18 comments
Labels
api/v3 Major version release @ end of Q3 2019 area/docs help wanted Needs help! priority/high
Milestone

Comments

@htuch
Copy link
Member

htuch commented Apr 14, 2020

To better support rollouts where the control plane switches from v2 to v3, we should be able to support a DiscoveryResponse with a v3 resource to a DiscoveryRequest for a v2 previous version resource.

Envoy may already support this. We need to look at the behavior of gRPC mux watchers, there might need to be some modifications needed there. In addition, tests and documentation are required.

@htuch htuch added api/v3 Major version release @ end of Q3 2019 help wanted Needs help! labels Apr 14, 2020
@mattklein123 mattklein123 added this to the 1.15.0 milestone Apr 14, 2020
@euroelessar
Copy link
Contributor

Will it be a mandatory part of the spec for other data planes, e.g. gRPC?

@htuch
Copy link
Member Author

htuch commented Apr 17, 2020

@markdroth thoughts on #10776 (comment)?

@htuch
Copy link
Member Author

htuch commented Apr 17, 2020

FWIW, I think to make this sane, the client would have to NACK if it didn't know about v3. I believe this is what would happen if you tried this with an older Envoy today, since it would be receiving the wrong typed resource.

@markdroth
Copy link
Contributor

@ejona86, @dfawley, and I are still in the early stages of discussing how gRPC will migrate to v3, so I don't have any concrete answers for you yet.

My off-the-cuff reaction is that it seems very strange to allow a management server to respond with what is fundamentally a different resource type (because the type_url is different) than the one requested. Wouldn't it be simpler to just change the clients to request the new versions?

@jyotimahapatra
Copy link
Contributor

jyotimahapatra commented Apr 17, 2020

I tried this with a go-control-plane integration test envoyproxy/go-control-plane#290 and looks like envoy rejects the DiscoveryResponse if the typeUrl does not match the request typeurl.
The PR sends mixed responses for v2/v3 resources.

error: https://github.com/envoyproxy/envoy/blob/master/source/common/config/grpc_mux_impl.cc#L165

[warning][config] [source/common/config/grpc_subscription_impl.cc:87] gRPC config for type.googleapis.com/envoy.config.cluster.v3.Cluster rejected: type.googleapis.com/envoy.api.v2.Cluster does not match the message-wide type URL type.googleapis.com/envoy.config.cluster.v3.Cluster in DiscoveryResponse version_info: "v0"

In order to get through this error, I ended up setting the DiscoveryResponse.TypeUrl to v3 resource too.
After the change the new error is

[2020-04-17 19:18:00.722][1209][debug][config] [source/common/config/grpc_mux_impl.cc:136] Received gRPC message for type.googleapis.com/envoy.config.cluster.v3.Cluster at version v0
[2020-04-17 19:18:00.722][1209][warning][config] [source/common/config/grpc_mux_impl.cc:139] Ignoring the message for type URL type.googleapis.com/envoy.config.cluster.v3.Cluster as it has no current subscribers.

https://github.com/envoyproxy/envoy/blob/master/source/common/config/grpc_mux_impl.cc#L131
Repro steps:

  1. get the go-control-plane branch from PR
  2. docker run -v ~/src/go-control-plane:/go-control-plane --cpus=6 -it gcr.io/istio-testing/go-control-plane-ci:04-02-2020 /bin/bash
  3. make integration.xds or make integration.xds.v3
  4. the logs will be in envoy.xds.log or envoy.xdsv3.log

@seflerZ
Copy link

seflerZ commented Apr 19, 2020

So this work hasn't been implemented in previous Envoy. Even if we decide to integrate this feature now, it wouldn't have much effect before several releases.

@htuch
Copy link
Member Author

htuch commented Apr 20, 2020

@seflerZ why wouldn't this be available in Istio 1.7? We will have a new Envoy release before that.

@banks
Copy link
Contributor

banks commented Apr 28, 2020

I've been trying to grok v2 -> v3 and what I should do as a control plane author to migrate. I'm still confused as it seems v3 is preferred but even if I configure Envoy 1.14 with transport_api_version: V3, that only seems to switch the Request/Response to use the v3 types while the actual resource TypeURLs requested (when using ADS) are still v2 URLs. If I attempt to deliver V3 types then I see the rejection documented above. Is it even possible to have Envoy request V3 types currently (as of 1.14/1.15)?

This might be a separate issue and potentially even a bug related to ADS so I'll happily file it separately if that's the case but I'm confused enough at this point that I'm not sure if I'm just experiencing the same issue described here and there is no no current solution, or if I'm still holding it wrong?

@htuch
Copy link
Member Author

htuch commented Apr 29, 2020

@banks have you set the resource_api_version as well?

@banks
Copy link
Contributor

banks commented Apr 29, 2020

@banks have you set the resource_api_version as well?

🤦 thanks @htuch, I was indeed holding it wrong. I looked around for some other config for resources and/or examples but somehow missed that.

Just in case anyone else hits this you need to set this under the ConfigSource (i.e. lds_config|cds_config in bootstrap) while the transport version is set in the ApiConfigSource if you're using ads_config.

@seflerZ
Copy link

seflerZ commented Apr 29, 2020

@htuch I'm a bit confused. According to this doc. The v3 objects may be transferred over the v2 transport protocol. So why do we need to support mixed response? I think we just need to do cast directly.

@htuch
Copy link
Member Author

htuch commented Apr 30, 2020

@banks I will turn this into a FAQ entry, as I agree it's a bit confusing.

@seflerZ this issue isn't about having v3 resource over v2 transport, but instead allowing a server to return a v2 resource when a v3 resource is requested (which could be over either a v2 or v3 transport).

@htuch
Copy link
Member Author

htuch commented May 7, 2020

@banks this is actually already in the FAQ at https://www.envoyproxy.io/docs/envoy/latest/faq/api/envoy_v3, so no need for another entry.

@htuch
Copy link
Member Author

htuch commented Jun 9, 2020

Tagging for collection in #10943

@htuch htuch modified the milestones: 1.15.0, 1.16.0 Jul 6, 2020
eduser25 added a commit to openservicemesh/osm that referenced this issue Jul 23, 2020
Was missing specific config for LDS/CDS, apparently it does
not inherit whatever ADS has.

envoyproxy/envoy#10776 (comment)
eduser25 added a commit to openservicemesh/osm that referenced this issue Jul 29, 2020
Was missing specific config for LDS/CDS, apparently it does
not inherit whatever ADS has.

envoyproxy/envoy#10776 (comment)
elenaspasovaspasova pushed a commit to openservicemesh/osm that referenced this issue Jul 31, 2020
Was missing specific config for LDS/CDS, apparently it does
not inherit whatever ADS has.

envoyproxy/envoy#10776 (comment)
@stevenzzzz
Copy link
Contributor

/cc stevenzzzz

@stevenzzzz
Copy link
Contributor

question here: can we have both V2 and V3 subscriptions via the same ADS, say

  1. V2 and V3 RDS subscriptions on the same resource name,
  2. V2 subscription sends a request and dies,
  3. management server sends a V2 response back,

what's the correct move now? do we allow V3 subscription to handle the response by upgrading the type-url to v3 and find a watch?
@htuch @chaoqin-li1123

@htuch
Copy link
Member Author

htuch commented Sep 4, 2020

If you want mixed requests/responses, v2 and v3 don't constitute real distinct namespaces IMHO, if you subscribe to resource Foo as v2 in RDS with resource version v2 and Foo on some other RDS with resource version v3, you get back the exact same resource in both cases, even though the management server will see distinct discovery requests. At least, if you want this interop.

Since you raise this point, I think we should runtime flag protect this, since it might confuse some existing users who have made a disjointness assumption. @chaoqin-li1123 can you add a runtime flag which basically says allow_mixed_v2_v3_responses. This should default off, and only be enabled by folks performing migrations who understanding these semantics. When off, the same behavior as exists today applies.

@stevenzzzz
Copy link
Contributor

If you want mixed requests/responses, v2 and v3 don't constitute real distinct namespaces IMHO, if you subscribe to resource Foo as v2 in RDS with resource version v2 and Foo on some other RDS with resource version v3, you get back the exact same resource in both cases, even though the management server will see distinct discovery requests. At least, if you want this interop.

Since you raise this point, I think we should runtime flag protect this, since it might confuse some existing users who have made a disjointness assumption. @chaoqin-li1123 can you add a runtime flag which basically says allow_mixed_v2_v3_responses. This should default off, and only be enabled by folks performing migrations who understanding these semantics. When off, the same behavior as exists today applies.

that sounds good, thanks!

@htuch htuch closed this as completed in c8b894c Sep 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api/v3 Major version release @ end of Q3 2019 area/docs help wanted Needs help! priority/high
Projects
None yet
Development

No branches or pull requests

8 participants