-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
grid: Plumb the AlternateProtocolCache down to the grid from the UpstreamClusterManager #16385
grid: Plumb the AlternateProtocolCache down to the grid from the UpstreamClusterManager #16385
Conversation
…nfig AlternateProtocolsCaches Signed-off-by: Ryan Hamilton <[email protected]>
Signed-off-by: Ryan Hamilton <[email protected]>
Signed-off-by: Ryan Hamilton <[email protected]>
Signed-off-by: Ryan Hamilton <[email protected]>
Signed-off-by: Ryan Hamilton <[email protected]>
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to |
Signed-off-by: Ryan Hamilton <[email protected]>
/assign @DavidSchinazi |
@DavidSchinazi I need to discuss with Alyssa about how best to integration test this, but if you could do a first pass in the meantime, I'd appreciate it. |
Signed-off-by: Ryan Hamilton <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Wow that was a lot of plumbing
|
||
// [#not-implemented-hide:] | ||
// If HTTP/3 is present and an alternate protocol cache is present, attempts to connect | ||
// will be via HTTP/3 only if the alternate protocol cache specifies an HTTP/3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having a hard time parsing this comment. What are you trying to say?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right.. So if you provide alternate protocols cache options and you enable HTTP/3, you won't get HTTP/3. You also need an entry in the cache which says that the particular server advertises HTTP/3. I'd love a recommendation for better phrasing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the comment here is hard to digest.
IIUC this field is not specific to HTTP/3, but the comment doesn't describe what's its purpose, and how it impacts the other protocols.
Also, not an expert here, but can the same cache in this case be for both HTTP/1.1 and HTTP/2 (or is it a different cache with a different configuration)?
If the cache isn't shared across the protocols, would it make sense to have this message inside each of the different Protocol-Options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about having the comment focus on the main logic, and only noting the caveat. For example: The presence of alternate protocols cache options causes the use of the alternate protocols cache, which is responsible for parsing and caching HTTP Alt-Svc headers. This enables the use of HTTP/3 for origins that advertise supporting it. Note that currently when the alternate protocols cache is absent we will always attempt HTTP/3 if HTTP/3 is enabled, but that behavior will be removed soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought if HTTP/3 was present we were going to require the cache to be present? Even if we TODO enforcement if that's the plan I think we can go with it, and comment that this will become required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think the plan is that the cache be required when HTTP/3 is present in the auto pool. With this PR, of course, the cache is still optional. Also with this PR, the cache is not being populated by response headers. (That will come subsequently.) So if we want to document behavior in this PR, we should probably not talk about header parsing but would want to talk about missing cache behavior. If we want to document intended behavior, we should probably not talk about missing cache behavior but should probably talk about header parsing.
What do you both think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds reasonable to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's doc up what the actual plan is, since it's hidden so the APIs won't be turned into docs.
You can TODO here or in code to enforce the bits which aren't done yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me! Done! (I hope :>) I took David's language and tweaked it slightly.
@@ -71,6 +71,24 @@ message UpstreamHttpProtocolOptions { | |||
bool auto_san_validation = 2; | |||
} | |||
|
|||
message AlternateProtocolsCacheOptions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This comment not specific to your PR, just curiosity.) Are the protos in generated_api_shadow
automatically generated? If yes do you know why they're in the repository as opposed to generated at build time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I missed this one. They're generated by running:
./tools/proto_format/proto_format.sh fix
I don't understand why that doesn't happen at build time, but it is SUPER slow so that's probably part of it :/
Signed-off-by: Ryan Hamilton <[email protected]>
Signed-off-by: Ryan Hamilton <[email protected]>
Signed-off-by: Ryan Hamilton <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
|
||
// [#not-implemented-hide:] | ||
// If HTTP/3 is present and an alternate protocol cache is present, attempts to connect | ||
// will be via HTTP/3 only if the alternate protocol cache specifies an HTTP/3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right.. So if you provide alternate protocols cache options and you enable HTTP/3, you won't get HTTP/3. You also need an entry in the cache which says that the particular server advertises HTTP/3. I'd love a recommendation for better phrasing.
Signed-off-by: Ryan Hamilton <[email protected]>
Signed-off-by: Ryan Hamilton <[email protected]>
/retest |
Retrying Azure Pipelines: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this.
Added an API question here.
|
||
// [#not-implemented-hide:] | ||
// If HTTP/3 is present and an alternate protocol cache is present, attempts to connect | ||
// will be via HTTP/3 only if the alternate protocol cache specifies an HTTP/3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the comment here is hard to digest.
IIUC this field is not specific to HTTP/3, but the comment doesn't describe what's its purpose, and how it impacts the other protocols.
Also, not an expert here, but can the same cache in this case be for both HTTP/1.1 and HTTP/2 (or is it a different cache with a different configuration)?
If the cache isn't shared across the protocols, would it make sense to have this message inside each of the different Protocol-Options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Early drive by since you haven't assigned me but I saw David's approval :-P
|
||
// [#not-implemented-hide:] | ||
// If HTTP/3 is present and an alternate protocol cache is present, attempts to connect | ||
// will be via HTTP/3 only if the alternate protocol cache specifies an HTTP/3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought if HTTP/3 was present we were going to require the cache to be present? Even if we TODO enforcement if that's the plan I think we can go with it, and comment that this will become required
// The implementation is approximate and enforced independently on each worker thread, thus | ||
// it is possible for the maximum hosts in the cache to go slightly above the configured | ||
// value depending on timing. This is similar to how other circuit breakers work. | ||
google.protobuf.UInt32Value max_hosts = 2 [(validate.rules).uint32 = {gt: 0}]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this as used yet, or a TODO to use the field. is it functionally the same TODO "Add a limit to the size of this map and evict based on usage" and added in this PR so you can test config conflicts?
If so the map is from Origin and I wonder if max_entries would be a bit more consistent than max_hosts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is the same as the TODO. Changed to max_entries. Good idea.
Signed-off-by: Ryan Hamilton <[email protected]>
Signed-off-by: Ryan Hamilton <[email protected]>
Signed-off-by: Ryan Hamilton <[email protected]>
|
||
// [#not-implemented-hide:] | ||
// If HTTP/3 is present and an alternate protocol cache is present, attempts to connect | ||
// will be via HTTP/3 only if the alternate protocol cache specifies an HTTP/3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds reasonable to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for that other PR - the changes here are way easier to see now!
|
||
// [#not-implemented-hide:] | ||
// If HTTP/3 is present and an alternate protocol cache is present, attempts to connect | ||
// will be via HTTP/3 only if the alternate protocol cache specifies an HTTP/3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's doc up what the actual plan is, since it's hidden so the APIs won't be turned into docs.
You can TODO here or in code to enforce the bits which aren't done yet.
if (existing_cache != caches_.end()) { | ||
if (!Protobuf::util::MessageDifferencer::Equivalent(options, existing_cache->second.options_)) { | ||
throw EnvoyException( | ||
fmt::format("options specified alternate protocols cache '{}' with different settings", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
worth logging both sets of options since we have them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
@@ -0,0 +1,58 @@ | |||
#include "common/http/alternate_protocols_cache_manager_impl.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can do end to end tests with this PR too, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meaning we can test that HTTP/3 works via the AlternateProtocolsCache? I think though though I'd love to spend some time discussing how to do this in an integration test. I looked at what I think are the HTTP/3 multiplexed integrations tests and struggled to understand quite how to wire it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you found the tests (yay!) but I agree the way config happens is pretty opaque.
The important thing is that "use_alpn" in the set up here, which gets passed to ConfigHelper::configureUpstreamTls()
Since we plan on making it required, I'd be nice to add default alpn cache by default, and tweak the existing 2 tests using the grid but the tweaking is going to be the tricky bit.
right now, we create exactly one fake upstream, either http2 or http3. now that we'll have alt-svc I guess we'd configure the HTTP/2 upstream as the upstream of choice, and it'll alt-svc to the http3 upstream.
I think you can decompose what happens in MixedUpstreamIntegrationTest::createUpstreams which currently creates either h2 or h3 to create h2 then h3 (just skip the base class call and do 'em manually I think, and create http3 as well unless the test specifies testing http3 broken)
the changes to do alt-svc, and wait for h3 to be established may get complicated, so happy to help out there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline... Since we don't yet have anything reading alt-svc headers, we can't really do a bona-fide end-to-end test. Once we wire up alt-svc header processing, we'll do that.
Signed-off-by: Ryan Hamilton <[email protected]>
Signed-off-by: Ryan Hamilton <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yay docs! Thanks for kicking those off :-)
// HTTP Alt-Svc headers. This enables the use of HTTP/3 for origins that | ||
// advertise supporting it. | ||
// TODO(RyanTheOptimist): Make this field required when HTTP/3 is enabled. | ||
config.core.v3.AlternateProtocolsCacheOptions alternate_protocols_cache_options = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here or message definition link to spec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
if (!Protobuf::util::MessageDifferencer::Equivalent(options, existing_cache->second.options_)) { | ||
throw EnvoyException(fmt::format( | ||
"options specified alternate protocols cache '{}' with different settings" | ||
" old '{}' new '{}'", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on second thought, old/new -> first/second? otherwise it maybe looks like a config reload. Sorry for not thinking of that up front.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Done!
|
||
#### Connectivity Grid | ||
The `ConnectivityGrid` is a `ConnectionPool` which wraps a `Http3ConnPoolImpl` connection pool | ||
(QUIC) and a `HttpConnPoolImplMixed` connection pool (TCP). If HTTP/3 is currently broken, stream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, you have much of that here. Maybe again move it up top and then just call out the grid handles which of TCP/UDP to try etc.?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm inclined to keep this text here as it has a bit more technical detail than what I added about, but happy to nuke / rephrase if you think that would be better?
source/docs/http3_upstream.md
Outdated
|
||
### Required configuration | ||
|
||
Specific Envoy configuration is required in order to enable the previous. components |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can call this out in the top overview section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. PTAL
@@ -0,0 +1,58 @@ | |||
#include "common/http/alternate_protocols_cache_manager_impl.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you found the tests (yay!) but I agree the way config happens is pretty opaque.
The important thing is that "use_alpn" in the set up here, which gets passed to ConfigHelper::configureUpstreamTls()
Since we plan on making it required, I'd be nice to add default alpn cache by default, and tweak the existing 2 tests using the grid but the tweaking is going to be the tricky bit.
right now, we create exactly one fake upstream, either http2 or http3. now that we'll have alt-svc I guess we'd configure the HTTP/2 upstream as the upstream of choice, and it'll alt-svc to the http3 upstream.
I think you can decompose what happens in MixedUpstreamIntegrationTest::createUpstreams which currently creates either h2 or h3 to create h2 then h3 (just skip the base class call and do 'em manually I think, and create http3 as well unless the test specifies testing http3 broken)
the changes to do alt-svc, and wait for h3 to be established may get complicated, so happy to help out there
Signed-off-by: Ryan Hamilton <[email protected]>
Signed-off-by: Ryan Hamilton <[email protected]>
/lgtm api |
I'm back to under the weather and so I'm taking today off. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments but otherwise this looks good!
source/common/upstream/BUILD
Outdated
@@ -56,6 +56,7 @@ envoy_cc_library( | |||
":subset_lb_lib", | |||
"//include/envoy/api:api_interface", | |||
"//include/envoy/event:dispatcher_interface", | |||
"@envoy_api//envoy/extensions/clusters/dynamic_forward_proxy/v3:pkg_cc_proto", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit odd, is this supposed to get added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, good point. Removed.
source/docs/http3_upstream.md
Outdated
|
||
### Required configuration | ||
|
||
Specific Envoy configuration is required in order to enable the previous. components |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence seems a bit off, is the period misplaced?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gah. Yes, I think so. Cleaned up.
try { | ||
AlternateProtocolsCacheSharedPtr cache2 = manager_->getCache(options2_); | ||
FAIL(); | ||
} catch (const EnvoyException& e) { | ||
EXPECT_THAT(e.what(), | ||
testing::HasSubstr( | ||
"options specified alternate protocols cache 'name1' with different settings")); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some helpers for this, EXPECT_THROW_*
et al
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, awesome! Much simpler. Thanks!
Signed-off-by: Ryan Hamilton <[email protected]>
Signed-off-by: Ryan Hamilton <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments!
source/common/upstream/BUILD
Outdated
@@ -56,6 +56,7 @@ envoy_cc_library( | |||
":subset_lb_lib", | |||
"//include/envoy/api:api_interface", | |||
"//include/envoy/event:dispatcher_interface", | |||
"@envoy_api//envoy/extensions/clusters/dynamic_forward_proxy/v3:pkg_cc_proto", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, good point. Removed.
source/docs/http3_upstream.md
Outdated
|
||
### Required configuration | ||
|
||
Specific Envoy configuration is required in order to enable the previous. components |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gah. Yes, I think so. Cleaned up.
try { | ||
AlternateProtocolsCacheSharedPtr cache2 = manager_->getCache(options2_); | ||
FAIL(); | ||
} catch (const EnvoyException& e) { | ||
EXPECT_THAT(e.what(), | ||
testing::HasSubstr( | ||
"options specified alternate protocols cache 'name1' with different settings")); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, awesome! Much simpler. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo two docs nits!
source/docs/http3_upstream.md
Outdated
@@ -1,7 +1,17 @@ | |||
### Overview | |||
|
|||
Support for Upstream HTTP/3 connections is slightly more complex than for HTTP/1 or HTTP/2 | |||
and this document attempts to describe it. | |||
and requires specific configurations to enable it. HTTP/3 is only attempted to servers which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If HTTP/3 is configured in auto_config, HTTP/3 connections will only be attempted
Since it is force-attempted for explicit_http_config and "do what downstream did"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Done.
source/docs/http3_upstream.md
Outdated
If no such advertisement exists, then HTTP/2 or HTTP/1 will be used instead. Further, | ||
HTTP/3 runs over QUIC (which uses UDP) and not over TCP (which HTTP/1 and HTTP/2 use). | ||
It is not uncommon for network devices to block UDP traffic, and hence block HTTP/3. This | ||
means that Upstream HTTP/3 connection attempts might be blocked by the network and should fall |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
upstream here and below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
Signed-off-by: Ryan Hamilton <[email protected]>
source/docs/http3_upstream.md
Outdated
means that Upstream HTTP/3 connection attempts might be blocked by the network and should fall | ||
back to using HTTP/2. On such networks, the Upstream connection code needs to | ||
means that upstream HTTP/3 connection attempts might be blocked by the network and should fall | ||
back to using HTTP/2. On such networks, the upstream connection code needs to | ||
track that HTTP/3 connects attempts are not succeeding and avoid making connections | ||
which are doomed to fail. On networks where HTTP/3 is working correctly, however, the | ||
Upstream connection code should avoid attempting HTTP/2 unnecessarily. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upstream -> upstream :-P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right! (facepalm)
Signed-off-by: Ryan Hamilton <[email protected]>
/retest |
Retrying Azure Pipelines: |
@snowp any other comments or am I good to merge? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
…reamClusterManager (envoyproxy#16385) grid: Plumb the AlternateProtocolCache down to the grid from the UpstreamClusterManager. Create a new AlternateProtocolsCacheOptions proto message for configuring an AlternateProtocolCache, and add this message to AutoHttpConfig. Create a new AlternateProtocolCacheManager class for fetching/creating a Cache based on the config. Modify the AlternateProtocolCache to store state per thread. (Does not yet synchronize this state across threads) Risk Level: Low Testing: Unit tests Docs Changes: N/A Release Notes: N/A Platform Specific Features: N/A Signed-off-by: Ryan Hamilton <[email protected]> Signed-off-by: Sixiang Gu <[email protected]>
…reamClusterManager (envoyproxy#16385) grid: Plumb the AlternateProtocolCache down to the grid from the UpstreamClusterManager. Create a new AlternateProtocolsCacheOptions proto message for configuring an AlternateProtocolCache, and add this message to AutoHttpConfig. Create a new AlternateProtocolCacheManager class for fetching/creating a Cache based on the config. Modify the AlternateProtocolCache to store state per thread. (Does not yet synchronize this state across threads) Risk Level: Low Testing: Unit tests Docs Changes: N/A Release Notes: N/A Platform Specific Features: N/A Signed-off-by: Ryan Hamilton <[email protected]>
grid: Plumb the AlternateProtocolCache down to the grid from the UpstreamClusterManager.
Create a new AlternateProtocolsCacheOptions proto message for configuring an AlternateProtocolCache, and add this message to AutoHttpConfig.
Create a new AlternateProtocolCacheManager class for fetching/creating a Cache based on the config.
Modify the AlternateProtocolCache to store state per thread. (Does not yet synchronize this state across threads)
Risk Level: Low
Testing: Unit tests
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A