Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix http2 authority header in multiple endpoints scenario and bump up grpc from v1.51.0 to v1.52.0 #16324

Merged
merged 2 commits into from
Jul 31, 2023

Conversation

chaochn47
Copy link
Member

@chaochn47 chaochn47 commented Jul 28, 2023

Replace #15131 with endpoints.Interpret returns Host:port as ServerName. It should also help #16290

After this change, each h2 request will be sent with the proper authority header value instead of only the first one out of three dialing endpoints. This should resolve the issue mentioned in #13192 (comment).

Note: client certificate validation on server name won't break because of grpc/grpc-go#3082

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

@chaochn47
Copy link
Member Author

e2e test TestAuthority also needs update since it verifies the authority header value only matches the first endpoint.

I need to replace etcdctl with etcd client or somehow send at least 3 requests to round-robin it.

TIL: run stress to validate the test is not flaky and its correctness.

2023-07-28T00:17:39.5254127Z --- FAIL: TestAuthority (37.13s)
2023-07-28T00:17:39.5254617Z     --- PASS: TestAuthority/Size:_1,_Scenario:_"unix:path" (0.76s)
2023-07-28T00:17:39.5255084Z     --- PASS: TestAuthority/Size:_3,_Scenario:_"unix:path" (1.40s)
2023-07-28T00:17:39.5255597Z     --- PASS: TestAuthority/Size:_1,_Scenario:_"unix://absolute_path" (1.06s)
2023-07-28T00:17:39.5256115Z     --- PASS: TestAuthority/Size:_3,_Scenario:_"unix://absolute_path" (2.39s)
2023-07-28T00:17:39.5256615Z     --- PASS: TestAuthority/Size:_1,_Scenario:_"unixs:absolute_path" (0.77s)
2023-07-28T00:17:39.5257116Z     --- FAIL: TestAuthority/Size:_3,_Scenario:_"unixs:absolute_path" (4.05s)
2023-07-28T00:17:39.5257624Z     --- SKIP: TestAuthority/Size:_1,_Scenario:_"unixs://absolute_path" (0.05s)
2023-07-28T00:17:39.5258131Z     --- FAIL: TestAuthority/Size:_3,_Scenario:_"unixs://absolute_path" (4.55s)
2023-07-28T00:17:39.5258641Z     --- PASS: TestAuthority/Size:_1,_Scenario:_"http://domain[:port]" (0.36s)
2023-07-28T00:17:39.5259146Z     --- FAIL: TestAuthority/Size:_3,_Scenario:_"http://domain[:port]" (3.53s)
2023-07-28T00:17:39.5259719Z     --- SKIP: TestAuthority/Size:_1,_Scenario:_"http://address[:port]" (0.05s)
2023-07-28T00:17:39.5260217Z     --- FAIL: TestAuthority/Size:_3,_Scenario:_"http://address[:port]" (1.77s)
2023-07-28T00:17:39.5261198Z     --- SKIP: TestAuthority/Size:_1,_Scenario:_"https://domain[:port]_insecure" (0.05s)
2023-07-28T00:17:39.5421991Z     --- FAIL: TestAuthority/Size:_3,_Scenario:_"https://domain[:port]_insecure" (4.51s)
2023-07-28T00:17:39.5422977Z     --- PASS: TestAuthority/Size:_1,_Scenario:_"https://address[:port]_insecure" (1.12s)
2023-07-28T00:17:39.5424174Z     --- FAIL: TestAuthority/Size:_3,_Scenario:_"https://address[:port]_insecure" (4.44s)
2023-07-28T00:17:39.5424824Z     --- PASS: TestAuthority/Size:_1,_Scenario:_"https://domain[:port]" (0.87s)
2023-07-28T00:17:39.5425348Z     --- FAIL: TestAuthority/Size:_3,_Scenario:_"https://domain[:port]" (2.46s)
2023-07-28T00:17:39.5425890Z     --- SKIP: TestAuthority/Size:_1,_Scenario:_"https://address[:port]" (0.05s)
2023-07-28T00:17:39.5426435Z     --- FAIL: TestAuthority/Size:_3,_Scenario:_"https://address[:port]" (2.88s)

@chaochn47
Copy link
Member Author

$ stress -p=1 ./e2e.test -test.run TestAuthority

...
11m10s: 16 runs so far, 0 failures

@chaochn47
Copy link
Member Author

chaochn47 commented Jul 28, 2023

Ping @ahrtr @serathius @jmhbnz @fuweid

Please review commit by commit. First one is just updating gRPC version. Second one is updating resolver.Addr.ServerName with Host:port and authority test.

@ahrtr
Copy link
Member

ahrtr commented Jul 28, 2023

We either:

  1. set the serverName to Host:Port, just as this PR does;
  2. Or keep it's as it's (set ServerName to only Hostname), but update the test case, just as dependency: bump google.golang.org/grpc from 1.51.0 to 1.52.0 #15131 did.

Both solutions seem work, and I do not see it will break anything with either solution. But based on @dfawley's comment in grpc/grpc-go#5748 (comment), it seems that he likes option 2 above (setting ServerName to the host only) :)

@ahrtr
Copy link
Member

ahrtr commented Jul 28, 2023

Also I suggest to bump gRPC to a more latest version, e.g. 1.56.2 or 1.57.0.

@chaochn47
Copy link
Member Author

No, authority header value should contains port. It is based on RFC. https://www.rfc-editor.org/rfc/rfc3986#section-3.2.3

@ahrtr
Copy link
Member

ahrtr commented Jul 28, 2023

No, authority header value should contains port. It is based on RFC. https://www.rfc-editor.org/rfc/rfc3986#section-3.2.3

port is optional.

authority = [ userinfo "@" ] host [ ":" port ]

Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me.
Maybe we can consider changing the title with Fix http2 authority header in multiple endpoints scenario

tests/integration/grpc_test.go Show resolved Hide resolved
tests/e2e/ctl_v3_grpc_test.go Outdated Show resolved Hide resolved
@fuweid
Copy link
Member

fuweid commented Jul 28, 2023

set the serverName to Host:Port, just as this PR does;

I vote for this. If the endpoint is just ip:port, and according to RFC, it's valid to have port here. It doesn't look correct if we just have IP here.

@dfawley
Copy link

dfawley commented Jul 28, 2023

But based on @dfawley's comment in grpc/grpc-go#5748 (comment), it seems that he likes option 2 above (setting ServerName to the host only)

I'm not sure about the details in the test you're referring to, but by default, gRPC will use the full "endpoint" portion of the dial target as the authority (ref). It's very common to dial host:port, so we will use a port quite often.

Also I suggest to bump gRPC to a more latest version, e.g. 1.56.2 or 1.57.0.

This is important. If you're using experimental APIs and your users upgrade to the latest version (or if one of their dependencies does), then you will break them if you are not keeping up with recent and upcoming changes, e.g. grpc/grpc-go#6363, grpc/grpc-go#6451, and the changes announced here grpc/grpc-go#6472. (I.e. #15145)

@ahrtr
Copy link
Member

ahrtr commented Jul 28, 2023

thx @dfawley for the feedback.

Then let's upgrade to 1.52 for now, and will take a closer look at the links you mentioned later.

@chaochn47
Copy link
Member Author

If you're using experimental APIs and your users upgrade to the latest version (or if one of their dependencies does), then you will break them if you are not keeping up with recent and upcoming changes

Thanks for calling it out. It's indeed a risk. ref. https://go.dev/ref/mod#minimal-version-selection

@chaochn47 chaochn47 changed the title Bump up grpc from v1.51.0 to v1.52.0 Fix http2 authority header in multiple endpoints scenario and bump up grpc from v1.51.0 to v1.52.0 Jul 28, 2023
@chaochn47
Copy link
Member Author

chaochn47 commented Jul 29, 2023

port is optional.
authority = [ userinfo "@" ] host [ ":" port ]

@ahrtr

Just as the original 3.5.0 issue #13192 stated, the use case is putting etcd server behinds proxy.

With that in mind, etcd client (based on gRPC client support on proxy) first needs to establish connection with the proxy with CONNECT method. https://www.rfc-editor.org/rfc/rfc7540#section-8.3

The proxy server needs to know which etcd server it should connect to from the authority header. So that's why port should be included.

However, there is an exception that if etcd server is configured to listen on default port of the scheme (:443 on https and :80 on http). It is descried in https://www.rfc-editor.org/rfc/rfc3986#section-3.2.3 and the proxy server should support it.

/cc @menghanl for advice.

Copy link
Member

@jmhbnz jmhbnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fuweid
Copy link
Member

fuweid commented Jul 30, 2023

For e2e test case, the etcdctl check perf can sends request to all the endpoints. But it needs 60s to finish the performance at least. Maybe we can introduce --timeout for it and we don't need to care the error.

@serathius serathius merged commit 9126a0f into etcd-io:main Jul 31, 2023
@chaochn47 chaochn47 deleted the bump-up-gRPC branch July 31, 2023 17:02
chaochn47 added a commit to chaochn47/etcd that referenced this pull request Oct 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

6 participants