Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

udp: add new key based hash policy #15967

Merged
merged 13 commits into from
May 3, 2021

Conversation

davidkornel
Copy link
Contributor

Commit Message:
This feature adds a new hash policy to the udp_proxy_filter. Previously there was only one hash policy supported. This patch introduces a new field called 'key'. Envoy users can set this field while configuring and hash based load balancing algorithms
can choose an upstream host based on the value of this new field.

SourceIp based hash policy cannot route multiple clients to the same upstream host. With key based hash policy you can do this easily.

Additional Description: None
Risk Level: LOW
Testing: bazel test
Docs Changes: Add a new option called 'key' under UdpProxyConfig's hash_policy.
Release Notes: None

@repokitteh-read-only
Copy link

Hi @davidkornel, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #15967 was opened by davidkornel.

see: more, trace.

@repokitteh-read-only
Copy link

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
envoyproxy/api-shepherds assignee is @mattklein123
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #15967 was opened by davidkornel.

see: more, trace.

@rojkov
Copy link
Member

rojkov commented Apr 14, 2021

What's the use case for the new policy? Is it to disable load balancing without touching cluster config?

davidkornel and others added 2 commits April 14, 2021 19:01
Co-authored-by: DongRyeol Cha <[email protected]>
Signed-off-by: Kornel David <[email protected]>
Co-authored-by: DongRyeol Cha <[email protected]>
Signed-off-by: Kornel David <[email protected]>
@mattklein123
Copy link
Member

What's the use case for the new policy? Is it to disable load balancing without touching cluster config?

+1 curious about the use case here?

/wait-any

@davidkornel
Copy link
Contributor Author

What's the use case for the new policy? Is it to disable load balancing without touching cluster config?

As I mentioned in this comment, we want to make envoy handle VOIP calls. In a VOIP call, there are 4 different UDP streams (2 RTP 2 RTCP). The upstream hosts are UDP/RTP proxies, for obvious reasons we want and need to load balance all four streams to the same upstream proxy. We couldn't figure out a better solution than hashing on the call-id. Call-id is unique for each call.

This feature aims to support terminating RTP and RTCP streams at the same endpoint by hashing on a given session-id/call-id.

@mattklein123
Copy link
Member

But what is the point of having a fixed hash key that will always hash to effectively a single host? Why not just have a single host in the cluster? Don't you really want to hash some bytes in the payload that represents the call id?

@davidkornel
Copy link
Contributor Author

But what is the point of having a fixed hash key that will always hash to effectively a single host? Why not just have a single host in the cluster? Don't you really want to hash some bytes in the payload that represents the call id?

So on the VoIP media plane, every single voice call is represented by 4 district UDP streams: a UDP/RTP stream that carries the actual voice samples and a UDP/RTCP stream carrying call quality statistics for the caller->callee direction, and the same for the reverse direction. Each UDP stream will have its own downstream port and so its own UDP listener and its own cluster configured in Envoy, and we need to load-balance all four UDP streams that belong to a single call to the same upstream host (an actual RTP proxy like RTPengine) otherwise things break. Our solution is to set the same key in all four UDP listeners/clusters and then rely on the new hash policy implemented in this PR to make sure all UDP streams are hashed to the same upstream host.
Now another VoIP call will have its own (different) key configured for its 4 UDP listeners/clusters, so the corresponding UDP streams will most probably land at another upstream host. This way we get load balancing at the level of distinct calls, and not at the level of individual UDP streams. We checked this and with this PR things seem to work fine.

But wait, there's more. If the upstream host becomes unhealthy our call will need to be rerouted to another host(there are more than one upstream endpoints). Since we still have the same fix hash key set for each UDP stream this PR also makes sure that the reconnected UDP streams again land at the same upstream once Envoy removes the unhealthy host from the endpoint list.

Unfortunately, we cannot rely on any information in the actual payloads for load balancing, since there are no stable hash keys carried by RTP/RTCP headers (call ids are negotiated in the control plane) across the 4 UDP streams.
I guess the confusion originates in that UDP semantics are so much different from TCP/HTTP: while a TCP listener is multiplexing an arbitrary number of downstream hosts, a UDP listener is dedicated to a single downstream host in VoIP. So we'll have 4x as many UDP listeners configured in Envoy as there are calls, but this is just the way VoIP was defined to work...

@mattklein123
Copy link
Member

Unfortunately, we cannot rely on any information in the actual payloads for load balancing, since there are no stable hash keys carried by RTP/RTCP headers (call ids are negotiated in the control plane) across the 4 UDP streams.
I guess the confusion originates in that UDP semantics are so much different from TCP/HTTP: while a TCP listener is multiplexing an arbitrary number of downstream hosts, a UDP listener is dedicated to a single downstream host in VoIP. So we'll have 4x as many UDP listeners configured in Envoy as there are calls, but this is just the way VoIP was defined to work...

This seems utterly insane, but thanks for the length explanation. This makes sense to me. @rojkov ?

Can you fix CI? (note main is currently broken and will be fixed soon)

@davidkornel
Copy link
Contributor Author

Thanks for the quick response.
One of the new tests fails here. I'll look at it tomorrow, interesting part is that it does not fail if I run it locally. Any idea how could this happen?

Copy link
Member

@rojkov rojkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the explanation! It's amazing Envoy is still applicable in such interesting setups. This case is worth to be documented IMHO.

I left a few comments about formatting mostly.

Comment on lines 37 to 38
// A given key will be used to compute the hash used by hash-based load balancing algorithms.
string key = 2 [(validate.rules).string = {min_len: 1}];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this doc string needs to be extended to reference the VoIP case and to mention that listeners (unlike clusters?) are created dynamically by control plane for every single call (am I right here?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I'll rewrite this, and mentioning the VoIP case is a good idea. Should I tag/mention this PR in that for further information for future users?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd better make it self sufficient, because this string will go to the user docs.


absl::optional<uint64_t>
evaluate(const Network::Address::Instance& downstream_addr) const override {
(void)downstream_addr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace it with

 UNREFERENCED_PARAMETER(downstream_addr);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it! Thanks.

class KeyHashMethod : public HashPolicyImpl::HashMethod {
public:
KeyHashMethod(const std::string& key) : Key(key) {}
const std::string& Key;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this member to the private section and rename it to key_ to conform the coding style.

Copy link
Contributor Author

@davidkornel davidkornel Apr 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved it! Thanks.

@@ -20,13 +20,34 @@ class SourceIpHashMethod : public HashPolicyImpl::HashMethod {
}
};

class KeyHashMethod : public HashPolicyImpl::HashMethod {
public:
KeyHashMethod(const std::string& key) : Key(key) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mark it explicit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Thanks.

Comment on lines 33 to 34
uint64_t hash = HashUtil::xxHash64(Key);
return hash;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just return the result of HashUtil::xxHash64(Key) for brevity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's better. Thanks.

Comment on lines 61 to 62
const std::string Key;
const std::string EmptyKey;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better make the names conform the style: key_ and empty_key_.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did it. Thanks.

Signed-off-by: Kornel David <[email protected]>
Comment on lines 37 to 38
// A given key will be used to compute the hash used by hash-based load balancing algorithms.
string key = 2 [(validate.rules).string = {min_len: 1}];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd better make it self sufficient, because this string will go to the user docs.

}

private:
const std::string& key_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure the string this member is referencing doesn't go out of scope. This may be the reason why the CI test if failing.
And speaking of the string... Wouldn't it be more efficient to calculate the hash only once upon the hash method's instantiation if key is const anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure the string this member is referencing doesn't go out of scope. This may be the reason why the CI test is failing.

It was the problem, thanks for the hint.

And speaking of the string... Wouldn't it be more efficient to calculate the hash only once upon the hash method's instantiation if key is const anyway?

I changed the KeyHashMethod class as you suggested. I have a question. According to the doc string, key must have a minimum length of one. I'm not sure if I should check whether key is empty or not in the constructor (explicit KeyHashMethod(const std::string& key) { hash_ = HashUtil::xxHash64(key); }). I feel like I should, but as I see it would be unnecessary. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's not necessary indeed - protobuf will validate the value. But it won't harm to ASSERT() the key is not empty to make sure the tests are properly set up.

Since HashUtil::xxHash64() handles empty inputs gracefully you can move hash_ initialization to the init list, e.g.:

explicit KeyHashMethod(const std::string& key) : hash_{HashUtil::xxHash64(key)} { ASSERT(!key.empty()); }

Copy link
Member

@rojkov rojkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! LGTM modulo one comment is addressed.

}

private:
uint64_t hash_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it const.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I forgot. Thanks.

Signed-off-by: Kornel David <[email protected]>
@davidkornel
Copy link
Contributor Author

@rojkov I've made all the changes you requested. Is there anything else you think should be changed? Let me know.

Copy link
Member

@rojkov rojkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think it's ready for a second pass @mattklein123

@davidkornel
Copy link
Contributor Author

@rojkov Could you review? Matt assigned you, I guess it's on you now.

rojkov
rojkov previously approved these changes Apr 22, 2021
Copy link
Member

@rojkov rojkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can only restate my approval as a first pass reviewer.


// A given key will be used to compute the hash used by hash-based load balancing algorithms.
// In certain cases there is a need to direct different UDP streams jointly towards the selected set of endpoints.
// A possible use-case is VoIP telephony,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The formatting is funny though. I guess no need to put a new line after every punctuation mark.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed, it was indeed very interesting looking. I know there's no need for that, but back then I thought it will be good. I just wanted to have shorter lines but ended up looking funny.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rojkov BTW I guess you must approve one more time due to my latest changes. After your approval, Matt has to approve as well, right?

Signed-off-by: Kornel David <[email protected]>
rojkov
rojkov previously approved these changes Apr 23, 2021
Copy link
Member

@rojkov rojkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restating approval

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with small doc request, thanks!

/wait

// A possible use-case is VoIP telephony, where media (RTP) and its corresponding control (RTCP) belong to the same logical session,
// although they travel in separate streams. To ensure that these pair of streams are load-balanced on session level
// (instead of individual stream level), dynamically created listeners can use the same hash key for each stream in the session.
string key = 2 [(validate.rules).string = {min_len: 1}];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a release note for this feature: https://github.com/envoyproxy/envoy/blob/main/docs/root/version_history/current.rst (with link to this new field).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it, not so sure if it's OK or not, don't have much experience writing release notes. I tried to follow the previous ones.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mattklein123 mattklein123 merged commit a058c6e into envoyproxy:main May 3, 2021
gokulnair pushed a commit to gokulnair/envoy that referenced this pull request May 5, 2021
Signed-off-by: Kornel David <[email protected]>
Signed-off-by: Gokul Nair <[email protected]>
gokulnair pushed a commit to gokulnair/envoy that referenced this pull request May 6, 2021
Signed-off-by: Kornel David <[email protected]>
Signed-off-by: Gokul Nair <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants