
fix(kuma-cp): kds deadlock #5373

Merged

Conversation

@jakubdyszkiewicz (Contributor) commented Nov 24, 2022

The Context

KDS uses one gRPC bi-directional stream initiated from the client (zone CP) to the server (global CP). Then, under the hood, it creates two logical streams similar to xDS (DiscoveryRequest/Response etc.). We call this multiplexing. As a result, we can reuse parts of go-control-plane to have similar functionality to xDS.
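
The mechanism can be pictured roughly like this (a minimal Go sketch; the envelope type, field names, and channel plumbing are illustrative, not Kuma's actual KDS mux API): every message carried on the single bi-directional stream is tagged with the logical sub-stream it belongs to, and a small demultiplexer routes it to the right handler.

```go
package main

import "fmt"

// Hypothetical envelope: every message on the single bi-directional KDS
// stream is tagged with the logical sub-stream it belongs to.
type muxMessage struct {
	stream  string // "global-to-zone" (policies) or "zone-to-global" (Dataplane, insights, ...)
	payload string // stands in for a DiscoveryRequest/DiscoveryResponse
}

// demux routes messages from the shared stream to per-direction channels,
// which is roughly what the KDS mux layer does on top of one gRPC stream.
func demux(shared <-chan muxMessage, globalToZone, zoneToGlobal chan<- muxMessage) {
	for msg := range shared {
		switch msg.stream {
		case "global-to-zone":
			globalToZone <- msg
		case "zone-to-global":
			zoneToGlobal <- msg
		}
	}
	close(globalToZone)
	close(zoneToGlobal)
}

func main() {
	shared := make(chan muxMessage, 4)
	globalToZone := make(chan muxMessage, 4)
	zoneToGlobal := make(chan muxMessage, 4)

	shared <- muxMessage{stream: "global-to-zone", payload: "DiscoveryResponse: TrafficPermission"}
	shared <- muxMessage{stream: "zone-to-global", payload: "DiscoveryResponse: Dataplane"}
	close(shared)

	demux(shared, globalToZone, zoneToGlobal)
	for msg := range globalToZone {
		fmt.Println("policies sync received:", msg.payload)
	}
	for msg := range zoneToGlobal {
		fmt.Println("zone resources sync received:", msg.payload)
	}
}
```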

gRPC is based on HTTP/2, so it inherits HTTP/2's flow control:
the sender sends data to the receiver and then waits for a WINDOW_UPDATE frame before it may send more.
With regular HTTP/2, there is a new stream for every request/response exchange, and only one side is expected to send DATA frames (the only frames subject to flow control) at any given time.

With gRPC streaming it's different. Multiple messages are streamed on one HTTP/2 stream. Moreover, with bi-directional streaming, there are messages exchanged both ways at the same time.

The Problem

When the Zone CP connects to the Global CP, it sends DiscoveryRequests for resources synced from Global to Zone (mostly policies). In parallel, it waits for DiscoveryRequests so it can send DiscoveryResponses for resources synced from Zone to Global (Dataplane, DataplaneInsight, etc.).

The Global CP may currently be sending mux messages while not actively receiving mux messages from the Zone CP, and the Zone CP may be in the same situation.

If both sides are stuck on a send operation, we are in a deadlock. We cannot have a pattern of "send and wait until it's done, then receive a message".

When we look at a high-level logical KDS stream, we have such a pattern on the client side and there is a similar pattern on the server side.

However, on the low-level multiplexed gRPC stream we do have separate goroutines that try to send and receive mux messages in parallel.

The problem is that those goroutines can send and receive at most two messages before they, too, are stuck waiting for those messages to be processed.

In the case of a high burst of big messages, we can actually end up in a deadlock. We do have gRPC keep-alive, but it uses PING frames, which are not throttled by flow control, so keep-alive alone does not break the deadlock.
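
The failure mode can be sketched schematically (illustrative Go, not the actual kuma-cp code): both sides follow the "send, wait until it completes, then receive" pattern over a flow-controlled link, so when both try to send at once neither ever reaches its receive.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Unbuffered channels model a flow-controlled stream whose send
	// blocks until the peer actually receives.
	globalToZone := make(chan string)
	zoneToGlobal := make(chan string)

	peer := func(name string, out chan<- string, in <-chan string, done chan<- string) {
		// The broken pattern: send and wait until it completes,
		// only then receive the peer's message.
		out <- name + ": big DiscoveryResponse" // blocks forever: the peer is also stuck sending
		fmt.Println(name, "received:", <-in)    // never reached
		done <- name
	}

	done := make(chan string, 2)
	go peer("global", globalToZone, zoneToGlobal, done)
	go peer("zone", zoneToGlobal, globalToZone, done)

	select {
	case who := <-done:
		fmt.Println(who, "finished")
	case <-time.After(500 * time.Millisecond):
		fmt.Println("deadlock: both sides are blocked on send")
	}
}
```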

The Solution

The solution is to break the "send and wait until it's done, then receive a message" pattern.

To do this, we can simply increase the buffers in the buffered stream and make Send() return without waiting until the message is actually sent.
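
A minimal sketch of that idea, assuming a channel-backed buffer and a dedicated drain goroutine (names and sizes are hypothetical, not the exact kuma-cp implementation): Send() only enqueues the message and returns immediately, while a separate goroutine performs the actual blocking transmission.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// bufferedStream decouples the caller from the underlying (slow) stream:
// Send only places the message in a buffer, a dedicated goroutine drains it.
type bufferedStream struct {
	sendBuffer chan string
}

func newBufferedStream(bufferSize int, transmit func(string)) *bufferedStream {
	s := &bufferedStream{sendBuffer: make(chan string, bufferSize)}
	go func() {
		for msg := range s.sendBuffer { // drain loop: does the real, blocking send
			transmit(msg)
		}
	}()
	return s
}

// Send never blocks on the network; it only fails if the buffer is full,
// which the buffer sizing described below is meant to make impossible in practice.
func (s *bufferedStream) Send(msg string) error {
	select {
	case s.sendBuffer <- msg:
		return nil
	default:
		return errors.New("send buffer full")
	}
}

func main() {
	slowTransmit := func(msg string) { // stands in for the real gRPC Send
		time.Sleep(100 * time.Millisecond)
		fmt.Println("transmitted:", msg)
	}
	stream := newBufferedStream(16, slowTransmit)
	fmt.Println(stream.Send("DiscoveryResponse: Mesh"))      // returns immediately
	fmt.Println(stream.Send("DiscoveryResponse: Dataplane")) // returns immediately
	time.Sleep(300 * time.Millisecond)                       // let the drain goroutine finish
}
```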

How do we pick the buffer size

If we set a big buffer size, wouldn't we lose backpressure (= flow control)? The answer is no, because backpressure is already controlled at the application layer. A client sends an ACK/NACK to signal that it wants to receive a new version of resources. The server never sends multiple DiscoveryResponses for the same resource type in a row, and likewise the client never sends multiple DiscoveryRequests for the same resource type in a row.

This means that, in one direction of the buffered stream, we can have at most as many in-flight messages as there are resource types synced from one CP to the other.
For convenience, we set the buffer to the number of resource types in the system, leaving a little headroom for potential DiscoveryRequests for resource types we don't know of (this does not happen now, but might with future migrations?).
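
As a back-of-the-envelope illustration (the numbers below are hypothetical; the real code would derive the count from the control plane's resource type registry rather than hardcoding it):

```go
package main

import "fmt"

func main() {
	// Illustrative sizing only; the actual count comes from the set of
	// resource types the control plane knows how to sync.
	knownResourceTypes := 40 // hypothetical count of resource types in the system
	headroom := 10           // spare slots for resource types this CP doesn't know of yet
	sendBufferSize := knownResourceTypes + headroom
	fmt.Println("per-direction send buffer size:", sendBufferSize)
}
```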

Don't we need to wait for Send to complete to ask for another resource?

We don't. Even when we did wait for Send(), it did not mean that anything was actually delivered: Send() can return while the message is still stuck at a lower level (gRPC internals, TCP buffers, the network, anything).

Deadlock prevention

While multiple tests showed that the fix solves the problem, I introduced an additional mechanism for deadlock prevention: a send-message timeout. If by any chance we are stuck on Send() for longer than X seconds, we terminate the stream.
By default, the timeout is 1 minute, and it is configurable if there is such a need.
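
A hedged sketch of that safety net (the wrapper below is illustrative, not the exact implementation): if the underlying send has not completed within the configured timeout, we give up and signal that the stream should be terminated.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// sendWithTimeout wraps a blocking send and gives up after the configured
// timeout; in kuma-cp the equivalent failure terminates the KDS stream so it
// can be re-established instead of staying silently deadlocked.
func sendWithTimeout(send func() error, timeout time.Duration) error {
	result := make(chan error, 1)
	go func() { result <- send() }()
	select {
	case err := <-result:
		return err
	case <-time.After(timeout):
		return errors.New("timeout while sending, terminating the stream")
	}
}

func main() {
	stuckSend := func() error { // stands in for a Send() blocked by flow control
		select {} // blocks forever
	}
	err := sendWithTimeout(stuckSend, 200*time.Millisecond) // the real default is 1 minute
	fmt.Println(err)
}
```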

Testing

It's quite difficult and expensive to run an E2E test for such a case as part of our current CI.
I managed to reproduce the problem by creating thousands of resources on both sides and restarting the Zone CP XX times until I hit the issue.

Checklist prior to review

  • Link to docs PR or issue -- no docs
  • Link to UI issue or PR --
  • Is the issue worked on linked? -- KDS instability #5268
  • The PR does not hardcode values that might break projects that depend on kuma (e.g. "kumahq" as an image registry) --
  • The PR will work for both Linux and Windows, system specific functions like syscall.Mkfifo have equivalent implementation on the other OS --
  • Unit Tests -- tests on timeout only
  • E2E Tests --
  • Manual Universal Tests --
  • Manual Kubernetes Tests --
  • Do you need to update UPGRADE.md? --
  • Does it need to be backported according to the backporting policy? --
  • Do you need to explicitly set a > Changelog: entry here or add a ci/ label to run fewer/more tests?

Signed-off-by: Jakub Dyszkiewicz <[email protected]>
@jakubdyszkiewicz jakubdyszkiewicz changed the title fix(kuma-cp): increase KDS message buffers fix(kuma-cp): KDS deadlock Nov 30, 2022
@jakubdyszkiewicz jakubdyszkiewicz changed the title fix(kuma-cp): KDS deadlock fix(kuma-cp): kds deadlock Nov 30, 2022
@jakubdyszkiewicz jakubdyszkiewicz marked this pull request as ready for review November 30, 2022 17:12
@jakubdyszkiewicz jakubdyszkiewicz requested a review from a team as a code owner November 30, 2022 17:12
@lukidzi (Contributor) left a comment

Great work and explanation!🎉

Co-authored-by: Charly Molter <[email protected]>
Signed-off-by: Jakub Dyszkiewicz <[email protected]>
@lahabana (Contributor) commented Dec 1, 2022

@Mergifyio backport release-2.0 release-1.8 release-1.7 release-1.6 release-1.5

@mergify bot commented Dec 1, 2022

backport release-2.0 release-1.8 release-1.7 release-1.6 release-1.5

✅ Backports have been created

@jakubdyszkiewicz jakubdyszkiewicz enabled auto-merge (squash) December 1, 2022 12:59
@jakubdyszkiewicz jakubdyszkiewicz merged commit 3857789 into kumahq:master Dec 1, 2022
mergify bot pushed a commit that referenced this pull request Dec 1, 2022
Signed-off-by: Jakub Dyszkiewicz <[email protected]>
(cherry picked from commit 3857789)

# Conflicts:
#	pkg/config/multizone/kds.go
mergify bot pushed a commit that referenced this pull request Dec 1, 2022
Signed-off-by: Jakub Dyszkiewicz <[email protected]>
(cherry picked from commit 3857789)

# Conflicts:
#	pkg/config/multizone/kds.go
#	pkg/config/multizone/multicluster.go
mergify bot pushed a commit that referenced this pull request Dec 1, 2022
Signed-off-by: Jakub Dyszkiewicz <[email protected]>
(cherry picked from commit 3857789)

# Conflicts:
#	pkg/config/multizone/kds.go
#	pkg/config/multizone/multicluster.go
mergify bot pushed a commit that referenced this pull request Dec 1, 2022
Signed-off-by: Jakub Dyszkiewicz <[email protected]>
(cherry picked from commit 3857789)

# Conflicts:
#	pkg/config/multizone/kds.go
#	pkg/config/multizone/multicluster.go
mergify bot pushed a commit that referenced this pull request Dec 1, 2022
Signed-off-by: Jakub Dyszkiewicz <[email protected]>
(cherry picked from commit 3857789)

# Conflicts:
#	pkg/config/multizone/kds.go
#	pkg/config/multizone/multicluster.go
jakubdyszkiewicz added a commit that referenced this pull request Dec 1, 2022
Signed-off-by: Jakub Dyszkiewicz <[email protected]>
jakubdyszkiewicz pushed a commit that referenced this pull request Dec 1, 2022
jakubdyszkiewicz added a commit that referenced this pull request Dec 1, 2022
Signed-off-by: Jakub Dyszkiewicz <[email protected]>
jakubdyszkiewicz pushed a commit that referenced this pull request Dec 2, 2022
jakubdyszkiewicz added a commit that referenced this pull request Dec 2, 2022
Signed-off-by: Jakub Dyszkiewicz <[email protected]>
jakubdyszkiewicz added a commit that referenced this pull request Dec 2, 2022
Signed-off-by: Jakub Dyszkiewicz <[email protected]>
jakubdyszkiewicz added a commit that referenced this pull request Dec 2, 2022
Signed-off-by: Jakub Dyszkiewicz <[email protected]>
jakubdyszkiewicz added a commit that referenced this pull request Dec 2, 2022
Signed-off-by: Jakub Dyszkiewicz <[email protected]>
jakubdyszkiewicz pushed a commit that referenced this pull request Dec 2, 2022
jakubdyszkiewicz pushed a commit that referenced this pull request Dec 2, 2022
jakubdyszkiewicz pushed a commit that referenced this pull request Dec 2, 2022
@lahabana lahabana mentioned this pull request Dec 13, 2022
@Icarus9913 Icarus9913 mentioned this pull request Jul 29, 2024