Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: apply a limit to outgoing raft msg batching #71132

Merged
merged 2 commits into from
Oct 20, 2021

Conversation

tbg
Copy link
Member

@tbg tbg commented Oct 5, 2021

In #71050, we saw evidence of very large (2.3+GiB) Raft messages being
sent on the stream, which overwhelmed both the sender and the receiver.
Raft messages are batched up before sending and so what must have
happened here is that a large number of reasonably-sized messages (up to
64MiB in this case due to the max_size setting) were merged into a giant
blob. As of this commit, we apply the max_size chunking on the batching
step before sending messages as well.

Closes #71050.

Release note: None

@tbg tbg requested a review from a team as a code owner October 5, 2021 13:16
@tbg tbg requested a review from erikgrinaker October 5, 2021 13:16
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@tbg
Copy link
Member Author

tbg commented Oct 5, 2021

Still need to write a test, which... I feel like that will require me to do a lot of refactoring that I don't want to do (backports), so it will be in a separate commit.

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @tbg)


pkg/kv/kvserver/raft_transport.go, line 494 at r1 (raw file):

			// Pull off as many queued requests as possible.
			//
			// TODO(peter): Think about limiting the size of the batch we send.

Feels good to resolve a five-year-old TODO!


pkg/kv/kvserver/raft_transport.go, line 490 at r2 (raw file):

			return err
		case req := <-ch:
			maxBytes := MaxCommandSize.Get(&t.st.SV)

You may want to note that this doesn't actually have anything to do with limiting the size of a command, we're just applying the same limit to something semi-related. We could even have a separate setting for it, but I think this is reasonable.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @tbg)


pkg/kv/kvserver/raft_transport.go, line 510 at r1 (raw file):

			}

			// Reuse the Requests slice, but zero out the contents to avoid delaying

Nice spot.


pkg/kv/kvserver/raft_transport.go, line 490 at r2 (raw file):

			return err
		case req := <-ch:
			maxBytes := MaxCommandSize.Get(&t.st.SV)

Should we be accounting for this first req?


pkg/kv/kvserver/raft_transport.go, line 490 at r2 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

You may want to note that this doesn't actually have anything to do with limiting the size of a command, we're just applying the same limit to something semi-related. We could even have a separate setting for it, but I think this is reasonable.

I think a separate setting is wise. We may want to tune these knobs separately and this has a very different meaning than MaxCommandSize. It also has a different meaning than a lot of the per-range limits we pass to etcd/raft because this limit is above the level at which we multiplex multiple ranges onto the same transport.

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)


pkg/kv/kvserver/raft_transport.go, line 490 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I think a separate setting is wise. We may want to tune these knobs separately and this has a very different meaning than MaxCommandSize. It also has a different meaning than a lot of the per-range limits we pass to etcd/raft because this limit is above the level at which we multiplex multiple ranges onto the same transport.

Yeah, I also know of users who run with this setting set very large (several GB), because they got burnt when we accidentally generated commands that were too large. The occasional huge command is probably fine, but shipping requests that large around semi-regularly maybe isn't.

@tbg
Copy link
Member Author

tbg commented Oct 13, 2021

Sounds good, will ping this PR once I've addressed these comments after having caught back up from PTO.

Standard Go error - we were trying to avoid allocations by recycling a
slice but weren't zeroing it out before. The occasional long slice that
would reference a ton of memory would then effectively keep that large
amount of memory alive forever.

Touches cockroachdb#71050.

Release note: None
@tbg tbg force-pushed the fix-giant-raft-msgs branch from 9ccf33e to 3fc9c27 Compare October 19, 2021 12:14
@tbg
Copy link
Member Author

tbg commented Oct 19, 2021

Updated, PTAL

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)


pkg/kv/kvserver/raft_transport.go, line 59 at r4 (raw file):

// maxRaftOutgoingBatchSize wraps "kv.raft.command.max_batch_size".
var maxRaftOutgoingBatchSize = settings.RegisterByteSizeSetting(
	"kv.raft.command.max_batch_size",

This should be target_batch_size, not max, since we generally overshoot it.

@tbg tbg force-pushed the fix-giant-raft-msgs branch from 3fc9c27 to 7f9adeb Compare October 19, 2021 19:38
@tbg tbg changed the title kvserver: apply kv.raft.command.max_size to outgoing raft msg batching kvserver: apply a limit to outgoing raft msg batching Oct 19, 2021
@tbg tbg requested a review from erikgrinaker October 19, 2021 19:44
Copy link
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Not sure if you wanted to re-approve this, but if it looks good you might as well bors it for me.

Dismissed @erikgrinaker from 2 discussions.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @tbg)


pkg/kv/kvserver/raft_transport.go, line 59 at r4 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

This should be target_batch_size, not max, since we generally overshoot it.

Done.

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @tbg)


-- commits, line 21 at r5:
nit: s/traget/target/


pkg/kv/kvserver/raft_transport.go, line 505 at r5 (raw file):

			return err
		case req := <-ch:
			maxBytes := targetRaftOutgoingBatchSize.Get(&t.st.SV) - int64(req.Size())

nit: may want to rename maxBytes as well.

… msg batching

In cockroachdb#71050, we saw evidence of very large (2.3+GiB) Raft messages being
sent on the stream, which overwhelmed both the sender and the receiver.
Raft messages are batched up before sending and so what must have
happened here is that a large number of reasonably-sized messages (up to
64MiB in this case due to the max_size setting) were merged into a giant
blob. As of this commit, we apply the target-size chunking on the batching
step before sending messages as well.

Closes cockroachdb#71050.

Release note: None
@tbg tbg force-pushed the fix-giant-raft-msgs branch from 7f9adeb to 1d959cb Compare October 19, 2021 19:58
@erikgrinaker
Copy link
Contributor

bors r=erikgrinaker

@craig
Copy link
Contributor

craig bot commented Oct 20, 2021

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

roachtest: import/tpcc/warehouses=4000/geo failed
4 participants