Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: question about the difference between RaftCommandSize and RaftMaxUncommittedEntriesSize #57358

Closed
benlatern opened this issue Dec 2, 2020 · 4 comments
Assignees
Labels
A-kv-replication Relating to Raft, consensus, and coordination. O-community Originated from the community X-blathers-triaged blathers was able to find an owner

Comments

@benlatern
Copy link
Contributor

benlatern commented Dec 2, 2020

/pkg/kv/kvserver/replica_raft.go line 155
`quotaSize := uint64(proposal.command.Size())

if maxSize := uint64(MaxCommandSize.Get(&r.store.cfg.Settings.SV)); quotaSize > maxSize {

	return nil, nil, 0, roachpb.NewError(errors.Errorf(

		"command is too large: %d bytes (max: %d)", quotaSize, maxSize,

	))

}`

/vendor/etcd/raft/log_unstable.go line 1719
`func (r *raft) increaseUncommittedSize(ents []pb.Entry) bool {

var s uint64

for _, e := range ents {

	s += uint64(PayloadSize(e))

}

if r.uncommittedSize > 0 && r.uncommittedSize+s > *r.maxUncommittedSize {

	// If the uncommitted tail of the Raft log is empty, allow any size

	// proposal. Otherwise, limit the size of the uncommitted tail of the

	// log and drop any proposal that would push the size over the limit.

	return false

}

r.uncommittedSize += s

return true

}`
In my actual test, when sending a large size batchrequest(size<kv.raft.command.max_size), it can pass the detection of line 155 in replica_raft, but after entering the increaseUncommittedSize method, if r.uncommittedSize> 0, the batch is larger than maxUncommittedSize, it will be dropped, but the return value is only false and there is no error message. For the user, he does not receive any error message, but in fact the system at this time is no longer operating normally. When I open adminUI, all The monitoring metrics data cannot be loaded.
Can the size limit here be unified?

@blathers-crl
Copy link

blathers-crl bot commented Dec 2, 2020

Hello, I am Blathers. I am here to help you get the issue triaged.

It looks like you have not filled out the issue in the format of any of our templates. To best assist you, we advise you to use one of these templates.

I have CC'd a few people who may be able to assist you:

If we have not gotten back to your issue within a few business days, you can try the following:

  • Join our community slack channel and ask on #cockroachdb.
  • Try find someone from here if you know they worked closely on the area and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Dec 2, 2020
@nvanbenschoten
Copy link
Member

Hi @benlatern, do you mind explaining the symptoms of this? The uncommittedSize isn't dropping the entry indefinitely, it's just delaying the proposal to prevent the Raft window size from being badly exceeded. The proposal should eventually be reproposed.

@nvanbenschoten nvanbenschoten changed the title About system errors caused by the difference between RaftCommandSize and UnCommittedSize kv: question about the difference between RaftCommandSize and RaftMaxUncommittedEntriesSize Dec 3, 2020
@nvanbenschoten nvanbenschoten added the A-kv-replication Relating to Raft, consensus, and coordination. label Dec 3, 2020
@nvanbenschoten nvanbenschoten self-assigned this Dec 3, 2020
@benlatern
Copy link
Contributor Author

/pkg/kv/kvserver/replica_raft.go#L379
` if errors.Is(err, raft.ErrProposalDropped) {
// A proposal was forwarded to this replica but we couldn't propose it.
// Swallow the error since we don't have an effective way of signaling
// this to the sender.
// TODO(bdarnell): Handle ErrProposalDropped better.
// #21849

err = nil`
Just set err to nil and no other processing is done. Does this cause the request to be retried all the time? Is it a better choice to set a limit on the number of retries or other restrictions?

@nvanbenschoten
Copy link
Member

Does this cause the request to be retried all the time?

Yes, this causes the proposal to be retried periodically until it succeeds.

Is it a better choice to set a limit on the number of retries or other restrictions?

That decision is left up to the client. If they have a timeout set, they can try to cancel the proposal here.

I'm going to close this since it doesn't look like there's an actual issue here. But feel free to continue the conversation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-replication Relating to Raft, consensus, and coordination. O-community Originated from the community X-blathers-triaged blathers was able to find an owner
Projects
None yet
Development

No branches or pull requests

2 participants