-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvclient,kvserver: update client range cache on all RPCs #51437
kvclient,kvserver: update client range cache on all RPCs #51437
Conversation
The first commit is #51388 |
738655e
to
018f7c0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like what I'm seeing here but I also don't feel very confident to review the follower read unit test, which I think is the most important bit. I hope you can find another reviewer who can provide more confidence.
Reviewed 4 of 9 files at r2, 5 of 5 files at r3, 7 of 7 files at r4.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @andreimatei, and @nvanbenschoten)
pkg/ccl/kvccl/kvfollowerreadsccl/followerreads_test.go, line 235 at r4 (raw file):
WithStatementTrace: func(sp opentracing.Span, stmt string) { if stmt == historicalQuery { recCh <- tracing.GetRecording(sp)
Are you sure this runs just once? It seems to me that every query will get here. Maybe you need a boolean to conditionally write to the channel?
pkg/ccl/kvccl/kvfollowerreadsccl/followerreads_test.go, line 253 at r4 (raw file):
// Speed up closing of timestamps, as we'll in order to be able to use // experimental_follower_read_timestamp(). // n1.Exec(t, `SET CLUSTER SETTING kv.closed_timestamp.target_duration = '0.3s'`)
Are you sure you want to keep this commented code?
pkg/ccl/kvccl/kvfollowerreadsccl/followerreads_test.go, line 291 at r4 (raw file):
// Execute the query again and assert the cache is updated. This query will // not be executed as a follower read since it'll attempt to use n2 which
nit: the contractions here don't read well. I'd spell "will" in full. (In fact, I think present tense throughout here works just as well and is even easier on the eye.)
pkg/kv/kvclient/kvcoord/dist_sender.go, line 1890 at r3 (raw file):
if len(br.RangeInfos) > 0 { routing.EvictAndReplace(ctx, br.RangeInfos...) routing = EvictionToken{}
curious: why do you terminate with an EvictionToken here?
pkg/kv/kvserver/replica_follower_read.go, line 41 at r4 (raw file):
func (r *Replica) canServeFollowerRead( ctx context.Context, ba *roachpb.BatchRequest, pErr *roachpb.Error, ) (_pErr *roachpb.Error) {
I don't think you need to keep the identifier _pErr
here. If you keep it, I suggest a different name which doesn't start with an underscore.
pkg/kv/kvserver/replica_send.go, line 130 at r3 (raw file):
// on errors because the code doesn't currently support returning both a br // and a pErr here. Also, some errors (e.g. NotLeaseholderError) have custom // ways of returning range info.
The comment is telling me that I should see some code elsewhere that peeks into a NotLeaseholderError and updates RangeInfos accordingly.
Can you remind me where that code is?
The situation that interests me is when we have code that receives a pErr, looks at it, then performs another request as part of the error handling, and then discards the error.
Is the RangeInfos from the pErr forwarded/attached in the response for the error handling?
pkg/roachpb/api.proto, line 2002 at r3 (raw file):
message ClientRangeInfo { int64 descriptor_generation = 1;
Can you add some comments here to explain.
pkg/util/syncutil/atomic.go, line 77 at r4 (raw file):
// Get atomically returns the current value. func (s *AtomicString) Get() string { s.mu.Lock()
RLock() maybe?
pkg/util/tracing/test_utils.go, line 46 at r4 (raw file):
if sp.Operation == "/cockroach.roachpb.Internal/Batch" && sp.Tags["span.kind"] == "server" { if logsContainMsg(sp.Logs, "serving via follower read") {
This comparison is weak and prone to return both false positives and false negatives.
I would recommend at the very least the following:
-
to take care of the false negatives:
make the message a string constant in a common package dependency, and use it in both places. -
to take care of the false positives:
make this an exact comparison and not a substring comparison. The other side haslog.Event(ctx, "serving via follower read")
after all.
(The problem is that the day someone puts the SQL stringservring via follower read
in their index keys, that will show up in theexecuting
trace messages and this condition here will do boo-boo.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 9 files at r2, 5 of 5 files at r3, 7 of 7 files at r4.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @andreimatei, and @knz)
pkg/ccl/kvccl/kvfollowerreadsccl/followerreads_test.go, line 197 at r4 (raw file):
// test checking that the cache on the gateway gets updated by the first request // encountering this situation, and then follower reads work. func TestFollowerReadsWithStaleDescriptor(t *testing.T) {
The test looks good!
pkg/ccl/kvccl/kvfollowerreadsccl/followerreads_test.go, line 275 at r4 (raw file):
var tableID uint32 n1.QueryRow(t, `SELECT id from system.namespace2 WHERE name='test'`).Scan(&tableID) codec := keys.MakeSQLCodec(roachpb.SystemTenantID)
keys.SystemSQLCodec
pkg/kv/kvclient/kvcoord/dist_sender.go, line 1890 at r3 (raw file):
Previously, knz (kena) wrote…
curious: why do you terminate with an EvictionToken here?
+1
pkg/kv/kvclient/kvcoord/dist_sender_test.go, line 795 at r3 (raw file):
switch ba.Replica.StoreID { case 1: assert.Equal(t, lease1.Sequence, ba.ClientRangeInfo.LeaseSequence)
Do you want to assert the range gen here too? And below?
pkg/kv/kvclient/kvcoord/dist_sender_test.go, line 1589 at r3 (raw file):
batchReply.Add(reply) // Return an updated descriptor. batchReply.RangeInfos = []roachpb.RangeInfo{{Desc: descUpdated}}
Is it worth adding a change to the lease in this test (or another one)?
pkg/kv/kvserver/replica_follower_read.go, line 41 at r4 (raw file):
Previously, knz (kena) wrote…
I don't think you need to keep the identifier
_pErr
here. If you keep it, I suggest a different name which doesn't start with an underscore.
I'm guessing it was for debugging and can now be removed.
pkg/kv/kvserver/replica_send.go, line 131 at r3 (raw file):
// and a pErr here. Also, some errors (e.g. NotLeaseholderError) have custom // ways of returning range info. if pErr == nil {
nit: return early on err to avoid the indent. Or just don't call this on error and then remove the error arg?
pkg/kv/kvserver/replica_send.go, line 157 at r3 (raw file):
ctx context.Context, br *roachpb.BatchResponse, r *Replica, cinfo roachpb.ClientRangeInfo, ) { lease, _ := r.GetLease()
GetDescAndLease
? Wasn't this caller the whole point?
pkg/kv/kvserver/replica_send.go, line 159 at r3 (raw file):
lease, _ := r.GetLease() desc := r.Desc() needInfo := (cinfo.LeaseSequence < lease.Sequence) ||
Leave a comment about what >
means in these cases. I guess it means that we're operating off of a follower and out client has newer information than this follower. Seems like a valid situation, but it's subtle enough to warrant a comment.
pkg/kv/kvserver/replica_send.go, line 164 at r3 (raw file):
return } log.VEventf(ctx, 2, "client had stale range info; returning an update")
2 is probably too high of a verbosity level for this, don't you think? Maybe 3?
pkg/kv/kvserver/replica_test.go, line 12810 at r3 (raw file):
require.Equal(t, []roachpb.RangeInfo{{Desc: desc, Lease: lease}}, br.RangeInfos) // If ClientRangeInfo has the right descriptor generation but no lease,
Want to enumerate and test all cases here?
- desc correct, lease not provided
- desc correct, lease incorrect
- desc correct, lease correct
- desc incorrect, lease not provided
- desc incorrect, lease incorrect
- desc incorrect, lease correct
Some of these might not be possible in practice (e.g. desc incorrect, lease correct
), but it's probably worth handling them as expected anyway.
pkg/roachpb/api.proto, line 1974 at r3 (raw file):
// whether the client's info is up to date and, if it isn't, it will return a // RangeInfo with up-to-date information. This field supersedes return_range_info. ClientRangeInfo client_range_info = 17;
Make a note about what nil vs. empty means.
pkg/roachpb/api.proto, line 2002 at r3 (raw file):
Previously, knz (kena) wrote…
Can you add some comments here to explain.
And a comment on the ClientRangeInfo
type itself?
pkg/roachpb/api.proto, line 2065 at r3 (raw file):
// the DistSender. At that point, the DistSender will no longer need to // combine these - in fact it should clear the field. And then the field // doesn't need to be `repeated` any more - I think the proto encoding
I think the proto encoding allows us to change it to non-repeated.
It does.
pkg/util/syncutil/atomic.go, line 60 at r4 (raw file):
} // AtomicString gives you atomic-style APIs for string.
Strongly 👎. Let's not tempt anyone with actually using this. If something's going to call itself an atomic, it shouldn't use a mutex. They have completely different performance characteristics. For instance, atomic operations (those that can be translated to a single CPU instruction, so everything in sync/atomic
) are guaranteed to be wait-free, this structure is not. An atomic operation will never result in a syscall under heavy load, this structure might.
C++ weakens this a little by making lock-freedom (and by extension, wait-freedom) conditional based on the compiler, CPU architecture, and protected data size (see std::atomic<T>::is_lock_free
), but I don't think it would be wise for us to move in that direction.
If you want atomic access to a string, why not use an atomic.Value
?
pkg/util/tracing/test_utils.go, line 41 at r4 (raw file):
// OnlyFollowerReads looks through all the RPCs and asserts that every single // one resulted in a follower read. Returns false if no RPCs are found. func OnlyFollowerReads(rec Recording) bool {
Does this belong in this package? Can we generalize it?
Noticed while reviewing cockroachdb#51437.
51605: roachpb: strongly type RangeGeneration r=nvanbenschoten a=nvanbenschoten Noticed while reviewing #51437. Co-authored-by: Nathan VanBenschoten <[email protected]>
018f7c0
to
7d22d2e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @knz, and @nvanbenschoten)
pkg/ccl/kvccl/kvfollowerreadsccl/followerreads_test.go, line 197 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
The test looks good!
👍
pkg/ccl/kvccl/kvfollowerreadsccl/followerreads_test.go, line 235 at r4 (raw file):
Previously, knz (kena) wrote…
Are you sure this runs just once? It seems to me that every query will get here. Maybe you need a boolean to conditionally write to the channel?
well I'm testing the stmt
. The filter should run exactly twice.
pkg/ccl/kvccl/kvfollowerreadsccl/followerreads_test.go, line 253 at r4 (raw file):
Previously, knz (kena) wrote…
Are you sure you want to keep this commented code?
nope, deleted
pkg/ccl/kvccl/kvfollowerreadsccl/followerreads_test.go, line 275 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
keys.SystemSQLCodec
done
pkg/ccl/kvccl/kvfollowerreadsccl/followerreads_test.go, line 291 at r4 (raw file):
Previously, knz (kena) wrote…
nit: the contractions here don't read well. I'd spell "will" in full. (In fact, I think present tense throughout here works just as well and is even easier on the eye.)
done
pkg/kv/kvclient/kvcoord/dist_sender.go, line 1890 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
+1
no good reason; removed
pkg/kv/kvclient/kvcoord/dist_sender_test.go, line 795 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Do you want to assert the range gen here too? And below?
done
pkg/kv/kvclient/kvcoord/dist_sender_test.go, line 1589 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is it worth adding a change to the lease in this test (or another one)?
added a few subtests
pkg/kv/kvserver/replica_follower_read.go, line 41 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm guessing it was for debugging and can now be removed.
gone
pkg/kv/kvserver/replica_send.go, line 130 at r3 (raw file):
The comment is telling me that I should see some code elsewhere that peeks into a NotLeaseholderError and updates RangeInfos accordingly.
Can you remind me where that code is?
I think you're looking for this code: https://github.com/cockroachdb/cockroach/blob/master/pkg/kv/kvclient/kvcoord/dist_sender.go#L1909
The situation that interests me is when we have code that receives a pErr, looks at it, then performs another request as part of the error handling, and then discards the error.
Is the RangeInfos from the pErr forwarded/attached in the response for the error handling?
There's no forwarding from NotLeaseholderError
to the final response. In fact, NotLeaseholderError
doesn't even carry a RangeInfo
exactly; it just carries some lease info (and sometimes not even that) - although I've been meaning to transition it to RangeInfo
. I'm not sure why you think there should be forwarding.
Btw, my plan is to move everybody other than the DistSender
away from using the br.RangeInfos
, and actually have the DistSender
terminate that field.
pkg/kv/kvserver/replica_send.go, line 131 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: return early on err to avoid the indent. Or just don't call this on error and then remove the error arg?
done
pkg/kv/kvserver/replica_send.go, line 157 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
GetDescAndLease
? Wasn't this caller the whole point?
done
pkg/kv/kvserver/replica_send.go, line 159 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Leave a comment about what
>
means in these cases. I guess it means that we're operating off of a follower and out client has newer information than this follower. Seems like a valid situation, but it's subtle enough to warrant a comment.
done
pkg/kv/kvserver/replica_send.go, line 164 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
2 is probably too high of a verbosity level for this, don't you think? Maybe 3?
done
pkg/kv/kvserver/replica_test.go, line 12810 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Want to enumerate and test all cases here?
- desc correct, lease not provided
- desc correct, lease incorrect
- desc correct, lease correct
- desc incorrect, lease not provided
- desc incorrect, lease incorrect
- desc incorrect, lease correct
Some of these might not be possible in practice (e.g.
desc incorrect, lease correct
), but it's probably worth handling them as expected anyway.
done
pkg/roachpb/api.proto, line 1974 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Make a note about what nil vs. empty means.
Done.
pkg/roachpb/api.proto, line 2002 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
And a comment on the
ClientRangeInfo
type itself?
done
pkg/roachpb/api.proto, line 2065 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I think the proto encoding allows us to change it to non-repeated.
It does.
done
pkg/util/syncutil/atomic.go, line 60 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Strongly 👎. Let's not tempt anyone with actually using this. If something's going to call itself an atomic, it shouldn't use a mutex. They have completely different performance characteristics. For instance, atomic operations (those that can be translated to a single CPU instruction, so everything in
sync/atomic
) are guaranteed to be wait-free, this structure is not. An atomic operation will never result in a syscall under heavy load, this structure might.C++ weakens this a little by making lock-freedom (and by extension, wait-freedom) conditional based on the compiler, CPU architecture, and protected data size (see
std::atomic<T>::is_lock_free
), but I don't think it would be wise for us to move in that direction.If you want atomic access to a string, why not use an
atomic.Value
?
I've kept the struct but replaced the impl with atomic.Value
pkg/util/syncutil/atomic.go, line 77 at r4 (raw file):
Previously, knz (kena) wrote…
RLock() maybe?
gone
pkg/util/tracing/test_utils.go, line 41 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Does this belong in this package? Can we generalize it?
I've moved it to kv/testutils.go
.
I wouldn't generalize this function...
pkg/util/tracing/test_utils.go, line 46 at r4 (raw file):
to take care of the false negatives:
make the message a string constant in a common package dependency, and use it in both places.
done
The other side has log.Event(ctx, "serving via follower read") after all.
The log message gets some arguments in https://reviewable.io/reviews/cockroachdb/cockroach/51523
Note that the comparison is somewhat tight cause we look only inside a certain kind of span. I think this is good enough for a test_utils
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 15 of 15 files at r5, 10 of 10 files at r6.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15, @andreimatei, and @knz)
pkg/kv/kvbase/constants.go, line 15 at r6 (raw file):
// FollowerReadServingMsg is a log message that needs to be used for tests in // other packages. var FollowerReadServingMsg = "serving via follower read"
const
?
pkg/kv/kvclient/kvcoord/dist_sender_test.go, line 795 at r3 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
done
nit: why is the order different in each?
pkg/kv/kvserver/replica_send.go, line 148 at r5 (raw file):
} } else if ba.ClientRangeInfo != nil { returnRangeInfoIfClientStale(ctx, br, r, *ba.ClientRangeInfo)
Do we want to do this even if ba.ReturnRangeInfo
? I'm not sure if that makes the migration easier or harder.
pkg/kv/kvserver/replica_test.go, line 12810 at r3 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
done
Very nice.
pkg/util/syncutil/atomic.go, line 60 at r4 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I've kept the struct but replaced the impl with
atomic.Value
Thanks. I'm a little disappointed about the naming in this file (i.e. "Set" vs. "Store" and "Get" vs "Load"), but I'll save that for another time.
7d22d2e
to
2d32b81
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15, @knz, and @nvanbenschoten)
pkg/kv/kvbase/constants.go, line 15 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
const
?
done
pkg/kv/kvclient/kvcoord/dist_sender_test.go, line 795 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: why is the order different in each?
done
pkg/kv/kvserver/replica_send.go, line 148 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Do we want to do this even if
ba.ReturnRangeInfo
? I'm not sure if that makes the migration easier or harder.
Well if ba.ReturnRangeInfo
is set, then the branch above sets br.RangeInfo
. So what more could we do?
pkg/kv/kvserver/replica_test.go, line 12810 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Very nice.
ack
pkg/util/syncutil/atomic.go, line 60 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Thanks. I'm a little disappointed about the naming in this file (i.e. "Set" vs. "Store" and "Get" vs "Load"), but I'll save that for another time.
ack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15, @andreimatei, @knz, and @nvanbenschoten)
pkg/kv/kvserver/replica_send.go, line 148 at r5 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
Well if
ba.ReturnRangeInfo
is set, then the branch above setsbr.RangeInfo
. So what more could we do?
Oh right, I missed that on this pass through. Carry on.
7b469ea
to
42d577c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 15 files at r5, 1 of 11 files at r7, 14 of 14 files at r8, 9 of 9 files at r10.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15, @andreimatei, @knz, and @nvanbenschoten)
pkg/util/tracing/test_utils.go, line 46 at r4 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
to take care of the false negatives:
make the message a string constant in a common package dependency, and use it in both places.done
The other side has log.Event(ctx, "serving via follower read") after all.
The log message gets some arguments in https://reviewable.io/reviews/cockroachdb/cockroach/51523
Note that the comparison is somewhat tight cause we look only inside a certain kind of span. I think this is good enough for atest_utils
function.
thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15, @andreimatei, @knz, and @nvanbenschoten)
42d577c
to
8fae44b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15, @andreimatei, @knz, and @nvanbenschoten)
Merge conflict. |
Before this patch, the client's RangeDescriptorCache was not kept up to date very well. For example, as long as a cached descriptor had the leaseholder in it, that cache entry would not be updated even if other replicas were stale. Besides generally wanting this cache to be as up to date as possible, not keeping track of followers properly is a major problem for routing follower reads requests which would constantly be routed to non-existant replicas only to be rejected with a RangeNotFoundError and then retried until eventually making its way to the leaseholder - which leaseholder is kept up to date better. This patch adds a mechanism for the range cache to be kept up to date by having the client communicate its known descriptor generation and lease sequence in all RPCs (in BatchRequest.Header), and have the server respond with updated info if the client's view is stale. The updates only come on successful RPCs, and they reuse the existing BatchResponse.Header.RangeInfos field. Fixes cockroachdb#44053 Release note (bug fix): CRDB now tracks the location of follower replicas for all the ranges much better than previously, which means that more queries will be successfully served as "follower reads".
8fae44b
to
ee12672
Compare
Release note: None
ee12672
to
2dc6a9b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15, @andreimatei, @knz, and @nvanbenschoten)
Build succeeded: |
I `stressrace`ed it for ~4K runs and couldn't repro. ``` 4456 runs so far, 0 failures, over 28m5s ``` In all likelihood, this test was fixed by @andreimatei's changes around the range descriptor cache. In particular, cockroachdb#51437 and/or cockroachdb#51395. Release justification: Non production code change Release note: None Fixes cockroachdb#50795
53667: backupccl: fix database resolution for revision_history cluster backups r=dt a=pbardea Previously database descriptor changes were not tracked when performing cluster backups with revision history. Fixes #52392. Release justification: bug fix Release note (bug fix): Database creation/deletion was previously not correctly tracked by revision_history cluster backups. This is now fixed. 53692: sql: fix enums in `pg_catalog.pg_attrdef` r=rohany a=rohany Fixes #53687. This commit ensures that enums are displayed with their logical representation in the catalog table `pg_catalog.pg_attrdef`. Release justification: bug fix Release note: None 53777: kvserver: unskip TestStoreRangeMergeConcurrentRequests r=aayushshah15 a=aayushshah15 I `stressrace`ed it for ~4K runs and couldn't repro. ``` 4456 runs so far, 0 failures, over 28m5s ``` In all likelihood, this test was fixed by @andreimatei's changes around the range descriptor cache. In particular, #51437 and/or #51395. Fixes #50795 Release justification: Non production code change Release note: None Co-authored-by: Paul Bardea <[email protected]> Co-authored-by: Rohan Yadav <[email protected]> Co-authored-by: Aayush Shah <[email protected]>
This commit completes a series of migrations relating to RangeInfo state and the client/server protocol for keeping the range descriptor cache up to date. The migrations were started in cockroachdb#51168, cockroachdb#51378, cockroachdb#51437. Concretely, this commit removes the following cluster versions: - `ClientRangeInfosOnBatchResponse` - `RangeStatsRespHasDesc` It addresses 6 TODOs. And it makes the following changes to the KV APi: - makes `Header.ClientRangeInfo` non-nullable. - deletes `Header.ReturnRangeInfo`. - deletes `ResponseHeader.DeprecatedRangeInfos`. - stops propagating `BatchResponse.RangeInfos` past `DistSender`. - makes RangeStatsResponse.RangeInfo non-nullable.
This commit completes a series of migrations relating to RangeInfo state and the client/server protocol for keeping the range descriptor cache up to date. The migrations were started in cockroachdb#51168, cockroachdb#51378, cockroachdb#51437. Concretely, this commit removes the following cluster versions: - `ClientRangeInfosOnBatchResponse` - `RangeStatsRespHasDesc` It addresses 6 TODOs. And it makes the following changes to the KV APi: - makes `Header.ClientRangeInfo` non-nullable. - deletes `Header.ReturnRangeInfo`. - deletes `ResponseHeader.DeprecatedRangeInfos`. - stops propagating `BatchResponse.RangeInfos` past `DistSender`. - makes RangeStatsResponse.RangeInfo non-nullable.
59395: kv: complete client range info migrations r=nvanbenschoten a=nvanbenschoten This commit completes a series of migrations relating to RangeInfo state and the client/server protocol for keeping the range descriptor cache up to date. The migrations were started in #51168, #51378, #51437. Concretely, this commit removes the following cluster versions: - `ClientRangeInfosOnBatchResponse` - `RangeStatsRespHasDesc` It addresses 6 TODOs. And it makes the following changes to the KV APi: - makes `Header.ClientRangeInfo` non-nullable. - deletes `Header.ReturnRangeInfo`. - deletes `ResponseHeader.DeprecatedRangeInfos`. - stops propagating `BatchResponse.RangeInfos` past `DistSender`. - makes RangeStatsResponse.RangeInfo non-nullable. Co-authored-by: Nathan VanBenschoten <[email protected]>
Before this patch, the client's RangeDescriptorCache was not kept up to
date very well. For example, as long as a cached descriptor had the
leaseholder in it, that cache entry would not be updated even if other
replicas were stale. Besides generally wanting this cache to be as up to
date as possible, not keeping track of followers properly is a major
problem for routing follower reads requests which would constantly be
routed to non-existant replicas only to be rejected with a
RangeNotFoundError and then retried until eventually making its way to
the leaseholder - which leaseholder is kept up to date better.
This patch adds a mechanism for the range cache to be kept up to date by
having the client communicate its known descriptor generation and lease
sequence in all RPCs (in BatchRequest.Header), and have the server
respond with updated info if the client's view is stale. The updates
only come on successful RPCs, and they reuse the existing
BatchResponse.Header.RangeInfos field.
Fixes #44053
Release note (bug fix): CRDB now tracks the location of follower
replicas for all the ranges much better than previously, which means
that more queries will be successfully served as "follower reads".