-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backupccl: fix off by one index in fileSSTSink file extension #98041
Conversation
3c046b5
to
0ab3e4e
Compare
Currently, the logic that extends the last flushed file fileSSTSink does not trigger if there is only one flushed file. This failure to extend the first flushed file can result in file entries in the backup manifest with duplicate start keys. For example, if the first export response written to the sink contains partial entries of a single key `a`, then the span of the first file will be `a-a`, and the span of the subsequent file will always be `a-<end_key>`. The presence of these duplicate start keys breaks the encoding of the external manifest files list SST as the file path + start key combination in the manifest are assumed to be unique. Release note: None
bors r+ |
Build succeeded: |
@@ -190,7 +190,7 @@ func (s *fileSSTSink) write(ctx context.Context, resp exportedSpan) error { | |||
// If this span extended the last span added -- that is, picked up where it | |||
// ended and has the same time-bounds -- then we can simply extend that span | |||
// and add to its entry counts. Otherwise we need to record it separately. | |||
if l := len(s.flushedFiles) - 1; l > 0 && s.flushedFiles[l].Span.EndKey.Equal(span.Key) && | |||
if l := len(s.flushedFiles) - 1; l >= 0 && s.flushedFiles[l].Span.EndKey.Equal(span.Key) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this logic be expressed in a way that a future person won’t re-introduce an off-by-one?
- The
-1
isn’t great -- how can a reader infer why that’s there? - The
>=
isn’t great either. Most loop bounds should have a clean<
. - Maybe the several conditions on
s.flushedFiles[l]
should be inside the loop, with abreak
orcontinue
. - Some well-named local vars would increase clarity.
(Bikeshed, can this be something other than l
which looks like 1
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Postmortem analysis of the above is kind of pointless unless we can replay the stream of thoughts exactly at the time OBOE was introduced. The choice of the mnemonic to denote the last position may have played some role, but I doubt that was the reason; i.e., l
usually stands for length
; I would have used last
. More likely, the issue is the cognitive dissonance of 0-based and 1-based indexing, both of which are prevalent. E.g., when we talk about nodes in the cluster, we refer to n1
as the first; there are plenty of other examples.
@rhu713 we should backport this to all release branches? |
blathers backport 22.2 22.1 |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 6b39306 to blathers/backport-release-22.2-98041: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.2 failed. See errors above. error creating merge commit from 6b39306 to blathers/backport-release-22.1-98041: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.1 failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Currently, the logic that extends the last flushed file fileSSTSink does not trigger if there is only one flushed file. This failure to extend the first flushed file can result in file entries in the backup manifest with duplicate start keys. For example, if the first export response written to the sink contains partial entries of a single key
a
, then the span of the first file will bea-a
, and the span of the subsequent file will always bea-<end_key>
. The presence of these duplicate start keys breaks the encoding of the external manifest files list SST as the file path + start key combination in the manifest are assumed to be unique.Fixes #97953
Release note: None