backupccl: optimize inter-table spans for backup and ts protection #66010
base: master
Conversation
Force-pushed from 10973b7 to b500a04.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @annezhu98 and @pbardea)
pkg/ccl/backupccl/backup_planning.go, line 82 at r1 (raw file):
}
type tableAndSpan struct {
nit: rename to tableAndMergedSpans, and then rename the field to mergedSpans?
pkg/ccl/backupccl/backup_planning.go, line 181 at r1 (raw file):
// - Two non-contiguous index spans are merged if a scan request for the index
//   IDs between them does not return any results.
// getLogicallyMergedTableSpans also returns the first and last index of the given table
nit: I don't think this is the case?
pkg/ccl/backupccl/backup_planning.go, line 296 at r1 (raw file):
// it does not make any changes to the output of getLogicallyMergedTableSpans except possibly the
// first and last spans of each table
func getLogicallyMergedInterTableSpans(
nit: maybe rename to getLogicallyMergedSpansAcrossTables
pkg/ccl/backupccl/backup_planning.go, line 308 at r1 (raw file):
}
var mergedInterTableSpans []roachpb.Span
I think this method can be simplified. We want to perform 3 checks:
1. `CheckForTablesInBounds(startTableID, endTableID)`: If we find any table descriptor between the LHS and RHS table ID we are considering for a merge, then we don't merge. This table descriptor would either be of an offline table, a public table with no public indexes (not sure if this is possible), or a table not included in the backup.
2. `CheckForKVInBounds(lhs, rhs)`: This will scan the key range, and find any dropped but not gc'ed indexes. If this method returns true, abort merge. Here LHS is the EndKey of the last span in the table's spans, and RHS is the StartKey of the first span in the next table's spans.
3. Check that the maxAddingIndexID for lhs (as defined above) is < lhs.indexID && minAddingIndexID for rhs > rhs.indexID.

Something like this for lhs, and similar for rhs but with the condition `if idx.Adding() && idx.GetID() < lhs.indexID`:
canMerge := true
allPhysicalIndexOpts := catalog.IndexOpts{AddMutations: true}
if err := catalog.ForEachIndex(table, allPhysicalIndexOpts, func(idx catalog.Index) error {
	if idx.GetID() < lhs.indexID {
		// Skip: we don't care about these indexes.
		return nil
	}
	if idx.Adding() && idx.GetID() > lhs.indexID {
		canMerge = false
	}
	return nil
})
So then this method would check if any of these methods return true. If they did, we cannot merge the two table IDs in consideration. Take all the spans of the LHS table ID as is, and move on to consider the next two table IDs.
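The control flow described above can be sketched in isolation. This is a simplified, hypothetical stand-in (the `Span` type, the check signatures, and the key strings are illustrative, not CockroachDB's actual APIs; only two of the three checks are shown, injected as functions so they can be swapped out):

```go
package main

import "fmt"

// Span is a simplified stand-in for roachpb.Span.
type Span struct{ Key, EndKey string }

// mergeAcrossTables walks adjacent table IDs in sorted order. If any
// injected check reports a conflict, the LHS table's spans are emitted
// unchanged; otherwise the first RHS span is widened to also cover the
// last LHS span, performing the merge.
func mergeAcrossTables(
	tableIDs []int,
	spans map[int][]Span,
	tablesInBounds func(lhsID, rhsID int) bool, // stand-in for CheckForTablesInBounds
	kvInBounds func(start, end string) bool, // stand-in for CheckForKVInBounds
) []Span {
	var out []Span
	for i := 0; i < len(tableIDs)-1; i++ {
		lhs, rhs := spans[tableIDs[i]], spans[tableIDs[i+1]]
		last, first := lhs[len(lhs)-1], rhs[0]
		if tablesInBounds(tableIDs[i], tableIDs[i+1]) || kvInBounds(last.EndKey, first.Key) {
			// Cannot merge: take all LHS spans as-is and move on.
			out = append(out, lhs...)
			continue
		}
		// Merge: emit everything but the last LHS span, then widen the
		// first RHS span to start where the last LHS span starts.
		out = append(out, lhs[:len(lhs)-1]...)
		spans[tableIDs[i+1]][0].Key = last.Key
	}
	// The final table's (possibly widened) spans are emitted once, outside the loop.
	return append(out, spans[tableIDs[len(tableIDs)-1]]...)
}

func main() {
	spans := map[int][]Span{
		1: {{"/T1/1", "/T1/3"}},
		2: {{"/T2/1", "/T2/2"}},
	}
	merged := mergeAcrossTables([]int{1, 2}, spans,
		func(int, int) bool { return false },
		func(string, string) bool { return false })
	fmt.Println(merged)
}
```

With both checks returning false, the two single-span tables collapse into one span covering both.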
pkg/ccl/backupccl/backup_planning.go, line 412 at r1 (raw file):
// getIndexesInBounds returns all adding indexes and public indexes found in the given table
getIndexesInBounds := func(table catalog.TableDescriptor) (
let's not inline this method. The only reason we inlined `checkForKVInbounds` was so that we could inject our own implementation in the unit test. I don't think we need to do that here, so let's move this to a method. Also, I would change the name to `getPublicAndAddingIndexes` cause we don't really have any "bounds".
pkg/ccl/backupccl/backup_planning.go, line 439 at r1 (raw file):
}
// checkForNonMVCCInvariants scans all tables between startTableID and endTableID and returns true iff for any table ID
for both these inline methods, see the above comment about how we can restructure getLogicallyMergedInterTableSpans. I'm okay injecting some of the helper methods described there so that we can unit test that method in isolation, but I think the logic can be simplified.
I'm also okay if you want to pull each of these out into helper methods while we iterate on the code, and once we are happy with them, we can make them inline.
pkg/ccl/backupccl/backup_planning.go, line 528 at r1 (raw file):
if _, ok := tableSpans[tableID]; !ok {
	tableIDs = append(tableIDs, tableID)
}
Do we expect to see the same tableID twice? I believe all tables in `tables` are unique descriptors? I'd understand this check for revs below cause you can have multiple revisions of a table since the last backup, but I'm not sure I see the need for this check here?
pkg/ccl/backupccl/backup_test.go, line 6305 at r1 (raw file):
tableID2 := getTableID(db, "test", "foo2")
require.Equal(t, []string{fmt.Sprintf("/Table/{%d/1-%d/4}", tableID, tableID2)}, actualResolvedSpans)
i'm going to hold off looking at tests until we are happy with the code. We can then discuss how we want to remodel some of the unit tests!
Nice work on this first stab @annezhu98! I've done the first pass, but I want to do a few more once we chat and restructure the code a little.
Force-pushed from 391c724 to 67eb4f5.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @annezhu98, and @pbardea)
pkg/ccl/backupccl/backup_planning.go, line 308 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
I think this method can be simplified. We want to perform 3 checks:

1. `CheckForTablesInBounds(startTableID, endTableID)`: If we find any table descriptor between the LHS and RHS table ID we are considering for a merge, then we don't merge. This table descriptor would either be of an offline table, a public table with no public indexes (not sure if this is possible), or a table not included in the backup.
2. `CheckForKVInBounds(lhs, rhs)`: This will scan the key range, and find any dropped but not gc'ed indexes. If this method returns true, abort merge. Here LHS is the EndKey of the last span in the table's spans, and RHS is the StartKey of the first span in the next table's spans.
3. Check that the maxAddingIndexID for lhs (as defined above) is < lhs.indexID && minAddingIndexID for rhs > rhs.indexID.

Something like this for lhs, and similar for rhs but with the condition `if idx.Adding() && idx.GetID() < lhs.indexID`:

canMerge := true
allPhysicalIndexOpts := catalog.IndexOpts{AddMutations: true}
if err := catalog.ForEachIndex(table, allPhysicalIndexOpts, func(idx catalog.Index) error {
	if idx.GetID() < lhs.indexID {
		// Skip: we don't care about these indexes.
		return nil
	}
	if idx.Adding() && idx.GetID() > lhs.indexID {
		canMerge = false
	}
	return nil
})

So then this method would check if any of these methods return true. If they did, we cannot merge the two table IDs in consideration. Take all the spans of the LHS table ID as is, and move on to consider the next two table IDs.
Done.
pkg/ccl/backupccl/backup_planning.go, line 412 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
let's not inline this method. The only reason we inlined `checkForKVInbounds` was so that we could inject our own implementation in the unit test. I don't think we need to do that here, so let's move this to a method. Also, I would change the name to `getPublicAndAddingIndexes` cause we don't really have any "bounds".
Done.
pkg/ccl/backupccl/backup_planning.go, line 439 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
for both these inline methods, see the above comment about how we can restructure getLogicallyMergedInterTableSpans. I'm okay injecting some of the helper methods described there so that we can unit test that method in isolation, but I think the logic can be simplified.
I'm also okay if you want to pull each of these out into helper methods while we iterate on the code, and once we are happy with them, we can make them inline.
Kept `checkForKVInbounds` and `checkForTablesInBounds` inline as we need to override them for unit tests.
pkg/ccl/backupccl/backup_planning.go, line 528 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
if _, ok := tableSpans[tableID]; !ok {
	tableIDs = append(tableIDs, tableID)
}

Do we expect to see the same tableID twice? I believe all tables in `tables` are unique descriptors? I'd understand this check for revs below cause you can have multiple revisions of a table since the last backup, but I'm not sure I see the need for this check here?
Done.
Done another pass, it's getting there! Thanks for sticking with this 🙂
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @annezhu98, and @pbardea)
pkg/ccl/backupccl/backup_planning.go, line 412 at r1 (raw file):
Previously, annezhu98 (Anne Zhu) wrote…
Done.
hmm, how do you feel about putting this back inside `getLogicallyMergedSpans` (as it was prior to this PR) just to reduce the diff. I think earlier, we expected to use this method in our new logic, but I don't think we do?
pkg/ccl/backupccl/backup_planning.go, line 189 at r2 (raw file):
codec keys.SQLCodec, endTime hlc.Timestamp, added map[tableAndIndex]bool,
can we revert this change, doesn't seem needed anymore.
pkg/ccl/backupccl/backup_planning.go, line 314 at r2 (raw file):
rhsTableID := tableIDs[curIndex+1]
lhs := spans[lhsTableID]
nit: do we need lhs, rhs? Can we just store lhsSpan, and rhsSpan and rename them to:
lastSpanInLHSMergedSpans := lhs.mergedSpans[len(lhs.mergedSpans)-1]
firstSpanInRHSMergedSpans := rhs.mergedSpans[0]
pkg/ccl/backupccl/backup_planning.go, line 321 at r2 (raw file):
// LHS and RHS are safe to merge iff the following MVCC and Non-MVCC invariants hold:
// 1) MVCC: no dropping/adding indexes were found from the EndKey of the last span
//    on the lhs table to the StartKey of the first span on the rhs table
// 2) Non-MVCC: we can safely merge lhs and rhs if they're contiguous or there exists
//    dropped and gc'ed tables between them. We should not merge if we are able
//    to fetch a table descriptor for any given table ID that is between lhs and rhs.
let's delete this comment and add a comment above each if check indicating what it is checking for?
- Check for any tables between the table of the LHS and RHS span being considered for a merge. If found, we do not want to merge as the backup did not resolve the table as a target. Currently, this could either be because the table was not specified in the targets, or it was offline.
- Check for any keys between the EndKey of the LHS span and the StartKey of the RHS span. If found, we do not want to merge as the backup will include unexpected data. These could be keys corresponding to tables that have been dropped, but not gc'ed.
- Check for any adding indexes in the tables of the LHS and RHS that might be unexpectedly picked up
pkg/ccl/backupccl/backup_planning.go, line 327 at r2 (raw file):
// dropped and gc'ed tables between them. We should not merge if we are able
// to fetch a table descriptor for any given table ID that is between lhs and rhs.
for i := 0; i < 3 && canMerge; i++ {
any reason we can't restructure this as:
if foundTable, err := checkForTablesInBounds(lhsTableID, rhsTableID); foundTable || err != nil {
	mergedInterTableSpans = append(mergedInterTableSpans, lhs.mergedSpans...)
	continue
}
if foundKVInBounds, err := checkForKVInBounds(lhsSpan.EndKey, rhsSpan.Key, endTime); foundKVInBounds || err != nil {
	mergedInterTableSpans = append(mergedInterTableSpans, lhs.mergedSpans...)
	continue
}
if foundAddingIndexes, err := checkForAddingIndexes(...); foundAddingIndexes || err != nil {
	mergedInterTableSpans = append(mergedInterTableSpans, lhs.mergedSpans...)
	continue
}
I personally find it easier to read at a glance? Less bool bookkeeping too 🙂
pkg/ccl/backupccl/backup_planning.go, line 346 at r2 (raw file):
if canMerge {
	mergedInterTableSpans = append(mergedInterTableSpans, lhs.mergedSpans[:len(lhs.mergedSpans)-1]...)

let's add a comment about what we're doing here. Something that captures that:
If a merge is possible, we add everything but the last span on the LHS, unchanged, to the result slice. We then modify the first span on the RHS to include the last span on the LHS, thereby performing a merge. This merged span will be added to the result slice in future iterations.
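The slice manipulation being described here can be illustrated with a minimal stand-alone sketch (the `Span` type and key strings are simplified stand-ins, not roachpb's actual types):

```go
package main

import "fmt"

// Span is a simplified stand-in for roachpb.Span.
type Span struct{ Key, EndKey string }

// mergeStep appends everything but the last LHS span, unchanged, to out,
// then widens the first RHS span so it also covers the last LHS span.
// The widened RHS spans get emitted on a later iteration.
func mergeStep(out, lhs, rhs []Span) (newOut, newRHS []Span) {
	out = append(out, lhs[:len(lhs)-1]...)
	rhs[0].Key = lhs[len(lhs)-1].Key
	return out, rhs
}

func main() {
	lhs := []Span{{"/T1/1", "/T1/2"}, {"/T1/3", "/T1/5"}}
	rhs := []Span{{"/T2/1", "/T2/4"}}
	out, rhs := mergeStep(nil, lhs, rhs)
	fmt.Println(out, rhs)
}
```

After the step, the first span in the original LHS is carried over untouched, while the RHS head now starts at the last LHS span's start key.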
pkg/ccl/backupccl/backup_planning.go, line 352 at r2 (raw file):
if curIndex == len(tableIDs)-2 {
	mergedInterTableSpans = append(mergedInterTableSpans, rhs.mergedSpans...)
}

might need to pull this out of the loop if we go with the `continue` approach above?
pkg/ccl/backupccl/backup_planning.go, line 487 at r2 (raw file):
// checkForTablesInBounds scans all tables between startTableID and endTableID and returns true iff
// a non-IsUndefinedRelationError was returned for a table ID
nit: returns true if it successfully resolves a table in (startTableID, endTableID).
Also, should the check start at `startTableID+1`?
pkg/ccl/backupccl/backup_planning.go, line 490 at r2 (raw file):
checkForTablesInBounds := func(startTableID, endTableID descpb.ID) (bool, error) {
	for tableID := startTableID; tableID < endTableID; tableID++ {
		_, err := p.LookupTableByID(ctx, tableID)
I think this might need a little bit of restructuring to be safer.
_, err := p.LookupTableByID(ctx, tableID)
// We found a table, return true.
if err == nil {
	return true, nil
}
// We got an error that was not a "table not found" error. Return that error.
if !sqlerrors.IsUndefinedRelationError(err) {
	return true, err
}
// No table was found, safe to continue checking.
pkg/ccl/backupccl/backup_planning.go, line 507 at r2 (raw file):
execCfg.Codec, endTime, added,
nit: can we revert this change to the original, will reduce the scope of this change.
pkg/ccl/backupccl/backup_planning.go, line 536 at r2 (raw file):
)
if err != nil {
	return nil, err
just a thought, since this is strictly an optimization, maybe we shouldn't fail the whole backup if this method returns an error and just continue with `mergedIndexSpans`. Curious what @pbardea thinks.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @annezhu98, and @pbardea)
pkg/ccl/backupccl/backup_planning.go, line 412 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
hmm, how do you feel about putting this back inside `getLogicallyMergedSpans` (as it was prior to this PR) just to reduce the diff. I think earlier, we expected to use this method in our new logic, but I don't think we do?
Done.
pkg/ccl/backupccl/backup_planning.go, line 189 at r2 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
can we revert this change, doesn't seem needed anymore.
Done.
pkg/ccl/backupccl/backup_planning.go, line 314 at r2 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
nit: do we need lhs, rhs? Can we just store lhsSpan, and rhsSpan and rename them to:
lastSpanInLHSMergedSpans := lhs.mergedSpans[len(lhs.mergedSpans)-1]
firstSpanInRHSMergedSpans := rhs.mergedSpans[0]
Done. I kept lhs and rhs since they are being referenced a few times in the code.
pkg/ccl/backupccl/backup_planning.go, line 321 at r2 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
// LHS and RHS are safe to merge iff the following MVCC and Non-MVCC invariants hold:
// 1) MVCC: no dropping/adding indexes were found from the EndKey of the last span
//    on the lhs table to the StartKey of the first span on the rhs table
// 2) Non-MVCC: we can safely merge lhs and rhs if they're contiguous or there exists
//    dropped and gc'ed tables between them. We should not merge if we are able
//    to fetch a table descriptor for any given table ID that is between lhs and rhs.

let's delete this comment and add a comment above each if check indicating what it is checking for?
- Check for any tables between the table of the LHS and RHS span being considered for a merge. If found, we do not want to merge as the backup did not resolve the table as a target. Currently, this could either be because the table was not specified in the targets, or it was offline.
- Check for any keys between the EndKey of the LHS span and the StartKey of the RHS span. If found, we do not want to merge as the backup will include unexpected data. These could be keys corresponding to tables that have been dropped, but not gc'ed.
- Check for any adding indexes in the tables of the LHS and RHS that might be unexpectedly picked up
Done.
pkg/ccl/backupccl/backup_planning.go, line 327 at r2 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
any reason we can't restructure this as:

if foundTable, err := checkForTablesInBounds(lhsTableID, rhsTableID); foundTable || err != nil {
	mergedInterTableSpans = append(mergedInterTableSpans, lhs.mergedSpans...)
	continue
}
if foundKVInBounds, err := checkForKVInBounds(lhsSpan.EndKey, rhsSpan.Key, endTime); foundKVInBounds || err != nil {
	mergedInterTableSpans = append(mergedInterTableSpans, lhs.mergedSpans...)
	continue
}
if foundAddingIndexes, err := checkForAddingIndexes(...); foundAddingIndexes || err != nil {
	mergedInterTableSpans = append(mergedInterTableSpans, lhs.mergedSpans...)
	continue
}

I personally find it easier to read at a glance? Less bool bookkeeping too 🙂
Done.
pkg/ccl/backupccl/backup_planning.go, line 346 at r2 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
let's add a comment about what we're doing here. Something that captures that:
If a merge is possible, we add everything but the last span on the LHS, unchanged, to the result slice. We then modify the first span on the RHS to include the last span on the LHS, thereby performing a merge. This merged span will be added to the result slice in future iterations.
Done.
pkg/ccl/backupccl/backup_planning.go, line 352 at r2 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
if curIndex == len(tableIDs)-2 {
	mergedInterTableSpans = append(mergedInterTableSpans, rhs.mergedSpans...)
}

might need to pull this out of the loop if we go with the `continue` approach above?
Done.
pkg/ccl/backupccl/backup_planning.go, line 490 at r2 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
I think this might need a little bit of restructuring to be safer.
_, err := p.LookupTableByID(ctx, tableID)
// We found a table, return true.
if err == nil {
	return true, nil
}
// We got an error that was not a "table not found" error. Return that error.
if !sqlerrors.IsUndefinedRelationError(err) {
	return true, err
}
// No table was found, safe to continue checking.
Done.
pkg/ccl/backupccl/backup_planning.go, line 507 at r2 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
nit: can we revert this change to the original, will reduce the scope of this change.
Done.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @annezhu98, and @pbardea)
pkg/ccl/backupccl/backup_planning.go, line 536 at r2 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
just a thought, since this is strictly an optimization, maybe we shouldn't fail the whole backup if this method returns an error and just continue with `mergedIndexSpans`. Curious what @pbardea thinks.
It seems like the only place where an error would be returned is when doing the 3 checks. I changed it up a bit so that whenever an error is returned, we just continue with `mergedIndexSpans` without merging.
Force-pushed from dba47a0 to f4bd801.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @annezhu98, and @pbardea)
pkg/ccl/backupccl/backup_planning.go, line 536 at r2 (raw file):
Previously, annezhu98 (Anne Zhu) wrote…
It seems like the only place where an error would be returned is when doing the 3 checks, I changed it up a bit so that whenever an error is returned, we just continue with
mergedIndexSpans
without merging.
let's modify this call site to account for an error here and log it before ignoring it.
pkg/ccl/backupccl/backup_planning.go, line 332 at r3 (raw file):
firstSpanInRHSMergedSpans := rhs.mergedSpans[0]
// Check for any tables between the table of the LHS and RHS span being considered for a merge.
nit: we usually wrap comments at 80 characters. There's a way to configure this in goland (we can talk offline).
pkg/ccl/backupccl/backup_planning.go, line 335 at r3 (raw file):
// If found, we do not want to merge as the backup did not resolve the table as a target.
// Currently, this could either be because the table was not specified in the targets, or it was offline.
if foundTable, err := checkForTablesInBounds(tableIDs[curIndex], tableIDs[curIndex+1]); foundTable || err != nil {
here, and in the checks below, I'm not sure we should be swallowing the error. If we do want to fall back to using the spans returned by getLogicallyMergedSpans, that should be done outside of this method. I do think if err != nil, we should return the error and allow the user of this method to deal with it.
pkg/ccl/backupccl/backup_planning.go, line 340 at r3 (raw file):
}
// Check for any keys between the EndKey between the EndKey of the LHS span, and StartKey of the RHS span.
nit: repetition
pkg/ccl/backupccl/backup_planning.go, line 348 at r3 (raw file):
}
// Check for any adding indexes in the tables of the LHS and RHS that might be unexpectedly picked up
nit: modify comment to: Check for adding indexes as described in the comment above `checkForAddingIndexes`. The presence of such an index makes the LHS and RHS unmergeable as the BACKUP could unexpectedly pick up non-public index entries.
pkg/ccl/backupccl/backup_planning.go, line 364 at r3 (raw file):
}
// checkForAddingIndexes returns true iff adding indexes were found after the end key of the last span on lhsTable
nit: wrap comment
pkg/ccl/backupccl/backup_planning.go, line 465 at r3 (raw file):
// checkForTablesInBounds returns true iff it successfully resolves a table
// between startTableID and endTableID (both exclusive)
nit: full stop after comments (here and everywhere 🙂)
pkg/ccl/backupccl/backup_planning.go, line 393 at r4 (raw file):
key := tableAndIndex{tableID: lhs.table.GetID(), indexID: idx.GetID()}
isAdding := added[key] || idx.Adding()
if isAdding && idx.GetID() >= lhsIndexID {
should this be idx.GetID() > lhsIndexID?
pkg/ccl/backupccl/backup_planning.go, line 395 at r4 (raw file):
if isAdding && idx.GetID() >= lhsIndexID {
	foundAddingIndex = true
	added[key] = true
i'm not sure I understand the need for added in this method?
pkg/ccl/backupccl/backup_planning.go, line 432 at r4 (raw file):
lhsIndex, err := lhs.table.FindIndexWithID(lhsIndexID - 1)
if err != nil {
	return false, err
}
rhsIndex, err := rhs.table.FindIndexWithID(rhsIndexID)
if err != nil {
	return false, err
}
return lhsIndex.IsInterleaved() || rhsIndex.IsInterleaved(), nil
I believe that by the time this code is released, we will not have interleaved tables so we can delete this logic.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @annezhu98, and @pbardea)
pkg/ccl/backupccl/backup_test.go, line 6494 at r4 (raw file):
unusedMap := make(map[tableAndIndex]bool)
checkForTablesInBoundsOverride := func(startTableID, endTableID descpb.ID) (bool, error) {
let's move this inline for every test case.
pkg/ccl/backupccl/backup_test.go, line 6502 at r4 (raw file):
testCases := []struct {
	name string

nit: instead of having a xxx1 and xxx2 for each field, can we define a struct `tableDef` with id, pkIndex, indexes, adding, and dropping as fields? Then each test case can have a `tables []tableDef`.
We can then have test cases that test merging more than just 2 tables.
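One possible shape for the suggested `tableDef`-driven cases (the field names, IDs, and expected span strings here are illustrative sketches, not the actual test fixtures):

```go
package main

import "fmt"

// tableDef is a sketch of the suggested per-table fixture: one entry per
// table instead of parallel xxx1/xxx2 fields.
type tableDef struct {
	id       int
	pkIndex  int
	indexes  []int // public index IDs
	adding   []int // index IDs in the adding state
	dropping []int // index IDs in the dropping state
}

type testCase struct {
	name          string
	tables        []tableDef
	expectedSpans []string
}

func main() {
	testCases := []testCase{
		{
			// Merging more than two tables becomes a natural case to write.
			name: "merge-three-contiguous-tables",
			tables: []tableDef{
				{id: 64, pkIndex: 1, indexes: []int{1, 2}},
				{id: 65, pkIndex: 1, indexes: []int{1}},
				{id: 66, pkIndex: 1, indexes: []int{1, 2, 3}},
			},
			expectedSpans: []string{"/Table/6{4/1-6/4}"},
		},
	}
	for _, tc := range testCases {
		fmt.Printf("%s: %d tables\n", tc.name, len(tc.tables))
	}
}
```

Each case then declares any number of tables in one slice, which also makes the interleaved mergeable/non-mergeable scenarios below easy to express.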
pkg/ccl/backupccl/backup_test.go, line 6530 at r4 (raw file):
},
{
	name: "dropped-index-before-start-key-on-rhs-span",
fix based on offline discussion. Potentially, index 4 on table 65 should return true on a scan, so that we do not merge the two table spans.
pkg/ccl/backupccl/backup_test.go, line 6544 at r4 (raw file):
},
{
	name: "adding-index-between-contiguous-tables",
rename test to "adding-index-after-lhs", and also add a testcase for "adding-index-before-rhs"
pkg/ccl/backupccl/backup_test.go, line 6559 at r4 (raw file):
},
{
	name: "offline-table-between-spans",
inline the `checkForTableInBounds` method to return true for tableID 70.
pkg/ccl/backupccl/backup_test.go, line 6586 at r4 (raw file):
	expectedSpans: []string{"/Table/7{2/1-4/5}"},
},
}
some more test ideas:
- merging multiple contiguous tables
- interleaving mergeable and non-mergeable spans, e.g.: table1, offline table2, table3, dropped and GC'ed table4, table5
This should not merge 1 and 3, but should merge 3 to 5.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @pbardea)
pkg/ccl/backupccl/backup_planning.go, line 536 at r2 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
let's modify this call site to account for an error here and log it before ignoring it.
Done.
pkg/ccl/backupccl/backup_planning.go, line 335 at r3 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
here, and in the checks below, im not sure we should be swallowing the error. If we do want to fallback to using the spans returned by getLogicallyMergedSpans, that should be done outside of this method. I do think if err != nil, we should return the error and allow the user of this method to deal with it.
Done.
pkg/ccl/backupccl/backup_planning.go, line 393 at r4 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
should this be idx.GetID() > lhsIndexID?
`lhsIndexID` is the index ID of a span's `EndKey`, which is exclusive, hence why `lhsIndexID` should be checked as well.
pkg/ccl/backupccl/backup_planning.go, line 395 at r4 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
i'm not sure I understand the need for added in this method?
We need this here to keep track of adding indexes found in `getLogicallyMergedTableSpans`.
pkg/ccl/backupccl/backup_planning.go, line 432 at r4 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
lhsIndex, err := lhs.table.FindIndexWithID(lhsIndexID - 1)
if err != nil {
	return false, err
}
rhsIndex, err := rhs.table.FindIndexWithID(rhsIndexID)
if err != nil {
	return false, err
}
return lhsIndex.IsInterleaved() || rhsIndex.IsInterleaved(), nil
I believe that by the time this code is released, we will not have interleaved tables so we can delete this logic.
Done.
pkg/ccl/backupccl/backup_test.go, line 6305 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
i'm going to hold off looking at tests until we are happy with the code. We can then discuss how we want to remodel some of the unit tests!
Done.
pkg/ccl/backupccl/backup_test.go, line 6494 at r4 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
let's move this inline for every test case.
Done.
pkg/ccl/backupccl/backup_test.go, line 6530 at r4 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
fix based on offline discussion. Potentially, index 4 on table 65 should return true on a scan, so that we do not merge the two table spans.
Done.
pkg/ccl/backupccl/backup_test.go, line 6544 at r4 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
rename test to "adding-index-after-lhs", and also add a testcase for "adding-index-before-rhs"
Done.
pkg/ccl/backupccl/backup_test.go, line 6559 at r4 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
inline the `checkForTableInBounds` method to return true for tableID 70.
Done.
pkg/ccl/backupccl/backup_test.go, line 6586 at r4 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
some more test ideas:
- merging multiple contiguous tables
- interleaving mergeable and non-mergeable spans, e.g.: table1, offline table2, table3, dropped and GC'ed table4, table5
This should not merge 1 and 3, but should merge 3 to 5.
Done.
After debugging around with some of the span-merging logic, I realized that I may have missed some considerations for revision_history backups. Curious about what your thoughts are here @adityamaru and @annezhu98. I think the table existence and KV checks would still be valuable (since we don't want to protect huge swaths of unrelated tables when doing database backups), but separating the set of spans we protect and backup could give us more leeway on the more fragile checks (such as those that look for Adding indexes)
We could also have a sanity check that the protectedTS spans are a superset of those that we're backing up.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @annezhu98)
pkg/ccl/backupccl/backup_planning.go, line 314 at r5 (raw file):
// each table
func getLogicallyMergedSpansAcrossTables(
	tableIDs []descpb.ID,
Since `tableIDs` should always be the keys of the `spans`, it would probably be safer to create `tableIDs` inside this method to ensure that this assumption is always the case (rather than needing to check it at all callsites). If the concern is the cost of re-sorting the list of tableIDs I think it would be useful to at least add a comment about the relationship between the 2 args.
pkg/ccl/backupccl/backup_planning.go, line 354 at r5 (raw file):
// include unexpected data. These could be keys corresponding to tables
// that have been dropped, but not gc'ed.
if foundKVInBounds, err := checkForKVInBounds(lastSpanInLHSMergedSpans.EndKey, firstSpanInRHSMergedSpans.Key, endTime); foundKVInBounds || err != nil {
might be clearer to read as:
if foundKVInBounds, err := ...; err != nil {
return nil, err
} else if foundKVInBounds {
...
}
here (and above and below)
pkg/ccl/backupccl/backup_planning.go, line 389 at r5 (raw file):
// end key of the last span on lhsTable or before the start key of the first
// span of the rhsTable
func checkForAddingIndexes(
I worry about interactions between revision_history backups and this method. This method will always resolve the current version of the descriptor, which might not be what we want when considering descriptor revisions. Debugging logic around merging indexes on revision_history backups this week started to make me wonder if we can loosen the checks that we perform here and maintain a broader set of spans whose timestamps we'd protect in parallel to the set of spans we'd backup...
Sorry for not catching this earlier, but debugging this week has made me more aware of the sensitivity of this merging...
pkg/ccl/backupccl/backup_test.go, line 6499 at r5 (raw file):
func TestLogicallyMergedSpansAcrossTables(t *testing.T) {
	defer leaktest.AfterTest(t)()
	codec := keys.TODOSQLCodec
This could probably use keys.SystemSQLCodec
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @annezhu98, and @pbardea)
pkg/ccl/backupccl/backup_planning.go, line 314 at r5 (raw file):
Previously, pbardea (Paul Bardea) wrote…
Since `tableIDs` should always be the keys of the `spans`, it would probably be safer to create `tableIDs` inside this method to ensure that this assumption is always the case (rather than needing to check it at all callsites). If the concern is the cost of re-sorting the list of tableIDs I think it would be useful to at least add a comment about the relationship between the 2 args.
Left a comment with the reasoning for the 2 args.
pkg/ccl/backupccl/backup_planning.go, line 354 at r5 (raw file):
Previously, pbardea (Paul Bardea) wrote…
might be clearer to read as:
if foundKVInBounds, err := ...; err != nil {
	return nil, err
} else if foundKVInBounds {
	...
}
here (and above and below)
Done.
pkg/ccl/backupccl/backup_test.go, line 6499 at r5 (raw file):
Previously, pbardea (Paul Bardea) wrote…
This could probably use
keys.SystemSQLCodec
Done.
backupccl: optimize inter-table spans for backup and ts protection

Previously, within each table, contiguous index spans are merged if there are no dropped or adding indexes between them. However, when a BACKUP involves a large number of tables, the protected ts cluster limit is still being hit very frequently.

To address this issue, this commit would logically merge spans between tables to reduce the number of spans generated, further optimizing from previous work cockroachdb#55794.

To determine if two tables can be merged we need to check if the following MVCC and Non-MVCC invariants are met:

* No dropping or adding indexes between LHS and RHS
* LHS and RHS are either contiguous tables, or non-contiguous and all tables between them have been dropped and gc'ed

Resolves: cockroachdb#58134
Informs: cockroachdb#54747

Release note: None
Sorry -- this fell off my review queue. I think I've convinced myself that this is actually okay in the face of descriptor revisions, but adding some test cases that test descriptors with multiple versions of the same ID would be useful here.