-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ccl/backupccl: add new split and scatter processor that generates import spans #94805
Conversation
905db07
to
97a092f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good!
After a first pass, I'm not feeling too scared about backports here, since it seems like the new processor is mostly just moving from restore_job and the old proc and making it look like a proc.
The new covering generation in makeStreamingImportSpans looks like it has more net-new/different functionality, vs just being moved to run in a different place so that's where I think we want the most scrutiny when we think of a backport. I mentioned an idea below to allow toggling back and forth. I think I could also maybe have benefited from a little more commentary, either inline or maybe just a little prose block comment up front about the new approach, how it assembles a span, what an "interesting" point is? Or maybe I'm just being dense
pkg/ccl/backupccl/restore_job.go
Outdated
highWaterMark int64 | ||
ceiling int64 | ||
res roachpb.RowCount | ||
importSpans map[int64]roachpb.Span |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at first I got a little confused thinking this was all the import spans, and thus would oom, and seeing progress reports have their key compared to its content made me think "oh yeah, it must be holding all the import spans, that is scary."
but then I think I realized on closer reading that it is only "in-flight" import spans: we generate some by pulling from the channel until our ceiling is high enough to match any progress we've heard, then we compare that our ith span's key matches the reported complete i'th span's reported key, and then we empty out that map entry, so the map should only have entires for unreported spans below the max-reported span? is that right?
if so a comment, and maybe naming this something like inflightImportSpans or something, might make this more obvious?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's right, once all spans up to index N
are completed, then the highWaterMark will be moved up to N
and everything below N
will be removed from both maps.
I've done the rename and added a comment describing this behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
54bf2d3
to
0d0a96a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR! I've added a toggle back to makeSimpleImportSpans and added additional commentary about the new import span generation.
pkg/ccl/backupccl/restore_job.go
Outdated
highWaterMark int64 | ||
ceiling int64 | ||
res roachpb.RowCount | ||
importSpans map[int64]roachpb.Span |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's right, once all spans up to index N
are completed, then the highWaterMark will be moved up to N
and everything below N
will be removed from both maps.
I've done the rename and added a comment describing this behavior.
e104f97
to
35ebc76
Compare
bors r+ |
Build failed: |
…ort spans Previously, restore creates all of its import spans all at once and stores them in memory. OOMs caused by the size of these import spans on restore of large backups with many incremental chains has been the cause of many escalations. This patch modifies import span creation so that import spans are generated one at a time. This span generator then used in the split and scatter processor to generate the import spans that are used in the rest of restore instead of having the spans specified in the processor's spec. A future patch will add memory monitoring to the import span generation to further safeguard against OOMs in restore. This patch also changes the import span generation algorithm. The cluster setting `bulkio.restore.use_simple_import_spans` is introduced in this patch, which, if set to true, will revert the algorithm back to makeSimpleImportSpans. Release note: None
35ebc76
to
f3e0327
Compare
bors r+ |
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from f3e0327 to blathers/backport-release-22.2-94805: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
This has a warning
|
@rhu713 / @dt This PR (as far as I can tell) added ~4000+ tests to |
@healthy-pod So it isn't so much that someone wrote 4000 tests, but that we had 4000 test cases and we now test each case in two different configurations. That said, it does appear that this test got much more than twice (50-60x) slower. |
Makes sense thanks for explaining |
Previously, restore creates all of its import spans all at once and stores them
in memory. OOMs caused by the size of these import spans on restore of large
backups with many incremental chains has been the cause of many escalations.
This patch modifies import span creation so that import spans are generated one
at a time. This span generator then used in the split and scatter processor to
generate the import spans that are used in the rest of restore instead of
having the spans specified in the processor's spec. A future patch will add
memory monitoring to the import span generation to further safeguard against
OOMs in restore.
This patch also changes the import span generation algorithm. The cluster
setting
bulkio.restore.use_simple_import_spans
is introduced in this patch,which, if set to true, will revert the algorithm back to makeSimpleImportSpans.
Release note: None