
storageccl,backupccl: paginate ExportRequest to control response size #57641

Merged (1 commit) on Jan 7, 2021

Conversation

@adityamaru (Contributor)

This change adds pagination support to ExportRequests that return the
produced SSTs as part of their response. This is necessary to prevent
OOMs.

The motivation for this change was the work being done to support
tenant BACKUPs. Since all export requests will be returning the SST
from KV to the processor, instead of writing directly to
ExternalStorage, we will incur additional copying/buffering of the
produced SSTs. It is important to have a knob to control these
response sizes. The change piggybacks on the existing TargetBytes
functionality baked into the DistSender.

Fixes: #57227

Release note: None
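As a rough, self-contained sketch of the TargetBytes/ResumeSpan loop described above (the types and names here are illustrative stand-ins, not the real pkg/roachpb API):

```go
package main

import "fmt"

// Hypothetical stand-ins for the roachpb response fields involved in
// TargetBytes pagination; names are illustrative, not the real API.
type exportResponse struct {
	sstBytes  int // total SST bytes carried by this response
	resumeKey int // -1 once the span is fully exported
}

// export simulates a byte-limited ExportRequest over keys [start, end):
// SSTs of up to sstSize bytes are accumulated until the targetBytes
// budget is met or exceeded, then a resume key is handed back.
func export(start, end, sstSize, targetBytes int) exportResponse {
	sent, cur := 0, start
	for cur < end {
		n := sstSize
		if end-cur < n {
			n = end - cur
		}
		sent += n
		cur += n
		if sent >= targetBytes { // budget exhausted *after* this SST
			break
		}
	}
	if cur >= end {
		return exportResponse{sent, -1}
	}
	return exportResponse{sent, cur}
}

func main() {
	// Export a 1000-byte span with a 300-byte response budget and
	// 128-byte SSTs; the client resumes until the span is drained.
	start, end, pages := 0, 1000, 0
	for {
		resp := export(start, end, 128, 300)
		pages++
		if resp.resumeKey < 0 {
			break
		}
		start = resp.resumeKey
	}
	fmt.Println("pages:", pages) // prints "pages: 3"
}
```

Note that each page can overshoot the budget by up to one SST, which mirrors how TargetBytes limits are checked after a chunk is produced rather than before.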

@adityamaru adityamaru requested review from dt, pbardea, ajwerner and a team December 7, 2020 16:46

@adityamaru (Contributor, Author)

Will update the ReturnSST bool to use the flag once #57317 is merged. Will also add a tenant backup test exercising the processor pagination once I switch to that flag.

@@ -69,6 +69,9 @@ message ResponseHeader {
// The spanning operation didn't finish because the key limit was
// exceeded.
RESUME_KEY_LIMIT = 1;
// The spanning operation didn't finish because the target byte limit was
// exceeded.
RESUME_BYTE_LIMIT = 2;
dt (Member)

I know we have byte-size limited scans already so I wonder if there's a reason they didn't introduce this already, and/or if it will cause confusion if we introduce it here but don't use it there? /cc @nvanbenschoten

Member

That's a good question. We seem to be using the RESUME_KEY_LIMIT reason in places where this new RESUME_BYTE_LIMIT reason would be more appropriate, like:

resumeReason = roachpb.RESUME_KEY_LIMIT

My best guess as to why we didn't do this in 9e11347 is because we would then have needed to plumb this reason back out of MVCC operations. We don't seem to use the reason for anything important (or am I missing something?), so maybe we didn't think it was worth the effort? I'm not totally sure, @tbg might know better.

Member

Yes, I didn't see a strong enough reason to introduce it at the time.

PS: I don't have context on this PR.

@pbardea (Contributor) left a comment



pkg/ccl/backupccl/backup_processor.go, line 60 at r1 (raw file):

// resulting from buffering/copying large SST files.
// TODO(adityamaru): tune this via experimentation.
const ExportRequestSSTTargetBytes = 128 << 20 /* 128 MiB */

It might be nice to have this as a cluster setting as an escape hatch against very large rows? I also think that this value will have an interesting relationship with ExportRequestTargetFileSize. I could see surprising behaviour if this one is set below the other. I think it would overshoot? But now I'm wondering if they should be one and the same setting.


pkg/roachpb/api.proto, line 74 at r1 (raw file):

Previously, dt (David Taylor) wrote…

I know we have byte-size limited scans already so I wonder if there's a reason they didn't introduce this already, and/or if it will cause confusion if we introduce it here but don't use it there? /cc @nvanbenschoten

I also shared this concern -- it might be easier to introduce the RESUME_BYTE_LIMIT separately?

@adityamaru (Contributor, Author)

pkg/ccl/backupccl/backup_processor.go, line 60 at r1 (raw file):

Previously, @pbardea wrote…

It might be nice to have this as a cluster setting as an escape hatch against very large rows? I also think that this value will have an interesting relationship with ExportRequestTargetFileSize. I could see surprising behaviour if this one is set below the other. I think it would overshoot? But now I'm wondering if they should be one and the same setting.

@pbardea I've been thinking about this and can't zero in on what the desired relationship between the two (or one) settings would be. If you set the response target size to less than the SSTTargetSize, the SST can definitely overshoot, because the response target size is not taken into account when creating the actual SST. So in the worst case you can get back an SST of targetSize + overage even though you asked to cap the response at something lower.

If you set the response target size to a multiple of the SSTTargetSize, you could potentially fit multiple SSTs in a single response (depending on overage). Lastly, if you merge the two into a single setting, you will only ever fit a single target-size SST into each response. Is the ability to send back multiple target-size SSTs in a single response, i.e. not having to send a "resume" export request, a big enough win to justify keeping two settings?
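As a rough sketch of the trade-off above (all numbers and names illustrative, not the real code path): if SSTs are accumulated into a response until the byte budget is met, a response target that is a multiple of the SST target carries several SSTs, while merging the two settings pins it at one SST per response.

```go
package main

import "fmt"

// sstsPerResponse models, illustratively, how many target-size SSTs a
// single ExportResponse carries when SSTs are accumulated until the
// response byte budget is met or exceeded. At least one SST is always
// returned, which is also why a response target set below the SST
// target still yields one full-size (overshooting) SST.
func sstsPerResponse(responseTarget, sstTarget int) int {
	n, sent := 0, 0
	for sent < responseTarget {
		sent += sstTarget
		n++
	}
	return n
}

func main() {
	fmt.Println(sstsPerResponse(128<<20, 64<<20)) // response = 2x SST target: 2
	fmt.Println(sstsPerResponse(64<<20, 64<<20))  // merged settings: 1
	fmt.Println(sstsPerResponse(1, 64<<20))       // sub-SST budget still returns 1
}
```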

@pbardea pbardea self-requested a review January 5, 2021 15:13
@pbardea (Contributor)

pbardea commented Jan 5, 2021

I've been thinking about this and can't zero in on what the desired relationship between the two (or one) settings would be.

From my perspective, I think it makes the most sense to make this setting a cluster setting that specifies the export request SSTTargetByte size as a multiple of the other setting. That way changing ExportRequestTargetFileSize will not change the interaction between the settings, but they can each be individually tweaked. WDYT?
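A sketch of that proposal, with an assumed multiplier and an assumed 64 MiB default for ExportRequestTargetFileSize (neither value is stated in this PR):

```go
package main

import "fmt"

// exportResponseTargetBytes derives the per-response byte budget from
// the per-SST file-size target, so retuning the file size preserves
// the ratio between the two settings. The multiple and the 64 MiB
// default below are assumptions for illustration, not values from
// the PR.
func exportResponseTargetBytes(targetFileSize, multiple int64) int64 {
	return targetFileSize * multiple
}

func main() {
	const targetFileSize = 64 << 20 // assumed ExportRequestTargetFileSize default
	fmt.Println(exportResponseTargetBytes(targetFileSize, 2) == 128<<20) // true
}
```

Under those assumptions, a multiple of 2 reproduces the 128 MiB constant discussed earlier in the thread.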

@adityamaru adityamaru force-pushed the export-request-pagination branch 2 times, most recently from 771c22f to 660c851 Compare January 6, 2021 19:23
@adityamaru (Contributor, Author)

@pbardea this is RFAL. The new changes as per our discussion are:

  • Always setting TargetBytes to a sentinel value of 1, so that we paginate after a single SST in ExportSST. This allows the user to control the size of the SST returned in the response with the existing target and overage cluster settings. This approach also matches the pagination semantics at the Pebble level, which always creates a single SST based on the above-mentioned cluster settings before paginating.
  • Progress is reported only once the entire span (all of its ResumeSpans) has been processed. This is required for progress logging to work. A follow-up PR to clean up how we do progress logging is on the way, but it is independent of this change.
  • I had to add logic to take the min of the system tenant's targetSize cluster setting and the targetSize passed in from the tenant, because we do not support cluster settings within a tenant yet. After some discussion with @dt, there is follow-up work to allow the tenant to pass in its targetSize and to have a maxSize cluster setting on the system tenant to control how much RAM is used.
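The min logic in the last bullet can be sketched as follows (names are illustrative, not the real API; the actual change lives in the PR's storageccl code):

```go
package main

import "fmt"

// clampTargetSize caps the targetSize a tenant passes in by the system
// tenant's cluster setting, since tenants cannot yet read cluster
// settings themselves. A zero setting means "unset" and imposes no cap.
func clampTargetSize(tenantTarget, systemSetting int64) int64 {
	if systemSetting > 0 && tenantTarget > systemSetting {
		return systemSetting
	}
	return tenantTarget
}

func main() {
	fmt.Println(clampTargetSize(256<<20, 128<<20)) // capped: prints 134217728
	fmt.Println(clampTargetSize(64<<20, 128<<20))  // unchanged: prints 67108864
}
```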

@adityamaru adityamaru force-pushed the export-request-pagination branch from 660c851 to 180b036 Compare January 6, 2021 20:07
@pbardea (Contributor) left a comment

LGTM

pkg/ccl/backupccl/backup_test.go (outdated; resolved)
pkg/ccl/storageccl/export_test.go (outdated; resolved)
@adityamaru adityamaru force-pushed the export-request-pagination branch from 180b036 to 65617d1 Compare January 7, 2021 16:52
@adityamaru
Copy link
Contributor Author

TFTR!

bors r=pbardea

@craig (Contributor)

craig bot commented Jan 7, 2021

Build succeeded.


Successfully merging this pull request may close these issues.

backupccl: add pagination to export request
6 participants