release-22.2: move alloc heavy Files field from manifest to SST, use slim manifest in backup restore #97210
Conversation
Repeated fields in a backup's manifest do not scale well as the amount of backed-up data and the length of the incremental backup chain grow. This has been a known problem, and has motivated us to incrementally move all repeated fields out of the manifest and into standalone metadata SST files. The advantage is that during incremental backups or restores we do not need to perform large allocations when unmarshalling the manifest, and can instead stream results from the relevant SST as and when we need them. In support issues such as cockroachdb#93272 we have seen this unmarshalling step result in OOMs, preventing further incremental backups or making the backups unrestorable.

The effort to move all of backup and restore's metadata into SSTs, relying on streaming reads and writes throughout, is ongoing but outside the scope of this patch. This patch is meant to be a targeted fix with an eye toward backports. Past experimentation has shown us that the `Files` repeated field in the manifest is the largest cause of bloated, unmarshalable manifests.

This change teaches backup to continue writing a manifest file, but a slimmer one with the `Files` field nil'ed out. The values in the `Files` field are instead written to an SST file that sits alongside the `SLIM_BACKUP_MANIFEST`. To maintain mixed-version compatibility with nodes that rely on a regular manifest, we continue to write a `BACKUP_MANIFEST` alongside its slim version.

On the read path, we add an optimization that reads the slim manifest first if one is present. This way we avoid unmarshalling the alloc-heavy `Files` field, and instead teach all the places in the code that need the `Files` to reach out to the metadata SST and read the values one by one. To support both the slim and not-so-slim manifests, we introduce an interface that iterates over the `Files` depending on the manifest passed to it.

To reiterate, this work is a subset of the improvements we will get from moving all repeated fields to SSTs, and is expected to be superseded by those efforts when they come to fruition.

Fixes: cockroachdb#93272

Release note (performance improvement): Long chains of incremental backups, and restores of such chains, will now allocate less memory during the unmarshalling of metadata.
Release note: None
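For illustration, here is a minimal Go sketch of the iterator pattern described above, assuming hypothetical names (`FileIterator`, `BackupFile`, `inMemoryFileIterator`) rather than the actual backupccl types:

```go
package main

import "fmt"

// BackupFile stands in for one entry of the manifest's repeated Files field.
type BackupFile struct {
	Path string
	Span string
}

// FileIterator abstracts over where the Files live: inlined in a legacy
// manifest, or streamed one by one from the external metadata SST that
// accompanies a slim manifest.
type FileIterator interface {
	Next() (BackupFile, bool) // second value is false once exhausted
	Close()
}

// inMemoryFileIterator serves Files already unmarshalled from a legacy
// manifest; an SST-backed implementation would satisfy the same interface
// by wrapping an SST reader and decoding rows lazily.
type inMemoryFileIterator struct {
	files []BackupFile
	idx   int
}

func (it *inMemoryFileIterator) Next() (BackupFile, bool) {
	if it.idx >= len(it.files) {
		return BackupFile{}, false
	}
	f := it.files[it.idx]
	it.idx++
	return f, true
}

func (it *inMemoryFileIterator) Close() {}

func main() {
	var it FileIterator = &inMemoryFileIterator{files: []BackupFile{
		{Path: "data/968afa.sst", Span: "/Table/106/1"},
	}}
	defer it.Close()
	for f, ok := it.Next(); ok; f, ok = it.Next() {
		fmt.Println(f.Path, f.Span) // callers never hold the whole slice
	}
}
```

Because call sites program against the interface, they stay agnostic to whether they were handed a slim or a regular manifest.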
…ort spans Previously, restore created all of its import spans at once and stored them in memory. OOMs caused by the size of these import spans, on restores of large backups with many incremental chains, have been the cause of many escalations. This patch modifies import span creation so that import spans are generated one at a time (see the sketch below). This span generator is then used in the split and scatter processor to produce the import spans consumed by the rest of restore, instead of having the spans specified in the processor's spec. A future patch will add memory monitoring to import span generation to further safeguard against OOMs in restore. This patch also changes the import span generation algorithm; the cluster setting `bulkio.restore.use_simple_import_spans` is introduced, which, if set to true, reverts the algorithm back to makeSimpleImportSpans. Release note: None
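As a rough sketch of the one-at-a-time generation described above (the names `importSpan` and `generateImportSpans` are illustrative, not the processor's real code), a generator can stream spans over a channel so that only one span is materialized at a time:

```go
package main

import "fmt"

type importSpan struct{ start, end string }

// generateImportSpans streams one span at a time to out, so peak memory is
// bounded by a single span rather than the restore's entire span set.
func generateImportSpans(n int, out chan<- importSpan) {
	defer close(out)
	for i := 0; i < n; i++ {
		out <- importSpan{
			start: fmt.Sprintf("/Table/%d", i),
			end:   fmt.Sprintf("/Table/%d", i+1),
		}
	}
}

func main() {
	out := make(chan importSpan)
	go generateImportSpans(3, out)
	for sp := range out {
		fmt.Println(sp.start, sp.end) // consumer handles spans as they arrive
	}
}
```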
…essor The generative split and scatter processor is currently causing tests to fail under race, because many goroutines operate on the same splitAndScatterer, which cannot be used concurrently since the underlying key rewriter is not safe for concurrent use. Modify the processor so that every worker that uses the splitAndScatterer now uses its own instance. Fixes: cockroachdb#95808 Release note: None
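A minimal sketch of the per-worker fix, with a stand-in `keyRewriter` type that is not the real one: each goroutine constructs its own instance of the non-thread-safe state instead of sharing a single one.

```go
package main

import (
	"fmt"
	"sync"
)

// keyRewriter stands in for the stateful, non-thread-safe component inside
// the real splitAndScatterer.
type keyRewriter struct{ buf []byte }

func (r *keyRewriter) rewrite(key string) string {
	r.buf = append(r.buf[:0], key...) // reuses internal state: unsafe to share
	return "rewritten/" + string(r.buf)
}

func main() {
	keys := []string{"a", "b", "c", "d"}
	var wg sync.WaitGroup
	for w := 0; w < 2; w++ {
		wg.Add(1)
		go func(worker int) {
			defer wg.Done()
			rw := &keyRewriter{} // one instance per worker, never shared
			for _, k := range keys {
				fmt.Printf("worker %d: %s\n", worker, rw.rewrite(k))
			}
		}(w)
	}
	wg.Wait()
}
```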
…cessor Add the rest of the missing context cancel checks in restore's generativeSplitAndScatterProcessor. Add a red/green test to show that runGenerativeSplitAndScatter is interrupted if its supplied context is canceled. Fixes: cockroachdb#95257 Release note: None
In cockroachdb#95257 we saw a restore grind to a halt two hours into a five-hour roachtest. The stacks indicated that we may have seen a context cancellation that was not being respected by the goroutine running `generateAndSendImportSpans`. This resulted in the `generative_split_and_scatter_processor` getting stuck writing to a channel nobody was reading from (https://github.com/cockroachdb/cockroach/blob/master/pkg/ccl/backupccl/restore_span_covering.go#L516), since the other goroutines in the processor had seen the ctx cancellation and exited. A side effect of the generative processor not shutting down was that the downstream restore data processors would also hang on their calls to `input.Next()`, as they would never receive a row or a meta from the generative processor signalling them to shut down.

This fix adds a ctx cancellation check to the goroutine described above, thereby allowing a graceful teardown of the flow. It also adds the JobID to the generative processor spec so that logs on remote nodes are correctly tagged with the JobID, making for easier debugging.

Informs: cockroachdb#95257

Release note (bug fix): Fixed a bug where a restore flow could hang indefinitely in the face of a context cancellation, manifesting as a stuck restore job.
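The shape of the fix, sketched with generic names (nothing here is the actual processor code): the sending goroutine selects on `ctx.Done()` alongside the channel send, so cancellation unblocks it even when every reader has already exited.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

func producer(ctx context.Context, out chan<- int) error {
	for i := 0; ; i++ {
		select {
		case out <- i: // a bare `out <- i` here could block forever
		case <-ctx.Done():
			return ctx.Err() // graceful teardown instead of a hung flow
		}
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	out := make(chan int) // unbuffered: producer blocks once readers stop
	errCh := make(chan error, 1)
	go func() { errCh <- producer(ctx, out) }()

	fmt.Println(<-out) // read one value, then abandon the channel
	cancel()           // consumer is done; cancel rather than strand the producer

	select {
	case err := <-errCh:
		fmt.Println("producer exited:", err)
	case <-time.After(time.Second):
		fmt.Println("producer hung") // what happens without the ctx.Done case
	}
}
```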
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
Add a brief release justification to the body of your PR to justify this backport. Some other things to consider:
force-pushed from 70f2e81 to 63c6057
force-pushed from e1c15b8 to a054962
…t to SST As part of an effort to make backup manifests scale better for larger clusters, this patch moves descriptors and descriptor changes from the manifest to an external SST. This avoids the need to allocate enough memory to hold every descriptor and descriptor revision for every layer of a backup during a backup or restore job. This patch also changes the access pattern for descriptors and descriptor changes to use iterators, so that they can be read in a streaming manner from the external SST. Release note: None
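To illustrate the streaming access pattern (using made-up types, not the real descriptor protos), a consumer can resolve which revision of each descriptor was live as of a target time while scanning revisions in order, without ever holding them all in memory:

```go
package main

import "fmt"

type descRevision struct {
	id   int
	time int // revision timestamp, simplified to an int
	name string
}

// asOf streams revisions (assumed sorted by id, then time) and emits the
// newest revision at or before target for each ID, keeping O(1) state.
func asOf(revs []descRevision, target int, emit func(descRevision)) {
	var cur *descRevision
	for i := range revs {
		r := revs[i]
		if cur != nil && r.id != cur.id {
			emit(*cur) // finished this ID; flush its winning revision
			cur = nil
		}
		if r.time <= target {
			c := r
			cur = &c
		}
	}
	if cur != nil {
		emit(*cur)
	}
}

func main() {
	revs := []descRevision{
		{1, 10, "t1@10"}, {1, 20, "t1@20"}, {2, 5, "t2@5"}, {2, 30, "t2@30"},
	}
	asOf(revs, 25, func(d descRevision) { fmt.Println(d.id, d.name) })
	// prints: 1 t1@20, then 2 t2@5
}
```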
force-pushed from a054962 to e12b1f1
Thanks for doing this. The list of commits you have here covers all the ones I could think of as well. It looks like the backport was clean, and there is nothing new except for the cluster setting that defaults the external manifest SSTs to off? Given the size of this backport, I'm wondering if we should:
The next release, 22.2.6, is scheduled for Mar 7, 2023, so hopefully we can get all these smoke tests done before then.
I think it's better to get this in for some bake time, unless @dt has any comments. We can follow it up with a mixed-version test and the big 200-layer test.
Backport:
Please see individual PRs for details.
/cc @cockroachdb/release
Release justification: high priority need for the memory reduction in backup and restore