
backupccl: parallelize loading of manifests from External Storage #87311

Merged: 1 commit merged into cockroachdb:master on Sep 7, 2022

Conversation

@adityamaru (Contributor)

This is a targeted change to parallelize the loading of backup manifests
for each incremental layer of a backup. This code path is shared by both
restore and SHOW BACKUP.

Fixes: #87183

Release note: None

Release justification: low-risk performance improvement required to make
SHOW BACKUP more performant in the presence of many incremental layers

@cockroach-teamcity (Member)

This change is Reviewable

@adityamaru force-pushed the parallelize-show-backups branch from 515b7d3 to 2f53b1f on September 2, 2022 14:24
@adityamaru (Contributor, Author)

adityamaru commented Sep 2, 2022

Comparing two traces of a SHOW BACKUP on a backup chain that contains a 1.80TB (replicated) backup consisting of 50 incremental backup layers taken every 15 minutes. 8000 warehouses were imported, and then a workload of 1000 warehouses was run against the cluster while the incremental backups were being taken.

Old trace:

This is a summary of where we are spending time in this operation. Each row is an aggregation of all operations with that operation name across the execution of the SHOW BACKUP:

     0.000ms      0.000ms    === operation:traced statement _verbose:1 node:1 client:[::1]:59791 user:root
     0.000ms      0.000ms    [flow: {count: 1, duration 3m6s}]
     0.000ms      0.000ms    [batch flow coordinator: {count: 1, duration 3m6s}]
     0.000ms      0.000ms    [backupccl.showBackupPlanHook: {count: 1, duration 3m6s}]
     0.000ms      0.000ms    [SHOW BACKUP: {count: 1, duration 3m6s}]
     0.000ms      0.000ms    [backupdest.FindPriorBackups: {count: 3, duration 2m46s}]
     0.000ms      0.000ms    [gcs.List: {count: 3, duration 2m46s}]
     0.000ms      0.000ms    [backupdest.ResolveBackupManifests: {count: 1, duration 1m45s}]
     0.000ms      0.000ms    [backupdest.ResolveIncrementalsBackupLocation: {count: 1, duration 1m20s}]
     0.000ms      0.000ms    [backupdest.backupsFromLocation: {count: 2, duration 1m20s}]
     0.000ms      0.000ms    [backupinfo.ReadBackupManifest: {count: 92, duration 19s}]
     0.000ms      0.000ms    [gcs.ReadFileAt: {count: 184, duration 18.7s}]
     0.000ms      0.000ms    [backupdest.LayerLoop: {count: 1, duration 18.5s}]
     0.000ms      0.000ms    [backupccl.showBackup: {count: 1, duration 507ms}]
     0.000ms      0.000ms    [backupinfo.ReadBackupManifestFromStore: {count: 1, duration 470ms}]
     0.000ms      0.000ms    [backupccl.backupShowerDefault.fn: {count: 1, duration 386ms}]
     0.000ms      0.000ms    [backupinfo.readManifest: {count: 92, duration 258ms}]
     0.000ms      0.000ms    [backupinfo.GetLocalityInfo: {count: 92, duration 388µs}]
     0.000ms      0.000ms    [optimizer: {count: 1, duration 286µs}]
     0.000ms      0.000ms    [commit sql txn: {count: 1, duration 10µs}]

Taking the hierarchy of method calls into account, we can infer that the majority of the time is spent in backupdest.ResolveBackupManifests and backupdest.ResolveIncrementalsBackupLocation. Drilling down into those, we see:

 1.870ms      1.075ms                    === operation:backupdest.ResolveIncrementalsBackupLocation _verbose:1 node:1 client:[::1]:59791 user:root
     1.870ms      0.000ms                    [backupdest.backupsFromLocation: {count: 2, duration 1m20s}]
     1.870ms      0.000ms                    [backupdest.FindPriorBackups: {count: 2, duration 1m20s}]
     1.870ms      0.000ms                    [gcs.List: {count: 2, duration 1m20s}]

and

80450.060ms    268.367ms                    === operation:backupdest.ResolveBackupManifests _verbose:1 node:1 client:[::1]:59791 user:root
 80450.060ms      0.000ms                    [backupdest.FindPriorBackups: {count: 1, duration 1m26s}]
 80450.060ms      0.000ms                    [gcs.List: {count: 1, duration 1m26s}]
 80450.060ms      0.000ms                    [backupinfo.ReadBackupManifest: {count: 92, duration 19s}]
 80450.060ms      0.000ms                    [gcs.ReadFileAt: {count: 184, duration 18.7s}]
 80450.060ms      0.000ms                    [backupdest.LayerLoop: {count: 1, duration 18.5s}]
 80450.060ms      0.000ms                    [backupinfo.ReadBackupManifestFromStore: {count: 1, duration 470ms}]
 80450.060ms      0.000ms                    [backupinfo.readManifest: {count: 92, duration 258ms}]
 80450.060ms      0.000ms                    [backupinfo.GetLocalityInfo: {count: 92, duration 388µs}]

The majority of the time in both methods is spent in the call to gcs.List, which warrants investigation into whether we can optimize something there. This diff, however, is concerned with the tracing span called LayerLoop, which wraps the logic that reaches out to External Storage for every incremental layer and reads the manifest.
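
As a rough illustration, here is a minimal sketch of the shape of the change: loading one manifest per layer concurrently behind a bounded worker pool. The names (`Manifest`, `readManifest`) and the limit of 8 are placeholders for illustration, not the actual backupinfo APIs or the concurrency limit used in the PR:

```go
package backupsketch

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// Manifest is a placeholder for the real backup manifest type.
type Manifest struct{}

// loadManifestsConcurrently reads one manifest per backup layer in
// parallel instead of sequentially, which is the essence of this change.
// readManifest stands in for the real External Storage read.
func loadManifestsConcurrently(
	ctx context.Context,
	uris []string,
	readManifest func(context.Context, string) (Manifest, error),
) ([]Manifest, error) {
	manifests := make([]Manifest, len(uris))
	g, gCtx := errgroup.WithContext(ctx)
	g.SetLimit(8) // bound the fan-out so we don't hammer the storage endpoint
	for i, uri := range uris {
		i, uri := i, uri // capture loop variables (pre-Go 1.22 semantics)
		g.Go(func() error {
			m, err := readManifest(gCtx, uri)
			if err != nil {
				return err
			}
			manifests[i] = m // each goroutine writes a distinct index; no mutex needed
			return nil
		})
	}
	if err := g.Wait(); err != nil {
		return nil, err
	}
	return manifests, nil
}
```

Because the per-layer reads now overlap, the wall-clock time of the loop approaches the latency of the slowest read rather than the sum of all of them.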

New trace:

In the new trace, where we have changed the LayerLoop logic to load the manifests concurrently, it takes 649ms instead of 18.5s. (Note that the aggregated gcs.ReadFileAt time rises from 18.7s to 50.5s, likely because the concurrent reads individually take longer under fan-out, but they now overlap, so the wall-clock time collapses.)

76883.700ms    360.265ms                    === operation:backupdest.ResolveBackupManifests _verbose:1 node:1 client:[::1]:60360 user:root
 76883.700ms      0.000ms                    [backupdest.FindPriorBackups: {count: 1, duration 1m25s}]
 76883.700ms      0.000ms                    [gcs.List: {count: 1, duration 1m25s}]
 76883.700ms      0.000ms                    [backupinfo.ReadBackupManifestFromStore: {count: 92, duration 50.6s}]
 76883.700ms      0.000ms                    [backupinfo.ReadBackupManifest: {count: 92, duration 50.6s}]
 76883.700ms      0.000ms                    [gcs.ReadFileAt: {count: 184, duration 50.5s}]
 76883.700ms      0.000ms                    [backupdest.LayerLoop: {count: 1, duration 649ms}]
 76883.700ms      0.000ms                    [backupinfo.FetchPreviousBackups: {count: 1, duration 648ms}]
 76883.700ms      0.000ms                    [backupinfo.readManifest: {count: 92, duration 122ms}]
 76883.700ms      0.000ms                    [backupinfo.GetLocalityInfo: {count: 92, duration 39µs}]
 76883.761ms      0.061ms                        === operation:backupinfo.ReadBackupManifestFromStore _ve

The total time for this SHOW goes down by ~18s. This is far from the dominating cost of the List calls, but it's still worth doing.

@adityamaru marked this pull request as ready for review September 2, 2022 14:37
@adityamaru requested a review from a team September 2, 2022 14:37
@adityamaru requested a review from a team as a code owner September 2, 2022 14:37
@adityamaru requested review from msbutler, stevendanna, benbardin and dasrirez and removed request for msbutler September 2, 2022 14:37
@adityamaru (Contributor, Author)

For bulk folks the context for this change is https://cockroachlabs.slack.com/archives/C03RU3US2H1/p1661190445763609

pkg/ccl/backupccl/restore_planning.go (review thread, outdated and resolved)
@adityamaru (Contributor, Author)

Note: List taking so long is a result of the fixture I ran this on being old and not having some of the delimiter optimizations we checked in to solve this exact problem. Stay tuned for updated traces once I've regenerated the fixtures.
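
For context, a delimiter-style listing with the GCS Go client looks roughly like this (a hedged sketch, not the CockroachDB cloud-storage wrapper; the bucket and prefix are made up). Setting a delimiter makes GCS return synthetic "directory" prefixes instead of enumerating every object underneath them, which keeps List calls cheap on deep backup trees:

```go
package backupsketch

import (
	"context"
	"fmt"

	"cloud.google.com/go/storage"
	"google.golang.org/api/iterator"
)

// listTopLevel lists only the immediate children of prefix rather than
// every object in the subtree: the Delimiter makes GCS collapse deeper
// paths into synthetic prefix entries.
func listTopLevel(ctx context.Context, client *storage.Client, bucket, prefix string) error {
	it := client.Bucket(bucket).Objects(ctx, &storage.Query{
		Prefix:    prefix,
		Delimiter: "/", // do not recurse into "subdirectories"
	})
	for {
		attrs, err := it.Next()
		if err == iterator.Done {
			return nil
		}
		if err != nil {
			return err
		}
		if attrs.Prefix != "" {
			fmt.Println("dir:", attrs.Prefix) // synthetic directory entry
		} else {
			fmt.Println("obj:", attrs.Name) // a real object
		}
	}
}
```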

@joshimhoff (Collaborator)

joshimhoff commented Sep 6, 2022

Yay to this PR!

@adityamaru, can we check what this PR does to the ~2s SHOW BACKUP runtime that @dasrirez found originally? Darius tells me he set up a local repro that queries some GCS buckets he already set up. Darius, can you send Aditya the SQL command you run on your local cluster? You mentioned that Aditya already has access to credentials for the bucket you set up.

Aditya, I'm sure the testing you've done here is a better stress test of SHOW BACKUP in general, but OTOH the ~2s SHOW BACKUP runtime being too high is the thing that could actually delay the launch of backupper deprecation, so good to know where we stand there.

@adityamaru (Contributor, Author)

> Aditya, I'm sure the testing you've done here is a better stress test of SHOW BACKUP in general, but OTOH the ~2s SHOW BACKUP runtime being too high is the thing that could actually delay the launch of backupper deprecation, so good to know where we stand there.

Definitely makes sense to run on what we actually want to improve. @dasrirez, if you want to take this patch for a spin and grab a trace using EXPLAIN ANALYZE (DEBUG) <SHOW BACKUP query>, that'd be helpful. Alternatively, you could DM me what you've been running so that we're testing on the same corpus. I've been using a pre-generated fixture we run our roachtests against that has 1.7TB of replicated data backed up over a chain of 50 incremental backups.

@dasrirez (Collaborator)

dasrirez commented Sep 7, 2022

This is wonderful! I ran the same simple test with 1 full + 24 incremental backups taken from a fresh cluster and the difference is absurd.

  • Without patch
    • 1 incremental: ~0.88s
    • 24 incrementals: ~4.5s
  • With patch
    • 1 incremental: ~1s
    • 24 incrementals: ~1.4s

Note that I issued the queries from a local cluster (avg of 8 trials), where the bucket lookup latency is much higher than when these queries are issued on the cloud cluster. On the cloud cluster, a backup with a single incremental takes 0.3s to query on average, which is why the backup with 24 incrementals previously reported 2s without the patch, compared to the 4.5s I got from my local cluster.

@adityamaru I will add the stmt bundle for the 24 incremental query w/ the patch in the SendSafely workspace since it contains some creds!

@joshimhoff (Collaborator)

Yay! And such fun DR + SRE collaboration too.

@adityamaru (Contributor, Author)

adityamaru commented Sep 7, 2022

Thanks for the trace; I think the optimization is working as expected.

In the trace, starting at the top-level SHOW BACKUP span:

[SHOW BACKUP: {count: 1, duration 1.4s}]

Drilling down:

     0.884ms      0.000ms                [backupdest.ResolveIncrementalsBackupLocation: {count: 1, duration 254ms}]
     0.884ms      0.000ms                [backupdest.ResolveBackupManifests: {count: 1, duration 1s}]

These largely account for the 1.4s, so let's look at each:

First, in backupdest.ResolveIncrementalsBackupLocation, we can see all the time is spent in gcs.List. This is expected, so I don't think we can do much here to speed things up further.

     1.110ms      0.000ms                    [backupdest.backupsFromLocation: {count: 1, duration 254ms}]
     1.110ms      0.000ms                    [backupdest.FindPriorBackups: {count: 1, duration 254ms}]
     1.110ms      0.000ms                    [gcs.List: {count: 1, duration 254ms}]

Now in backupdest.ResolveBackupManifests the following calls largely account for the 1s:

   255.426ms      0.000ms                    [backupinfo.FetchPreviousBackups: {count: 1, duration 452ms}]
   255.426ms      0.000ms                    [backupdest.FindPriorBackups: {count: 1, duration 260ms}]
   255.426ms      0.000ms                    [backupinfo.GetLocalityInfo: {count: 25, duration 27µs}]

The most interesting one pertaining to this optimization is FetchPreviousBackups:

   816.860ms      0.000ms                        [backupinfo.ReadBackupManifestFromStore: {count: 24, duration 7.9s}]
   816.860ms      0.000ms                        [backupinfo.ReadBackupManifest: {count: 24, duration 7.9s}]
   816.860ms      0.000ms                        [gcs.ReadFileAt: {count: 48, duration 7.9s}]
   816.860ms      0.000ms                        [backupinfo.readManifest: {count: 24, duration 14ms}]

Since the reading of manifests happens concurrently, the read path should take roughly 7900ms/24 ≈ 330ms, which accounts for most of the 452ms we see above.

@adityamaru force-pushed the parallelize-show-backups branch from 2f53b1f to e482876 on September 7, 2022 18:12
@adityamaru force-pushed the parallelize-show-backups branch from e482876 to bd36bc5 on September 7, 2022 18:20
@adityamaru (Contributor, Author)

The Bazel failures in testrace are being investigated here: https://cockroachlabs.slack.com/archives/C023S0V4YEB/p1662495624819929?thread_ts=1662488784.308579&cid=C023S0V4YEB. The failure in Bazel Extended CI is a flake in TestWorkload, and the failure in GithubCI is a generated-code error in the js check. All of these are unrelated to my diff 😢

bors r=benbardin

@craig (Contributor)

craig bot commented Sep 7, 2022

Build failed (retrying...):

@craig (Contributor)

craig bot commented Sep 7, 2022

Build succeeded:

@craig (bot) merged commit e39111b into cockroachdb:master on Sep 7, 2022
@adityamaru (Contributor, Author)

blathers backport release-22.2

adityamaru added a commit to adityamaru/cockroach that referenced this pull request Nov 15, 2022
In cockroachdb#87311 we refactored `ResolveBackupManifests` to concurrently
load the manifests from the base and incremental layers by calling
`FetchPreviousBackups`. The mistake in that diff was to pass only the
incremental layers of the backup to `FetchPreviousBackups` instead of
the base backup plus the incremental layers. This silently passed all
tests because `ResolveBackupManifests` explicitly sets up the base
backup layer in all the relevant slices before processing each
incremental layer. The one case where this was not okay was encrypted
backups. Apart from doing the obvious, `FetchPreviousBackups` would
also read the encryption options from the base backup before reading
the manifests from each layer. Because we stopped sending the base
backup in the input slice, this step would fail: the method would go to
the first incremental backup (instead of the base backup) and attempt
to read the ENCRYPTION_INFO file. This file is only ever written to the
base backup, so a `SHOW BACKUP` or a `RESTORE` of an encrypted backup
would fail with a file-not-found error.

In this diff we:

1) Fix `ResolveBackupManifests` by passing the base backup plus the
incremental backups to `FetchPreviousBackups`.

2) Make it the caller's responsibility to pass in the fully hydrated
encryption options before calling `FetchPreviousBackups`, so that the
method is *only* fetching backup manifests.

Fixes: cockroachdb#91886

Release note (bug fix): fixes a bug that would cause `SHOW BACKUP` and
`RESTORE` of encrypted incremental backups to fail
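
To make the shape of the fix concrete, here is a hedged Go sketch with hypothetical names and signatures (the real backupdest/backupinfo APIs differ):

```go
package backupsketch

import "context"

// Hypothetical stand-ins for the real CockroachDB types and helpers.
type Manifest struct{}
type EncryptionOptions struct{}

// fetchPreviousBackups stands in for backupinfo.FetchPreviousBackups:
// it reads one manifest per URI. After the fix it no longer resolves
// encryption options itself.
func fetchPreviousBackups(
	ctx context.Context, uris []string, enc *EncryptionOptions,
) ([]Manifest, error) {
	return make([]Manifest, len(uris)), nil
}

// resolveBackupManifests shows the corrected call: the base backup URI is
// prepended so the first layer read is the one holding ENCRYPTION_INFO.
func resolveBackupManifests(
	ctx context.Context,
	baseURI string,
	incURIs []string,
	enc *EncryptionOptions, // fully hydrated by the caller, per point (2)
) ([]Manifest, error) {
	uris := append([]string{baseURI}, incURIs...) // base first, then incrementals
	return fetchPreviousBackups(ctx, uris, enc)
}
```
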
adityamaru added a commit to adityamaru/cockroach that referenced this pull request on Nov 15, 2022 (same commit message as above).
blathers-crl bot pushed a commit that referenced this pull request on Nov 15, 2022 (same commit message as above).
blathers-crl bot pushed a commit that referenced this pull request on Nov 15, 2022 (same commit message as above).
craig bot pushed a commit that referenced this pull request on Nov 15, 2022:

91911: backupccl: fix bug in resolving encrypted backup manifests r=stevendanna a=adityamaru (same commit message as above)

Co-authored-by: adityamaru <[email protected]>
@adityamaru deleted the parallelize-show-backups branch on December 6, 2022 18:42
Linked issue: backupccl: SHOW BACKUP should parallelize reading incremental layers (#87183)