release-22.1: gcp,s3,azure: make the storage client upload chunk size configurable #80947
Conversation
Force-pushed from 9625f2d to 8c63305
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
Add a brief release justification to the body of your PR to justify this backport. Some other things to consider:
friendly ping @dt @stevendanna, I think this should reduce some of the 503s we've been seeing
This change adds a `cloudstorage.write_chunk_size` cluster setting that allows us to control the size of the chunks buffered by the cloud storage client when uploading a file to storage. The setting defaults to 8MiB. Prior to this change, GCS used a 16MB buffer, S3 a 5MB buffer, and Azure a 4MB buffer. A follow-up change will add memory monitoring to each external storage writer to account for these buffered chunks during upload.

This change was motivated by the fact that in google-cloud-storage SDK versions prior to v1.21.0, every chunk is given a hardcoded timeout of 32s to successfully upload to storage. This includes retries due to transient errors. If any chunk during a backup were to hit this timeout, the entire backup would fail. We have additional work to do to make the job more resilient to such failures, but dropping the default chunk size might mean we see fewer chunks hit their timeouts.

Release note: None
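Once this lands, the chunk size can presumably be tuned like any other byte-size cluster setting, e.g. `SET CLUSTER SETTING cloudstorage.write_chunk_size = '16 MiB';` (the exact accepted value syntax is an assumption here, not something confirmed by this change).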
Force-pushed from 8c63305 to 147bd0a
@@ -137,7 +137,7 @@ func (s *azureStorage) Writer(ctx context.Context, basename string) (io.WriteClo
	defer sp.Finish()
	_, err := azblob.UploadStreamToBlockBlob(
		ctx, r, blob, azblob.UploadStreamToBlockBlobOptions{
			BufferSize: 4 << 20,
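For context, the knob this hunk touches is `UploadStreamToBlockBlobOptions.BufferSize`, previously pinned at 4 MiB. Below is a minimal sketch of where that option sits in the Azure SDK; the helper name and the way the setting value is plumbed through are hypothetical, not the PR's actual code.

```go
package cloudsketch

import (
	"context"
	"io"

	"github.com/Azure/azure-storage-blob-go/azblob"
)

// uploadStream is a hypothetical helper showing the knob this diff touches:
// BufferSize controls how much data the Azure client buffers per chunk, and
// was previously hard-coded to 4 << 20 (4 MiB).
func uploadStream(ctx context.Context, r io.Reader, blob azblob.BlockBlobURL, chunkSize int) error {
	_, err := azblob.UploadStreamToBlockBlob(
		ctx, r, blob, azblob.UploadStreamToBlockBlobOptions{
			BufferSize: chunkSize, // e.g. the value of cloudstorage.write_chunk_size
			MaxBuffers: 1,         // chunks held in memory at once (illustrative value)
		},
	)
	return err
}
```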
we should re-test azure (by hand I guess) with this before we backport. I have vague recollections of it having some hard coded/magic numbers on these chunks but I don't remember what they were, so I just want to confirm we're not breaking (untested) code in a stable release here.
yikes this fell off my radar, running them now and merging.
Ran all the tests in pkg/cloud/azure and they passed, merging!
Reminder: it has been 3 weeks; please merge or close your backport!
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error getting backport branch release-pr-activity: unexpected status code: 404 Not Found
Backport to branch pr-activity failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
blathers backport 21.2
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
Backport to branch 21.2 failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
Backport 1/1 commits from #80668 on behalf of @adityamaru.
/cc @cockroachdb/release
This change adds a `cloudstorage.write_chunk_size` cluster setting that allows us to control the size of the chunks buffered by the cloud storage client when uploading a file to storage. The setting defaults to 8MiB.
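As a point of reference, a byte-size setting like this is typically registered through the settings package; here is a minimal sketch assuming the 22.1-era registration API, where the variable name, setting class, and description are guesses rather than the PR's actual code.

```go
package cloudsketch

import "github.com/cockroachdb/cockroach/pkg/settings"

// writeChunkSize sketches the new byte-size cluster setting; the actual
// registration in the PR may use a different class, description, or validation.
var writeChunkSize = settings.RegisterByteSizeSetting(
	settings.TenantWritable,
	"cloudstorage.write_chunk_size",
	"controls the size of the chunks buffered by the cloud storage client when uploading to storage",
	8<<20, // 8 MiB default
)
```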
Prior to this change, GCS used a 16MB buffer, S3 a 5MB buffer, and Azure a 4MB buffer. A follow-up change will add memory monitoring to each external storage writer to account for these buffered chunks during upload.
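For reference, the S3-side knob is the aws-sdk-go upload manager's part size, which otherwise stays at its 5 MB default; the helper below is hypothetical, not this PR's code.

```go
package cloudsketch

import (
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

// newS3Uploader is a hypothetical helper: PartSize is the buffer the S3
// upload manager fills before sending each part, previously left at its
// 5 MB default. Note that S3 multipart uploads require parts of at least
// 5 MB (except the last), so the setting cannot usefully go below that.
func newS3Uploader(sess *session.Session, chunkSize int64) *s3manager.Uploader {
	return s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
		u.PartSize = chunkSize // e.g. the value of cloudstorage.write_chunk_size
	})
}
```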
This change was motivated by the fact that in google-cloud-storage SDK versions prior to v1.21.0, every chunk is given a hardcoded timeout of 32s to successfully upload to storage. This includes retries due to transient errors. If any chunk during a backup were to hit this timeout, the entire backup would fail. We have additional work to do to make the job more resilient to such failures, but dropping the default chunk size might mean we see fewer chunks hit their timeouts.
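To make the timeout discussion concrete, here is a minimal sketch of the GCS-side knob using the standard cloud.google.com/go/storage API; the helper and its plumbing are hypothetical, not this PR's code.

```go
package cloudsketch

import (
	"context"

	"cloud.google.com/go/storage"
)

// newGCSWriter is a hypothetical helper: the google-cloud-storage client
// buffers ChunkSize bytes per upload request (16 MB by default), and in SDK
// versions before v1.21.0 each such request must finish, retries included,
// within a hard-coded 32s.
func newGCSWriter(ctx context.Context, bucket *storage.BucketHandle, object string, chunkSize int) *storage.Writer {
	w := bucket.Object(object).NewWriter(ctx)
	w.ChunkSize = chunkSize // smaller chunks are likelier to complete within the 32s window
	return w
}
```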
Release note: None
Release justification: low-risk, high-benefit change that reduces the chances of a backup failing because of chunk retry timeouts in the cloud storage SDK.