go/vt/mysqlctl: add configurable read buffer to builtin backups #12073
Signed-off-by: Max Englander <[email protected]>
@@ -63,6 +62,16 @@ var (
	BuiltinBackupMysqldTimeout = 10 * time.Minute

	builtinBackupProgress = 5 * time.Second

	// Controls the size of blocks read from disk during backups.
	builtinBackupFileReadBufferSize = 0
Set to zero so that this is opt-in. When set to zero the Golang hard-coded buffer size of 32*1024 is used.

	// Controls the byte block size of writes to backupstorage during backups.
	// The backupstorage may be a physical file, network, or something else.
	builtinBackupStorageWriteBufferSize = 2 * 1024 * 1024 /* 2 MiB */
The current behavior is that we use the same 2 MiB buffer when writing files to disk during restores and when writing to backupstorage during backups.
Backupstorage may or may not go directly to disk, so I think it makes sense to decouple these buffers so they can be tuned independently.
I think it would be even better if builtinbackup didn't do any buffering when reading from or writing to backupstorage, and instead let each backupstorage engine decide how to handle read/write buffering, maybe with tunable flags.
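To illustrate what a write buffer of a given size does, here is a small sketch using `bufio.NewWriterSize` from the standard library. The function `bytesVisible` and its in-memory destination are assumptions made for the example; a real backupstorage handle would take the place of the `bytes.Buffer`.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// bytesVisible is a hypothetical helper. It writes payload through a
// bufio.Writer of bufSize bytes and reports how many bytes had reached the
// destination before and after Flush.
func bytesVisible(payload []byte, bufSize int) (before, after int, err error) {
	var dst bytes.Buffer // stands in for a file or backupstorage handle
	w := bufio.NewWriterSize(&dst, bufSize)
	if _, err = w.Write(payload); err != nil {
		return 0, 0, err
	}
	before = dst.Len()
	// Flush pushes any buffered bytes to the destination.
	if err = w.Flush(); err != nil {
		return before, 0, err
	}
	return before, dst.Len(), nil
}

func main() {
	before, after, err := bytesVisible([]byte("0123456789"), 2*1024*1024)
	if err != nil {
		panic(err)
	}
	fmt.Println("before flush:", before, "after flush:", after)
}
```

A payload smaller than the buffer stays in memory until `Flush`, which is why the buffer size determines how often the destination is actually written to.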
Changes look good to me .. Few observations
@@ -128,6 +137,8 @@ func init() {
func registerBuiltinBackupEngineFlags(fs *pflag.FlagSet) {
	fs.DurationVar(&BuiltinBackupMysqldTimeout, "builtinbackup_mysqld_timeout", BuiltinBackupMysqldTimeout, "how long to wait for mysqld to shutdown at the start of the backup.")
	fs.DurationVar(&builtinBackupProgress, "builtinbackup_progress", builtinBackupProgress, "how often to send progress updates when backing up large files.")
	fs.IntVar(&builtinBackupFileReadBufferSize, "builtinbackup-file-read-buffer-size", builtinBackupFileReadBufferSize, "read files from disk in blocks of this many bytes. Golang defaults are used when set to 0.")
minor: should we make it UintVar instead?
Done!
I think you need to generate help text again ... Rest looks good
Doh, fixed
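For readers following the `IntVar` versus `UintVar` suggestion in this thread, the sketch below shows the practical difference: an unsigned flag rejects negative sizes at parse time. It uses the standard library `flag` package rather than `pflag` (which the PR uses) so that it stays self-contained; the function `parseReadBufferSize` is hypothetical.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseReadBufferSize is a hypothetical helper that registers the read
// buffer flag as an unsigned integer on a throwaway FlagSet and parses args.
// Note: this uses the standard library flag package, not pflag.
func parseReadBufferSize(args []string) (uint, error) {
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of stderr for the demo
	var size uint
	fs.UintVar(&size, "builtinbackup-file-read-buffer-size", 0,
		"read files from disk in blocks of this many bytes. Golang defaults are used when set to 0.")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return size, nil
}

func main() {
	size, err := parseReadBufferSize([]string{"--builtinbackup-file-read-buffer-size=2097152"})
	fmt.Println(size, err)

	_, err = parseReadBufferSize([]string{"--builtinbackup-file-read-buffer-size=-1"})
	fmt.Println("negative value rejected:", err != nil)
}
```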
…ltinbackup-file-write-buffer-size Signed-off-by: Max Englander <[email protected]>
That's definitely possible. This PR does not change current behavior, so in my opinion it would be fine to merge this first, but wait until after the metrics PR is in prod before turning on these flags in prod. I leave it to you to decide :)
Looks good to me. There is a failure in vreplication_across_db_versions, which is because of #12253.
@@ -66,6 +66,7 @@ var (

const (
	streamModeTar = "tar"
	writerBufferSize = 2 * 1024 * 1024 /*2 MiB*/
why was this moved from builtin to this file? It's not actually used by xtrabackupengine AFAICT.
It is used!
vitess/go/vt/mysqlctl/xtrabackupengine.go
Line 327 in da69672
buffer := bufio.NewWriterSize(file, writerBufferSize)
ah it was already being used. And it is no longer used in builtin. 👍
Co-authored-by: Deepthi Sigireddi <[email protected]> Signed-off-by: Max Englander <[email protected]>
go/flags/endtoend/vttestserver.txt
--builtinbackup-file-read-buffer-size uint read files from disk in blocks of this many bytes. Golang defaults are used when set to 0.
--builtinbackup-file-write-buffer-size uint write files to disk in blocks of this many bytes. (default 2097152)
These have nothing to do with block device block sizes, right? If so, IMO it would be a little more clear to say chunks instead of blocks in this context. Or it may be even more clear/precise to say that we use a buffer of this size each time that we read/write backup data from disk?
Description
The builtin backup engine reads files from disk, optionally compresses them, and sends them to backupstorage.
Currently, the backup engine uses the *os.File returned by os.Open to read files from disk.
vitess/go/vt/mysqlctl/builtinbackupengine.go
Line 166 in 7fc1b48
It then passes this to io.Copy.
vitess/go/vt/mysqlctl/builtinbackupengine.go
Line 718 in 7fc1b48
io.Copy reads in blocks of 32*1024 bytes.
This is not optimal in all environments. In some environments, it's better to read bigger blocks less frequently.
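The effect described above can be observed with a small, self-contained sketch. It is not the PR's code: `maxReadRecorder`, `plainWriter`, and `largestRead` are hypothetical names, and the source is an in-memory reader rather than a file. Both wrapper types deliberately implement only `Read` or `Write`, so `io.Copy` cannot take its `WriterTo`/`ReaderFrom` shortcuts and falls back to its internal copy buffer; a real `*os.File` may follow a different path.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// maxReadRecorder remembers the largest buffer ever passed to Read.
type maxReadRecorder struct {
	r   io.Reader
	max int
}

func (m *maxReadRecorder) Read(p []byte) (int, error) {
	if len(p) > m.max {
		m.max = len(p)
	}
	return m.r.Read(p)
}

// plainWriter discards data and implements only io.Writer.
type plainWriter struct{ n int64 }

func (w *plainWriter) Write(p []byte) (int, error) {
	w.n += int64(len(p))
	return len(p), nil
}

// largestRead copies 4 MiB through io.Copy and returns the largest read
// requested from the underlying source. bufSize == 0 means no bufio wrapper.
func largestRead(bufSize int) int {
	rec := &maxReadRecorder{r: bytes.NewReader(make([]byte, 4*1024*1024))}
	var src io.Reader = rec
	if bufSize > 0 {
		src = bufio.NewReaderSize(rec, bufSize)
	}
	if _, err := io.Copy(&plainWriter{}, src); err != nil {
		panic(err)
	}
	return rec.max
}

func main() {
	fmt.Println("largest read without a buffer:", largestRead(0))
	fmt.Println("largest read with a 2 MiB buffer:", largestRead(2*1024*1024))
}
```

With the wrapper in place, the underlying source is asked for up to the configured buffer size per read instead of 32 KiB, which is the "bigger blocks less frequently" behavior the description refers to.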
Changes
This PR:
Evidence
Coming up with a benchmark to prove that this is useful is difficult, but anecdotally I've seen a modest performance improvement from increasing the size of the read buffer. I'm not sure if the performance improvement is coming from using the hardware more effectively, or if there are more subtle interactions happening between all the moving pieces of mysqlctl.Backup.
My test environment looks like this:
compression-engine=external, external-compressor="zstd -c -T4 -1".
Running vtbackup, the baseline backup time is ~215s.
When I add a read buffer of 2MiB, that goes down to ~175s (~18% improvement).
If we feel it's needed, I can try to put together more complete & reproducible evidence.
Related Issue(s)
Fixes #12069
Checklist
Deployment Notes