-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vreplication: performance & benchmarks (table copying) #7881
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
go/bytes2/buffer.go
Outdated
func (buf *Buffer) StringUnsafe() string { | ||
return *(*string)(unsafe.Pointer(&buf.bytes)) | ||
} | ||
|
||
func (buf *Buffer) Reset() { | ||
buf.bytes = buf.bytes[:0] | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised the linter didn't catch this. Exported methods need comments.
@@ -284,6 +284,36 @@ func (v Value) EncodeSQL(b BinWriter) { | |||
} | |||
} | |||
|
|||
// EncodeSQLStringBuilder is identical to EncodeSQL but it takes a strings.Builder | |||
// as its writer, so it can be inlined for performance. | |||
func (v Value) EncodeSQLStringBuilder(b *strings.Builder) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have two sets of duplicated code in this file that are crying out for generics.
I don't suppose it is worth generating them (since it is only two copies).
@@ -211,7 +207,7 @@ func (vc *VitessCluster) AddKeyspace(t *testing.T, cells []*Cell, ksName string, | |||
keyspace.VSchema = vschema | |||
for _, cell := range cells { | |||
if len(cell.Vtgates) == 0 { | |||
fmt.Println("Starting vtgate") | |||
t.Logf("Starting vtgate") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still get logs to console with this change?
I usually run go test -alsologtostderr -v
.
Noting that I'm going to review this, probably slowly 😛 |
b59e211
to
5df0ff8
Compare
Signed-off-by: Vicent Marti <[email protected]>
Signed-off-by: Vicent Marti <[email protected]>
Signed-off-by: Vicent Marti <[email protected]>
Signed-off-by: Vicent Marti <[email protected]>
Signed-off-by: Vicent Marti <[email protected]>
} | ||
defer conn.Close() | ||
|
||
query := fmt.Sprintf("LOAD DATA INFILE '%s' INTO TABLE `%s` FIELDS TERMINATED BY ',' ENCLOSED BY '\"'", tmpbulk.Name(), table) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assumes you are importing a relation that matches exactly the target table. Just making sure: LOAD DATA INFILE
will not work this way if:
- the target table is a materialized view, and has more columns than original table, or:
- this is an Online DDL operation, and target table may look altogether different than source table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this is supposed to be used only when loading data in the unit tests, so the author of the test must make sure that the CSV file it's writing actually fits in the database. I'm not sure how we could enforce that programatically.
Signed-off-by: Vicent Marti <[email protected]>
Description
These upcoming weeks I'm going to be working on
vreplication
performance. It's a pretty complex part of Vitess' codebase and it hasn't received a lot of performance tuning in the past so I'm hoping to get up to speed with the implementation and give it some perf attention.Let's take a look at the changes in this PR, commit by commit:
endtoend: improve vreplication tests
: Critically, there are currently no benchmarks forvreplication
, and it turns out, the system is complex enough (with interdependencies betweenvtctl
,vtgate
,vttablet
andmysqld
) that writing microbenchmarks for it is not feasible. To get a starting point for performance tuning instead, I've written some end-to-end performance tests, which give us benchmarking estimates and let us look at the performance profiles for all the moving parts in the system. The first test included here tackles the simplest (possibly?) and most performance sensitive part of replication: the initial table copying.For those following at home, I might as well explain a little bit some
vreplication
basics:A
vreplication
stream links twovttablet
instances by copying and keeping in sync one or more tables between the underlyingmysqld
instances of each tablet. To accomplish this, the stream is divided in three phases: first off, there is a table copying phase (what we're profiling and optimizing in this PR), which does bulk reads from the source tablet and sends them to the target for bulk inserting. Afterwards, there is a catchup phase, which catches up any row changes that have become outdated during the initial copy of the table. Once the stream has caught up and there's no significant replication lag between the two tablets, we switch to the final phase where we replicate rows in real time from the MySQL binary log.The table copying phase is, for many
vreplication
workflows, the most performance intensive part of the process. For instance, if you're usingvreplication
to re-shard a cluster, you'll want to copy the tables as fast as possible, and once the target has caught up, you want to switch traffic to the new sharding. Hence, you'd spend the vast majority of time doing a copy, and very little time actually streaming. Other uses, just as materializing views, make much more use of streaming, but streaming can often keep up with the changes in the source cluster, because most clusters are not write-heavy, so optimizing that doesn't feel as urgent to me. We'll look at the streaming in the following weeks.pprof: do not create profiles until we start profiling
: this is a small cleanup to the profiling handlers that we wrote forvtgate
andvttablet
. These profiling handlers don't start profiling in their respective processes until they receive an USR2 signal, but they were previously creating an on-disk profile as soon as the process started. Since our integration suite doesn't profile all the available tablets in the integration cluster, this resulted in several zero-sized profiles on disk for the tablets that were not being profiled.endtoend: allow bulk-loading databases
: another important cleanup for our endtoend suite. We need very large datasets in MySQL in order to gather realistic profiling benchmarks, and building these for each test run can be very expensive (expensive enough to make the profile-optimize-build cycle very painful). To speed up this operation, this commit improves Vitess'mysqlctl
helper so that spawnedmysqld
instances can have a configurablesecure-file-priv
path. This flag allows us to use MySQL'sLOAD DATA INFILE
to bulk-load millions of rows from CSV files on disk.vreplication: improve copy performance
: this is the first actual optimization from the CPU profiles of our copy test. It tackles table copy performance on the target side (i.e. the one receiving the rows to be inserted locally). The result is very good, it reduces CPU usage by 60% and memory allocations by 90%. Unfortunately, the effective performance improvement in wall time is not as large: between 15% and 20% faster runtime.Some thoughts on this: one of the issues that we've seen in large production clusters that use
vreplication
is that, since the replication streams connect twovttablet
s, the replication process is competing for CPU cycles with the actual query serving code, which also runs in thevttablet
s. From this point of view, the massive CPU reduction in this optimization is a very good thing, because it means more spare CPU cycles to continue serving MySQL queries from the tablet. Why does it translate into only a 15% reduction in runtime? If we look at a syscall blocking profile of the tablet, the answer is clear: the optimization of the query inserts is so comprehensive that it brings down the runtime of the table copying process all the way down to MySQL insert performance. The copying process cannot go any faster because MySQL doesn't ingest our rows any faster.Not all is lost, of course: there are still things we can do to make MySQL ingest the rows faster. I intend to look into that next week. Another side effect of the fact that the receiving
vttablet
is now bottlenecked in MySQL insert performance, is that are no gains to be obtained by optimizing the row generation in the sourcevttablet
-- in theory. In practice, the same issue we've just discussed also applies to the source tablet: the replication flow also competes for CPU cycles with the actual query load, so any optimizations we perform there, while not reducing the actual wall runtime of the replication process, will result in more query throughput for production servers, so I intend to tackle that next week too.For now, let's wrap this up by taking a look at the impact of this optimization on the flame graphs for the
vttablet
process:Before
After
We can see two clear columns in both profiles: Go runtime (on the left), and our own code (on the right). The profile of our own code is further divided in two clear columns: GRPC streaming reads (left) and query preparation for MySQL bulk inserts (right). The result optimizing the query preparation step is significant, and it shows in how the relative CPU usage of GRPC, which used to be roughly 25% of the runtime, now becomes 75% of the total runtime. Obviously, the actual CPU usage of GRPC is constant between the two profiles, so the fact that it now dominates our CPU profile means that the query preparation is basically using residual amounts of CPU. The next possible optimization here is Protobuf (de)serialization for the incoming rows, which I intend to do next week.
Related Issue(s)
Checklist
Deployment Notes
Impacted Areas in Vitess
Components that this PR will affect: