
perf: optimize write latency using independent IO queues in place of libaio #633

Merged 74 commits on Oct 19, 2020

Conversation

@foreverneverer (Contributor) commented Sep 27, 2020

This PR is based on #569; thanks to @neverchanje for offering an initial implementation of the new async-IO.

The original async-IO is based on Linux AIO, which can become a bottleneck, especially in Learning and Compaction scenarios, and slow down the write path. #568 changed the callback task pool of LibAIO to optimize latency during Compaction, but latency spikes still occasionally occur during Compaction, and Learning (the replica-migration process that typically happens when scaling nodes in or out) generally makes the cluster unavailable.

To avoid interference between different AIO tasks sharing the same LibAIO queue, this PR removes the LibAIO module and introduces a new async-IO implementation based on the rdsn task queue. It handles different IO tasks in separate task queues, isolating low- and high-priority tasks from each other.

In the latest design, Learning IO tasks are assigned to the THREAD_POOL_DEFAULT queue, private-log IO tasks to the THREAD_POOL_REPLICATION_LONG queue, and shared-log IO tasks to the THREAD_POOL_SLOG queue. With the different IO queues isolated from each other, every IO task can be executed efficiently.
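To make the isolation concrete, here is a minimal sketch of the idea, not the PR's exact code: each IO category gets its own rdsn task code bound to a dedicated thread pool, and the blocking read/write runs inside the task body instead of going through a single shared libaio context. The task-code names and the enqueue_aio helper below are illustrative; the tasking::enqueue call pattern mirrors the NFS fix shown later in this description.

```cpp
#include <functional>
#include <utility>

// Illustrative only: bind each IO category to its own thread pool via rdsn
// task codes (these three code names are hypothetical, not the PR's exact ones).
DEFINE_TASK_CODE_AIO(LPC_AIO_LEARN_FILE, TASK_PRIORITY_COMMON, THREAD_POOL_DEFAULT)
DEFINE_TASK_CODE_AIO(LPC_AIO_PRIVATE_LOG, TASK_PRIORITY_COMMON, THREAD_POOL_REPLICATION_LONG)
DEFINE_TASK_CODE_AIO(LPC_AIO_SHARED_LOG, TASK_PRIORITY_COMMON, THREAD_POOL_SLOG)

// Hypothetical helper: run the blocking pread/pwrite inside the task body on
// whichever pool the task code maps to, so a slow Learning copy can no longer
// delay shared-log or private-log writes.
void enqueue_aio(dsn::task_code code, std::function<void()> io_work)
{
    tasking::enqueue(code, nullptr, std::move(io_work), 0);
}
```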

In addition, the Learning request logic may be scheduled on THREAD_POOL_REPLICATION after end_get_file_size (end_get_file_size runs in THREAD_POOL_REPLICATION because, for an RPC ack, the thread pool is the same as the current pool; see rpc_request_task). This can block write operations, especially when the NFS-RateLimiter is configured with a low learning rate. Since the change is small, I fix it in this PR:

```diff
 nfs_client_impl::end_get_file_size()
 {
     ...
-    continue_copy();
+    tasking::enqueue(LPC_NFS_COPY_FILE, nullptr, [this]() { continue_copy(); }, 0);
 }
```

And LPC_NFS_COPY_FILE is assigned to THREAD_POOL_DEFAULT.
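For reference, a task code's pool binding in rdsn is declared with a task-code macro; the declaration for LPC_NFS_COPY_FILE would look roughly like the sketch below (the actual declaration lives in the rdsn sources, so treat this as illustrative).

```cpp
// Sketch: declare LPC_NFS_COPY_FILE so that tasks enqueued with it run on
// THREAD_POOL_DEFAULT rather than on THREAD_POOL_REPLICATION.
DEFINE_TASK_CODE(LPC_NFS_COPY_FILE, TASK_PRIORITY_COMMON, THREAD_POOL_DEFAULT)
```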

Experiments

Here I tested cases during Learning and Compaction to see the effect of the optimization. The Pegasus cluster consists of 2 meta servers and 5 replica servers.

Case 1: Compaction, 30 threads * 3 clients, load workload, 1KB value length

#568 has already shown the effect for most cases. This test reports the average of multiple runs: I executed the YCSB workload three times.

| IO / latency | min | average | p95 | p99 | p999 | p9999 | max |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LibAIO | 340 | 1578 | 3157 | 6423 | 11119 | 325897 | 1291624 |
| NewAIO | 335 | 1558 | 3139 | 6343 | 10175 | 18079 | 177663 |

Below p9999, the two implementations achieve almost the same results. At p9999 and max, however, NewAIO is clearly better.

Case 2: Learning while adding a node, 15 threads * 3 clients, write:read = 3:1, 1KB value length, data = 16GB * 32 partitions
(Figures: LibAIOAddNodeLatency, NewAIOAddNodeLatency)

Case 3: Learning while offlining a node, 15 threads * 3 clients, write:read = 3:1, 1KB value length, data = 16GB * 32 partitions
(Figures: LibAIOOfflineLatency, NewAIO500MBP999)

From the above results, we can confirm that the new async-IO implementation almost eliminates the latency spikes while adding a node and greatly reduces the impact while offlining a node.

Note that repeated tests do not reproduce exactly the same latency values, but the conclusions are consistent across multiple runs.

@foreverneverer marked this pull request as ready for review October 12, 2020 06:57
hycdong previously approved these changes Oct 12, 2020
neverchanje previously approved these changes Oct 19, 2020
levy5307 previously approved these changes Oct 19, 2020
Labels
type/performance