Allow to change the worker thread number in-flight #1855

git-hulk · 2023-10-24T13:13:52Z

This closes #1802

Scaling up is simple: start new threads and add them to the worker pool.
Scaling down is more complex because existing connections must be migrated from
the to-be-removed workers to the surviving workers. This is NOT easy because
blocking commands store their context (including the fd and owner pointer) in
various global variables, which we would need to find and remove. As far as I know,
the affected commands are:

All LIST blocking commands
Subscribe/PSubscribe
XREAD

To keep it simple, we close a connection if its last command was a blocking
command. We can improve this if users complain about it.
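A minimal sketch of the scale-down policy described above, not the actual kvrocks code. `Conn`, `Worker`, `ScaleDownResult`, and `ScaleDown` are hypothetical names introduced for illustration, and the round-robin choice of surviving worker is an assumption; the description does not say how the target worker is picked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the kvrocks types.
struct Conn {
  int fd;
  bool last_cmd_blocking;  // e.g. a LIST blocking command, SUBSCRIBE/PSUBSCRIBE, XREAD
};

struct Worker {
  std::vector<Conn> conns;
};

struct ScaleDownResult {
  std::size_t migrated = 0;
  std::size_t closed = 0;
};

// Shrink `workers` to `target` entries. Connections owned by the removed
// workers are either closed (last command was blocking) or handed to a
// surviving worker in round-robin order (an assumed policy).
ScaleDownResult ScaleDown(std::vector<Worker> &workers, std::size_t target) {
  ScaleDownResult result;
  if (target == 0 || target >= workers.size()) return result;  // nothing to remove

  std::size_t next = 0;
  for (std::size_t i = target; i < workers.size(); ++i) {
    for (const Conn &conn : workers[i].conns) {
      if (conn.last_cmd_blocking) {
        ++result.closed;  // a real server would close the socket here
        continue;
      }
      workers[next % target].conns.push_back(conn);
      ++next;
      ++result.migrated;
    }
  }
  workers.resize(target);  // drop the to-be-removed workers
  return result;
}
```

Closing the blocked connections avoids having to hunt down their context in the global structures, at the cost of forcing those clients to reconnect.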

guojidan · 2023-10-25T08:52:20Z

Nice, long-lived connections are not disconnected when the number of workers changes. I tested with continuous writes from redis-benchmark.

git-hulk · 2023-10-25T09:10:51Z

> Nice, long-lived connections are not disconnected when the number of workers changes. I tested with continuous writes from redis-benchmark.

Thank you! I'm working on test cases.

mapleFU · 2023-10-26T03:55:58Z

Should we prevent thread-cnt from being set to a dangerous value (e.g., 0)?

git-hulk · 2023-10-26T10:19:03Z

> Should we prevent thread-cnt from being set to a dangerous value (e.g., 0)?

@mapleFU Yes, IntField will check the value range before setting.

{"workers", true, new IntField(&workers, 8, 1, 256)},
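To illustrate the range check mentioned above, here is a simplified sketch of a range-validated integer field. `RangedIntField` is a hypothetical class, not kvrocks' real `IntField`; it only assumes the argument order shown in the line above (receiver, default, min, max) and needs C++17 for `std::from_chars`.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical simplified field; kvrocks' real IntField differs in detail.
class RangedIntField {
 public:
  RangedIntField(int *receiver, int default_value, int min, int max)
      : receiver_(receiver), min_(min), max_(max) {
    *receiver_ = default_value;
  }

  // Stores the value and returns true only if `v` is an integer in [min, max].
  // On failure the previously stored value is left untouched.
  bool Set(const std::string &v) {
    int parsed = 0;
    const char *begin = v.data();
    const char *end = v.data() + v.size();
    auto [ptr, ec] = std::from_chars(begin, end, parsed);
    if (ec != std::errc() || ptr != end) return false;  // not a clean integer
    if (parsed < min_ || parsed > max_) return false;   // out of range
    *receiver_ = parsed;
    return true;
  }

 private:
  int *receiver_;
  int min_;
  int max_;
};
```

With bounds of 1 and 256, a request to set the value to 0 is rejected before any worker thread is touched.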


git-hulk · 2023-10-27T14:04:06Z

src/config/config.cc

+          {"workers",
+           [](Server *srv, const std::string &k, const std::string &v) -> Status {
+             if (!srv) return Status::OK();
+             srv->AdjustWorkerThreads();
+             return Status::OK();
+           }},


Only these lines are newly added; the rest was reformatted by clang-format.
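For readers unfamiliar with the pattern in the diff, a rough sketch of a per-key callback table follows. `FakeServer`, `SetCallback`, `MakeCallbacks`, and `ApplyConfigSet` are hypothetical names, and `bool` stands in for the `Status` return type, which is not reproduced here.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for the server; it only counts adjustment requests.
struct FakeServer {
  int adjust_calls = 0;
  void AdjustWorkerThreads() { ++adjust_calls; }
};

using SetCallback =
    std::function<bool(FakeServer *, const std::string &, const std::string &)>;

// Build the table of callbacks that run after a config key has been set.
std::map<std::string, SetCallback> MakeCallbacks() {
  std::map<std::string, SetCallback> callbacks;
  callbacks.emplace("workers",
                    [](FakeServer *srv, const std::string &, const std::string &) {
                      if (!srv) return true;  // no server yet: nothing to adjust
                      srv->AdjustWorkerThreads();
                      return true;
                    });
  return callbacks;
}

// Run the callback registered for `key`, if any. Keys without a callback succeed.
bool ApplyConfigSet(FakeServer *srv, const std::string &key, const std::string &value) {
  static const std::map<std::string, SetCallback> callbacks = MakeCallbacks();
  auto it = callbacks.find(key);
  if (it == callbacks.end()) return true;
  return it->second(srv, key, value);
}
```

The null check mirrors the `if (!srv) return Status::OK();` line in the diff, so the callback is a no-op when no server instance is available.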

git-hulk · 2023-10-27T14:06:33Z

src/server/server.cc

@@ -417,8 +419,6 @@ void Server::SubscribeChannel(const std::string &channel, redis::Connection *con
  std::lock_guard<std::mutex> guard(pubsub_channels_mu_);

  auto conn_ctx = ConnContext(conn->Owner(), conn->GetFD());
-  conn_ctxs_[conn_ctx] = true;


Nothing uses conn_ctxs_ anymore, so delete it.

torwig

A few comments, but the rest LGTM. @git-hulk Good job!


torwig

LGTM.

src/server/server.cc

mapleFU · 2023-10-29T09:41:56Z

I'm a bit confused about the ctor ordering here:

Server::~Server calls cleanupExitedWorkerThreads()
cron calls cleanupExitedWorkerThreads()

However, when joining, would the exited worker threads be cleaned up?

git-hulk · 2023-10-29T11:51:28Z

> I'm a bit confused about the ctor ordering here:
>
> Server::~Server calls cleanupExitedWorkerThreads()
> cron calls cleanupExitedWorkerThreads()
>
> However, when joining, would the exited worker threads be cleaned up?

The cron thread periodically calls cleanupExitedWorkerThreads(), but there is no guarantee that it will do so before the cron thread exits. So the server destructor MUST check again to prevent exceptions.
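The idea of cleaning up both periodically and in the destructor can be sketched as follows. `ThreadReaper`, `Retire`, and `CleanupExited` are hypothetical names, not the kvrocks implementation. The sketch relies on one known C++ rule: destroying a `std::thread` that is still joinable calls `std::terminate`, so every retired thread must be joined before its handle is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper that owns the threads of workers removed from the pool.
class ThreadReaper {
 public:
  // Final safety net: join whatever the periodic cleanup did not get to.
  ~ThreadReaper() { CleanupExited(); }

  // Hand over the thread of a worker that has been asked to stop.
  void Retire(std::thread t) {
    std::lock_guard<std::mutex> guard(mu_);
    exited_.push_back(std::move(t));
  }

  // Join all retired threads and return how many were joined.
  // Safe to call repeatedly, e.g. from a periodic (cron-like) task.
  std::size_t CleanupExited() {
    std::vector<std::thread> pending;
    {
      std::lock_guard<std::mutex> guard(mu_);
      pending.swap(exited_);  // join outside the lock
    }
    for (std::thread &t : pending) {
      if (t.joinable()) t.join();
    }
    return pending.size();
  }

 private:
  std::mutex mu_;
  std::vector<std::thread> exited_;
};
```

Because the periodic call may never run between the last `Retire` and shutdown, the destructor repeats the cleanup; a second call with nothing pending is a cheap no-op.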

src/server/server.cc

mapleFU · 2023-10-29T11:59:17Z

Got it, so the resources are ultimately released by the dtor.

mapleFU

LGTM. Thanks

git-hulk · 2023-10-30T01:13:30Z

Thanks all, merging..

#1855 introduced a data race even when only non-blocking connections are migrated. For example:

T1: C0 (a connection) is running the Config::Set command on W0 (a worker) and C1 is running another command on W1. C0 holds the exclusive lock for executing commands, so C1 waits for the lock inside ExecuteCommands.
T2: C0 migrates C1 to W0 after reducing the number of worker threads, but the ExecuteCommands call that C1 was waiting in continues executing after the migration. So both W0 and W1 access C1 at the same time.

To avoid this, I simply don't migrate a connection that has a running command, and I delay shutting down the workers so that old connections have a chance to finish their running commands.
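A rough sketch of the "skip busy connections, retry later" rule described above. `Conn`, `Worker`, `MigrateIdleConnections`, and `CanShutdown` are hypothetical names, and the single-threaded model deliberately ignores the locking the real server needs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the kvrocks types.
struct Conn {
  int fd;
  bool has_running_command;
};

struct Worker {
  std::vector<Conn> conns;
};

// Move every idle connection from `from` to `to`. Connections that still have
// a running command stay where they are. Returns how many were left behind.
std::size_t MigrateIdleConnections(Worker &from, Worker &to) {
  // Busy connections first, idle ones after; relative order is preserved.
  auto first_idle = std::stable_partition(
      from.conns.begin(), from.conns.end(),
      [](const Conn &c) { return c.has_running_command; });
  to.conns.insert(to.conns.end(), first_idle, from.conns.end());
  from.conns.erase(first_idle, from.conns.end());
  return from.conns.size();
}

// A worker may only be shut down once it owns no connections at all.
bool CanShutdown(const Worker &w) { return w.conns.empty(); }
```

Calling `MigrateIdleConnections` again on a later tick picks up connections whose commands have finished in the meantime, which is the effect of delaying the worker shutdown.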

git-hulk force-pushed the feature/dynamic-workers branch from a6d8ee6 to c30ce43 Compare October 24, 2023 13:27

PragmaTwice reviewed Oct 27, 2023


git-hulk commented Oct 27, 2023


git-hulk requested review from PragmaTwice and torwig October 27, 2023 14:07

git-hulk force-pushed the feature/dynamic-workers branch from aa99ecc to d99e14e Compare October 27, 2023 14:12

git-hulk marked this pull request as ready for review October 27, 2023 14:12

git-hulk marked this pull request as draft October 27, 2023 15:30

git-hulk marked this pull request as ready for review October 28, 2023 07:46

torwig reviewed Oct 28, 2023


git-hulk added 14 commits October 29, 2023 08:55

Allow to change the worker thread number in-flight

0bf0207

Add go test cases

1e2a5fd

Use tbb::concurrent_queue to prevent data race

18801cd

Try to cleanup threads before destory server

04da97a

Close all block connections when decreasing worker threads

4deb904

Fix tidy error

6566824

Fix segment fault

cde651f

Fix wrong blocking state

25710fc

Fix wrong class cast

7bcf301

Fix data race in connection

cb350aa

Add subscribe and psubscribe as the blocking command

169f2ad

Fix

da48790

Reorder add connection

3d2086e

Fix conflicts

cb917e6

git-hulk force-pushed the feature/dynamic-workers branch from 8297bc4 to cb917e6 Compare October 29, 2023 00:56

git-hulk requested a review from torwig October 29, 2023 07:42

torwig previously approved these changes Oct 29, 2023


mapleFU reviewed Oct 29, 2023


git-hulk commented Oct 29, 2023


Update src/server/server.cc

506152b

git-hulk dismissed torwig’s stale review via 506152b October 29, 2023 08:57

git-hulk requested review from mapleFU and torwig October 29, 2023 09:02

mapleFU reviewed Oct 29, 2023


Remove stale comment

f08f907

mapleFU approved these changes Oct 29, 2023


torwig approved these changes Oct 29, 2023


git-hulk merged commit b29ad21 into apache:unstable Oct 30, 2023

git-hulk mentioned this pull request Oct 31, 2023

Graceful shutdown the workers when reducing worker threads #1863

Merged
