Skip to content
This repository has been archived by the owner on Jun 23, 2022. It is now read-only.

feat(split): add meta control split #679

Merged
merged 6 commits into from
Jan 13, 2021

Conversation

hycdong
Copy link
Contributor

@hycdong hycdong commented Nov 27, 2020

Partition split may copy lots of data, making heavy load, it is safe to provide the function pause or cancel it. As a result, we provide three types to control split: pause, restart, cancel.
For pause and restart, admin user can set parent_pidx as a valid partition index, server will only try to pause or restart that specific partition split process. If parent_pidx = -1, server will pause all spitting partition or restart all paused partition.
For cancel, parent_pidx will always be -1, but it should set old_partition_count to avoid duplicated cancel request.
This pull request implement ddl-client control split interface and how meta server handle control split request. What meta server does is update split_status for the partition, and this value will pass to primary replica through on_config_sync. How replica server handle different split_status will be added in further pull request.

@hycdong hycdong marked this pull request as ready for review December 3, 2020 08:33
@@ -48,6 +57,19 @@ class meta_split_service
void
on_add_child_on_remote_storage_reply(error_code ec, register_child_rpc rpc, bool create_new);

static const std::string control_type_str(split_control_type::type type)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about use ENUM_REG for split_control_type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If split_control_type is defined by ENUM_REG, the string will be like dsn::replication::split_control_type::PSC_PAUSE, the function control_type_str is only used when printing log or hint message to show the control type, I think the string using ENUM_REG is a little bit too long.

Comment on lines +197 to +201
error_with<control_split_response>
control_partition_split(const std::string &app_name,
split_control_type::type control_type,
const int32_t parent_pidx,
const int32_t old_partition_count);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may need to consider reimplement the shell command in admin-cli. But certainly, it's not urgent, and the reimplementation will be simple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right~ I have already added three interfaces easier to use.

@@ -48,6 +57,19 @@ class meta_split_service
void
on_add_child_on_remote_storage_reply(error_code ec, register_child_rpc rpc, bool create_new);

static const std::string control_type_str(split_control_type::type type)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use _split_control_type_VALUES_TO_NAMES. This is certainly more extensible. When you add a new control type, you don't have to update this function.

Btw, when the returned type is a value, you don't have to add const.

static const char* control_type_str(split_control_type::type type) {
  auto it = _split_control_type_VALUES_TO_NAMES.find(type);
  if(it == _split_control_type_VALUES_TO_NAMES.end()) {
    return "unknown";
  }
  return it->second;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I left the comment to @levy5307, this function is used to print short name of control_type, and it is hardly to add another new type, I think it is okay to remain the implementation here.

@@ -890,6 +890,38 @@ struct start_partition_split_response
2:string hint_msg;
}

enum split_control_type
{
PSC_PAUSE,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the enums are defined in shrift, mean we will use them as split_control_type:: PSC_PAUSE , is it needed to add the confusing prefix PSC

Copy link
Contributor Author

@hycdong hycdong Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PSC is short for partition split control, actually it is redundant, I have already removed it.


auto iter = app->helpers->split_states.status.find(parent_pidx);
if (iter == app->helpers->split_states.status.end()) {
response.err = control_type == split_control_type::PSC_PAUSE ? ERR_CHILD_REGISTERED
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why return different err code? shoudn't it be all ERR_INVALID_STATE?

Copy link
Contributor Author

@hycdong hycdong Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If user tries to pause split has already finished, ERR_CHILD_REGISTERED may be better to describe split is finished.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's also ok(actually the hint message is enough)


if (iter->second == split_status::PAUSING || iter->second == split_status::PAUSED) {
dwarn_f("duplicated pause request for app({}), partition[{}]", app_name, parent_pidx);
response.err = ERR_OK;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ERR_OK?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an old implementation problem, I have already updated and refacored it as your comment.

src/meta/meta_split_service.cpp Outdated Show resolved Hide resolved
src/meta/meta_split_service.cpp Show resolved Hide resolved
@neverchanje neverchanje merged commit 20b2dfc into XiaoMi:master Jan 13, 2021
@hycdong hycdong deleted the meta_control_split branch January 13, 2021 04:44
zhangyifan27 pushed a commit to zhangyifan27/rdsn that referenced this pull request Jan 14, 2021
zhangyifan27 pushed a commit to zhangyifan27/rdsn that referenced this pull request Jan 14, 2021
zhangyifan27 pushed a commit to zhangyifan27/rdsn that referenced this pull request Jan 25, 2021
zhangyifan27 pushed a commit to zhangyifan27/rdsn that referenced this pull request Jan 26, 2021
zhangyifan27 pushed a commit to zhangyifan27/rdsn that referenced this pull request Jan 26, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants