Skip to content
This repository has been archived by the owner on Jun 23, 2022. It is now read-only.

feat(split): supplement two cases after implementing cancel split #742

Merged
merged 1 commit into from
Feb 18, 2021

Conversation

hycdong
Copy link
Contributor

@hycdong hycdong commented Feb 1, 2021

In pervious pull requests, cancel partition split is implemented, after adding those functions, we need to handle some cases specially.

  1. In function register_child_on_meta
  • when all replicas in group finish split, primary parent will send rpc to register child on meta server, if split is canceled, this register rpc should be failed
  1. In function on_config_sync
  • when partition split is canceled, the useless child replica is still in replica server memory, it will be included in on_config_sync rpc, meta server should not assert in this case, should let replica server gc this replica.

@hycdong hycdong changed the title feat(split): supplement two cases after implementing pause cancel split feat(split): supplement two cases after implementing cancel split Feb 1, 2021
@hycdong hycdong marked this pull request as ready for review February 7, 2021 08:04
Comment on lines 836 to +854
if (app == nullptr || rep.pid.get_partition_index() >= app->partition_count) {
// app is not recognized or partition is not recognized
dassert(false,
"gpid(%d.%d) on node(%s) is not exist on meta server, administrator "
"should check consistency of meta data",
rep.pid.get_app_id(),
rep.pid.get_partition_index(),
request.node.to_string());
// This app has garbage partition after cancel split, the canceled child
// partition should be gc
if (app != nullptr &&
rep.pid.get_partition_index() < app->partition_count * 2 &&
rep.status == partition_status::PS_ERROR) {
response.gc_replicas.push_back(rep);
dwarn_f("notify node({}) to gc replica({}) because it is useless partition "
"which is caused by cancel split",
request.node.to_string(),
rep.pid);
} else {
// app is not recognized or partition is not recognized
dassert(false,
"gpid({}) on node({}) is not exist on meta server, administrator "
"should check consistency of meta data",
rep.pid,
request.node.to_string());
}
Copy link
Contributor

@neverchanje neverchanje Feb 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are too many embedded levels, you should have sensed a very bad smell in this code. Try to make this big if into a function.

// check_partition_index_validity returns false if the replica is should be GCed.
bool check_partition_index_validity(const shared_ptr<app_state>& app, gpid request_pid) {
  bool should_assert=false;
  if (app == nullptr) {
     should_assert=true;
  } else if (rep.status != partition_status::PS_ERROR && rep.pid.get_partition_index() >= app->partition_count * 2) {
    should_assert=true;
  }
  if (should_assert) {
    dassert(false);
  }
}

if (!check_partition_index_validity( req.pid)) {
  response.gc_replicas.push_back(rep);
}

@hycdong hycdong merged commit 198962f into XiaoMi:master Feb 18, 2021
@hycdong hycdong deleted the handle_stop_case branch February 18, 2021 02:34
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants