Bug#36179509 NdbInfo restart_info reports wrong state when there are no subscribers

Problem:
When a node is started with no nodegroup assigned and
there are no subscribers connected in the cluster, the
NdbInfo restart_info state is reported as "Wait handover
of subscriptions" instead of the expected "Restart
completed".
This happens because, when there are no buckets to hand over
and no subscription reports to send out, the dict lock is
never taken. The update of the ndbinfo.restart_state is
triggered by the dict unlock, so without a lock the update
never happens.

Solution:
In check_start_handover, always request the dict lock, even
when there are no buckets to hand over and no subscription
reports to send out. In that case execDICT_LOCK_CONF releases
the lock right away and continues the restart, so DIH changes
the restart_state of the starting node to "Restart completed".
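
The control flow described above can be illustrated with a small standalone model. This is only a sketch, not NDB code: ToyNode, handover_before_fix and handover_after_fix are hypothetical names, and the model assumes that releasing the dict lock is the only event that moves the restart state forward, and that the non-empty path also ends with an unlock.

```cpp
// Toy model of the restart-state problem. All names here are hypothetical
// and exist only for illustration; none of them are part of NDB.
enum class RestartState { WaitHandover, RestartCompleted };

struct ToyNode {
  RestartState restart_state = RestartState::WaitHandover;
  bool dict_locked = false;
  bool start_phase_done = false;  // stands in for sendSTTORRY()
  unsigned no_of_buckets = 0;
  unsigned no_of_subscriptions = 0;

  void dict_lock() { dict_locked = true; }

  // Assumption of this model: only an unlock of a held lock
  // updates the restart state.
  void dict_unlock() {
    if (dict_locked) {
      dict_locked = false;
      restart_state = RestartState::RestartCompleted;
    }
  }
};

// Old behaviour: the lock is taken only when there is work to do,
// so an idle node finishes its start phase without any unlock.
void handover_before_fix(ToyNode &n) {
  if (n.no_of_buckets != 0 || n.no_of_subscriptions != 0) {
    n.dict_lock();
    // ... hand over buckets / send subscription reports ...
    n.dict_unlock();
  }
  n.start_phase_done = true;
}

// New behaviour: the lock is always taken; with nothing to do it is
// released immediately, which still triggers the state update.
void handover_after_fix(ToyNode &n) {
  n.dict_lock();
  if (n.no_of_buckets == 0 && n.no_of_subscriptions == 0) {
    n.dict_unlock();
    n.start_phase_done = true;
    return;
  }
  // ... hand over buckets / send subscription reports ...
  n.dict_unlock();
  n.start_phase_done = true;
}
```

Calling handover_before_fix on a default-constructed ToyNode leaves restart_state at WaitHandover even though the start phase is done, while handover_after_fix moves it to RestartCompleted.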

Change-Id: I07d3ef6f4e28adf65dd59fad74fe6947f70c5990
vinc13e committed Feb 6, 2024
1 parent beb8474 commit 791077c
Showing 1 changed file with 25 additions and 9 deletions.
34 changes: 25 additions & 9 deletions storage/ndb/src/kernel/blocks/suma/Suma.cpp
@@ -439,6 +439,7 @@ void Suma::execDICT_LOCK_CONF(Signal *signal) {

   DictLockConf *conf = (DictLockConf *)signal->getDataPtr();
   Uint32 state = conf->userPtr;
+  SubscriptionPtr subPtr;

   switch (state) {
     case DictLockReq::SumaStartMe:
@@ -450,6 +451,17 @@ void Suma::execDICT_LOCK_CONF(Signal *signal) {
       return;
     case DictLockReq::SumaHandOver:
       jam();
+      if ((c_no_of_buckets == 0) && (!c_subscriptions.first(subPtr))) {
+        /**
+         * no subscriptions to report
+         * Continue restart
+         */
+        jam();
+        send_dict_unlock_ord(signal, DictLockReq::SumaHandOver);
+        sendSTTORRY(signal);
+        return;
+      }
+
       /**
        * All subscribers are now connected.
        * Report subscriptions details to all the subscribers.
@@ -806,6 +818,7 @@ void Suma::execAPI_START_REP(Signal *signal) {

 void Suma::check_start_handover(Signal *signal) {
   if (c_startup.m_wait_handover) {
+    jam();
     NodeBitmask tmp;
     tmp.assign(c_connected_nodes);
     tmp.bitAND(c_subscriber_nodes);
@@ -814,15 +827,18 @@ void Suma::check_start_handover(Signal *signal) {
     }
     c_startup.m_wait_handover = false;
     SubscriptionPtr subPtr;
-    // Lock the dict only if there are any buckets to handover or
-    // there are subscriptions whose reports need to be sent out
-    if (c_no_of_buckets || c_subscriptions.first(subPtr)) {
-      jam();
-      send_dict_lock_req(signal, DictLockReq::SumaHandOver);
-    } else {
-      jam();
-      sendSTTORRY(signal);
-    }
+
+    /** Lock the dict.
+     * Lock is needed because, at least, one of the three following conditions
+     * is always met:
+     * 1. There are any buckets to handover
+     * 2. There are subscriptions whose reports need to be sent out
+     * 3. No buckets to handover nor subscriptions reports to send out (e.g
+     * start node with no nodegroup assigned and no subscribers connected), but
+     * lock is needed to force dict to trigger DIH to update the ndbinfo
+     * restart_state to RESTART COMPLETED
+     */
+    send_dict_lock_req(signal, DictLockReq::SumaHandOver);
   }
 }
