Fix testFollowerCheckerDetectsUnresponsiveNodeAfterMasterReelection (elastic#84200)

This test would fail if the network partition was introduced while the
master was still publishing a cluster state update and had not yet
received the ack from the victim node. In this case the default publish
timeout means that the master waits for 30s before completing the
stalled publication and moving on to the `node-left` one, but
`ensureStableCluster` also times out after 30s, which leaves little
time for the master to remove the victim node.

This commit reduces the publish timeout to 10s so that the master
recovers well before `ensureStableCluster` times out.

Closes elastic#84172
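
The timing argument above can be sketched as simple arithmetic. This is a minimal illustration, not Elasticsearch code: the class `PublishTimeoutBudget` and its `slack` method are hypothetical names, and the model assumes the worst case in which the stalled publication consumes the entire publish timeout before the `node-left` update can begin. It ignores the time spent on follower checks and on applying the `node-left` update itself.

```java
import java.time.Duration;

/** Hypothetical helper (not part of Elasticsearch) modelling the timing budget. */
public final class PublishTimeoutBudget {

    private PublishTimeoutBudget() {}

    /**
     * Worst-case time left for the master to remove the victim node, assuming a
     * stalled publication uses up the whole publish timeout before the node-left
     * update can start.
     */
    public static Duration slack(Duration stableClusterTimeout, Duration publishTimeout) {
        return stableClusterTimeout.minus(publishTimeout);
    }

    public static void main(String[] args) {
        // 30s is the ensureStableCluster wait quoted in the commit message.
        Duration stableClusterTimeout = Duration.ofSeconds(30);

        // Default publish timeout (30s, per the commit message) versus the reduced one (10s).
        Duration withDefault = slack(stableClusterTimeout, Duration.ofSeconds(30));
        Duration withReduced = slack(stableClusterTimeout, Duration.ofSeconds(10));

        System.out.println("slack with 30s publish timeout: " + withDefault.getSeconds() + "s");
        System.out.println("slack with 10s publish timeout: " + withReduced.getSeconds() + "s");
    }
}
```

Under these assumptions the default leaves no slack at all, while the 10s setting leaves roughly 20s for the master to detect and remove the victim node.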
DaveCTurner committed Feb 22, 2022
1 parent d885bc4 commit bf1668b
Showing 1 changed file with 1 addition and 0 deletions.
@@ -128,6 +128,7 @@ public void testFollowerCheckerDetectsUnresponsiveNodeAfterMasterReelection() th
                 .put(LeaderChecker.LEADER_CHECK_RETRY_COUNT_SETTING.getKey(), "4")
                 .put(FollowersChecker.FOLLOWER_CHECK_TIMEOUT_SETTING.getKey(), "1s")
                 .put(FollowersChecker.FOLLOWER_CHECK_RETRY_COUNT_SETTING.getKey(), 1)
+                .put(Coordinator.PUBLISH_TIMEOUT_SETTING.getKey(), "10s")
                 .build()
         );
     }