improve Akka.Cluster / Akka.Remote DeadLetter logging #7149
Conversation
Racy spec:
Somehow, it looks like a state transition was missed between
Detailed my changes
@@ -44,7 +44,6 @@ public string ShardId(object message)
     => message switch
     {
         int i => (i % 10).ToString(),
-        ShardRegion.StartEntity se => (int.Parse(se.EntityId) % numberOfShards).ToString(),
Removed per analyzer warning
@@ -476,7 +476,7 @@ public async Task A_cluster_must_leave_via_CoordinatedShutdownRun()
     await probe.ExpectMsgAsync<ClusterEvent.MemberLeft>();
     // MemberExited might not be published before MemberRemoved
     var removed = (ClusterEvent.MemberRemoved)await probe.FishForMessageAsync(m => m is ClusterEvent.MemberRemoved);
-    removed.PreviousStatus.Should().BeEquivalentTo(MemberStatus.Exiting);
+    new [] {MemberStatus.Exiting, MemberStatus.Leaving}.Should().Contain(removed.PreviousStatus);
Fix for a racy spec - we'd already implemented this for one of the other "node leaving" specs. The problem is that the PublishChanges message gets DeadLetter'd between the Exiting and Removed statuses - the Removed status is generated, in this case, by the singleton cluster shutting itself down rather than by gossip messages. This is a timing issue that mostly affects the spec and has no real-world impact, so we accept either of these two "terminating" statuses instead.
@@ -257,11 +257,11 @@ public void A_gossip_must_find_two_oldest_per_role_as_targets_for_Exiting_change
     Member a8 = TestMember.Create(new Address("akka.tcp", "sys", "a8", 2552), MemberStatus.Exiting, ImmutableHashSet<string>.Empty.Add("role1"), upNumber: 8);
     Member a9 = TestMember.Create(new Address("akka.tcp", "sys", "a9", 2552), MemberStatus.Exiting, ImmutableHashSet<string>.Empty.Add("role2"), upNumber: 9);

-    IEnumerable<Member> theExiting = new Member[] { a5, a6 };
+    var theExiting = new Member[] { a5, a6 };
Cleaned this up to just use an IReadOnlyCollection<Member> - to avoid some of the multiple enumeration problems GossipTargetsForExitingMembers was susceptible to.
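The multiple-enumeration hazard behind this change is general to LINQ: a lazy `IEnumerable<T>` argument is re-evaluated on every pass, while a materialized `IReadOnlyCollection<T>` is read once and then only re-read as a snapshot. A minimal standalone sketch (not the actual Akka.Cluster code; `CountThenSum` is a hypothetical stand-in for a method that walks its argument twice):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultipleEnumerationDemo
{
    // Hypothetical stand-in for a method like GossipTargetsForExitingMembers:
    // it touches its argument twice, so a lazy sequence is evaluated twice.
    public static int CountThenSum(IEnumerable<int> items)
        => items.Count() + items.Sum(); // two enumerations of 'items'

    public static void Main()
    {
        var evaluations = 0;
        // A lazy sequence: the side effect runs once per element per enumeration pass.
        IEnumerable<int> lazy = Enumerable.Range(1, 3).Select(i => { evaluations++; return i; });

        CountThenSum(lazy);
        Console.WriteLine(evaluations); // 6: the Select ran twice over 3 items

        // Materializing once (as the PR does with .ToArray(), surfaced through an
        // IReadOnlyCollection<T> parameter) means downstream code re-reads a snapshot.
        evaluations = 0;
        IReadOnlyCollection<int> snapshot = lazy.ToArray();
        CountThenSum(snapshot);
        Console.WriteLine(evaluations); // 3: one pass to build the array, none after
    }
}
```

Taking `IReadOnlyCollection<T>` in the signature also documents the contract: the callee may enumerate freely without hidden re-computation.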
@@ -2339,15 +2338,19 @@ public void LeaderActionsOnConvergence()
     }

     PublishMembershipState();
-    GossipExitingMembersToOldest(changedMembers.Where(i => i.Status == MemberStatus.Exiting));
+    GossipExitingMembersToOldest(changedMembers.Where(i => i.Status == MemberStatus.Exiting).ToArray());
Capture an array and pass it here, in order to solve the multiple enumeration problem. No real perf hit - this only happens when leaders are converging on cluster membership status changes.
 }

 /// <summary>
 /// Gossip the Exiting change to the two oldest nodes for quick dissemination to potential Singleton nodes
 /// </summary>
 /// <param name="exitingMembers"></param>
-private void GossipExitingMembersToOldest(IEnumerable<Member> exitingMembers)
+private void GossipExitingMembersToOldest(IReadOnlyCollection<Member> exitingMembers)
Fix for the multi-enumeration problem.
@@ -711,7 +703,7 @@ private void Gated(bool writerTerminated, bool earlyUngateRequested)
     // remote address at the EndpointManager level stopping this actor. In case the remote system becomes reachable
     // again it will be immediately quarantined due to out-of-sync system message buffer and becomes quarantined.
     // In other words, this action is safe.
-    if (_bailoutAt != null && _bailoutAt.IsOverdue)
+    if (_bailoutAt is { IsOverdue: true })
ReSharper suggested some pattern matching here.
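The property pattern `is { IsOverdue: true }` is semantically equivalent to the old null check plus property read: it matches only when the value is non-null and the listed property has the given value. A minimal illustration, using a hypothetical `Deadline` stand-in rather than Akka's actual type:

```csharp
using System;

public sealed class Deadline // hypothetical stand-in, not Akka's Deadline type
{
    public DateTime When { get; init; }
    public bool IsOverdue => DateTime.UtcNow > When;
}

public static class PatternDemo
{
    public static bool BailoutDue(Deadline bailoutAt)
        // Equivalent to: bailoutAt != null && bailoutAt.IsOverdue
        => bailoutAt is { IsOverdue: true };

    public static void Main()
    {
        Console.WriteLine(BailoutDue(null));                                                 // False: null never matches
        Console.WriteLine(BailoutDue(new Deadline { When = DateTime.UtcNow.AddHours(1) }));  // False: not overdue yet
        Console.WriteLine(BailoutDue(new Deadline { When = DateTime.UtcNow.AddHours(-1) })); // True
    }
}
```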
@@ -1195,7 +1187,7 @@ private void Initializing()
     {
         // Assert handle == None?
         Context.Parent.Tell(
             new ReliableDeliverySupervisor.GotUid((int)handle.ProtocolHandle.HandshakeInfo.Uid, RemoteAddress));
Unnecessary cast.
 }

-private Receive ReceiveWithAlwaysLogging()
+private bool ReceiveWithAlwaysLogging(object message)
Minor perf optimization - avoid allocating new behaviors here if we can help it.
 }

-private Receive ReceiveWithMaxCountLogging()
+private bool ReceiveWithMaxCountLogging(object message)
Refactored to be just a method, instead of allocating a new Receive delegate each time.
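The allocation being avoided can be sketched in isolation. Akka.NET's untyped `Receive` is a `bool`-returning delegate over `object`; a factory method that returns a lambda builds a fresh delegate (and closure) on every call, while a plain instance method with the same shape is simply invoked. This is a simplified sketch with a hypothetical listener class, not the actual Akka.Remote actor:

```csharp
using System;

public delegate bool Receive(object message); // shape of Akka.NET's untyped Receive delegate

public class DeadLetterListener // simplified, hypothetical sketch
{
    private int _count;

    // Before: a factory that builds a fresh closure each time the behavior is installed.
    private Receive ReceiveWithAlwaysLoggingFactory()
        => message => { _count++; return true; }; // new delegate instance per call

    // After: a plain method with the Receive shape; nothing allocated when called directly.
    private bool ReceiveWithAlwaysLogging(object message)
    {
        _count++;
        return true;
    }

    public static void Main()
    {
        var listener = new DeadLetterListener();

        // The factory produces a distinct delegate object on every call:
        var a = listener.ReceiveWithAlwaysLoggingFactory();
        var b = listener.ReceiveWithAlwaysLoggingFactory();
        Console.WriteLine(ReferenceEquals(a, b)); // False

        // The method form is just a direct call:
        Console.WriteLine(listener.ReceiveWithAlwaysLogging("some dead letter")); // True
    }
}
```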
@@ -27,7 +27,6 @@ public MessageExtractor(int maxNumberOfShards) : base(maxNumberOfShards)
     public override object EntityMessage(object message)
         => message switch
         {
-            ShardingEnvelope e => e.Message,
Resolving a build warning by removing automatically handled sharding messages.
 public override string ToString()
 {
     return $"GossipEnvelope(from={From}, to={To}, gossip={Gossip}, deadline={Deadline})";
 }
This is the primary DeadLetter logging change I wanted to make here - it should make it much easier to understand what was lost in the DeadLetter pile, in terms of missing gossip messages, when things are shutting down inside cluster nodes.
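The mechanism is simple: dead-letter log lines typically interpolate the lost message object, so without a `ToString()` override they show only the type name. A self-contained sketch of the idea, using a hypothetical envelope with string-typed properties rather than Akka.Cluster's actual `GossipEnvelope`:

```csharp
using System;

// Hypothetical envelope, not Akka.Cluster's actual GossipEnvelope type.
public sealed class GossipEnvelope
{
    public string From { get; init; } = "";
    public string To { get; init; } = "";
    public string Deadline { get; init; } = "";

    // Without this override, a dead-letter log line would show only
    // the type name "GossipEnvelope" - nothing about what was lost.
    public override string ToString()
        => $"GossipEnvelope(from={From}, to={To}, deadline={Deadline})";
}

public static class DeadLetterLogDemo
{
    public static void Main()
    {
        var envelope = new GossipEnvelope { From = "node-a", To = "node-b", Deadline = "none" };
        // A dead-letter logger generally just interpolates the message object,
        // so the override determines how useful the log line is:
        Console.WriteLine($"DeadLetter: {envelope}");
        // prints: DeadLetter: GossipEnvelope(from=node-a, to=node-b, deadline=none)
    }
}
```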
LGTM
Changes
Various cleanup inside Akka.Cluster and Akka.Remote in order to make it easier to debug cluster gossip issues.