2 Coordinators Elected Leader #16411

Closed
razinbouzar opened this issue May 7, 2024 · 5 comments · Fixed by #16425 or #16528 · May be fixed by #16617

Comments

@razinbouzar
Contributor

Affected Version

28.0.1 (also observed in v25)
ZK version 3.7

Description

While patching our underlying EKS nodes, we observe a condition in which two Coordinators are elected leader at the same time. When this happens, we see multiple task failures across different datasources.

@razinbouzar
Contributor Author

Another observation is that this condition occurred during a ZK leader election change.
[Screenshot: zk-leader-change]

@gianm
Contributor

gianm commented May 9, 2024

We saw a double-leader situation recently when a ZK server cycled, and we suspect it has something to do with https://issues.apache.org/jira/browse/CURATOR-696. That Curator Jira suggests a bug was introduced by https://issues.apache.org/jira/browse/CURATOR-644 (PR: apache/curator#430).

It seems possible that this did introduce a bug, since CURATOR-644 changed the reconnection logic. Previously it always called reset() on reconnection (which would recreate the ephemeral znode). Now it calls getChildren(), which looks for an existing znode, and only calls reset() if none is found.
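To make the difference concrete, here is a minimal sketch of the two reconnection policies as described in the paragraph above. It is a toy model, not Curator's code: `ToyZk`, `reconnect_always_reset`, and `reconnect_check_children` are hypothetical names, and the znode store is just a dict mapping znode name to owning session id.

```python
from dataclasses import dataclass, field


@dataclass
class ToyZk:
    """Toy stand-in for a latch directory: znode name -> owning session id."""
    children: dict = field(default_factory=dict)
    next_seq: int = 0

    def create_ephemeral(self, session_id):
        # Sequential names so that sorting gives creation order.
        name = f"latch-{self.next_seq:010d}"
        self.next_seq += 1
        self.children[name] = session_id
        return name

    def delete(self, name):
        self.children.pop(name, None)


def reconnect_always_reset(zk, our_path, session_id):
    """Assumed older behaviour: always drop and recreate our znode."""
    zk.delete(our_path)
    return zk.create_ephemeral(session_id)


def reconnect_check_children(zk, our_path, session_id):
    """Assumed newer behaviour: keep our znode if it is still listed."""
    if our_path in zk.children:
        # Still listed, but possibly owned by an old, not-yet-expired session.
        return our_path
    return zk.create_ephemeral(session_id)


# Session 1 created the znode; the client reconnects as session 2.
zk = ToyZk()
old_path = zk.create_ephemeral(session_id=1)
kept = reconnect_check_children(zk, old_path, session_id=2)
print(kept == old_path, zk.children[kept])  # True 1
```

In this model the "check children" policy keeps a znode that is still owned by session 1 even though the client is now session 2, so the znode disappears when session 1 expires. The "always reset" policy would have replaced it with a znode owned by session 2.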

We updated to Curator 5.4 some time ago, in #13302. So if this is indeed what’s going on, it has potentially been an issue since Druid 25.

What we saw specifically was this scenario:

  • OL 1 was leader prior to ZK connection loss

  • OL 1 reconnected to ZK and got a session id that we believe is a new session id (although we were not able to confirm that)

  • OL 1's LeaderLatch recipe checked the latch path and saw an ephemeral znode there that it believed was its own, so it started leadership.

  • OL 2, 30s later, checked the latch path and saw no children at all (not even the one for OL 1). It created an ephemeral znode for itself, and started leadership.

We think what happened is that both OLs established new sessions, even though the old sessions hadn’t expired yet. Because the old sessions hadn’t expired yet, the old ephemeral znodes were still there upon reconnection. The old leader, OL 1, saw both old znodes there and assumed it was still leader. But because those znodes belonged to the old sessions rather than the new ones, they went away after 30s, when the old sessions expired. When OL 2 noticed that, it assumed there was no active leader, so it became one, and then we had two leaders.
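The suspected sequence can be replayed in a small simulation. This is only a sketch of the hypothesis above, not of Curator or ZooKeeper internals: `ToyEnsemble`, `ToyLatch`, and `simulate` are hypothetical names, the 30s timeout is taken from the gap reported above, and the model assumes that only the non-leader re-checks the latch path once the old znodes vanish.

```python
SESSION_TIMEOUT_S = 30  # assumed; matches the ~30s gap reported above


class ToyEnsemble:
    def __init__(self):
        self.znodes = {}  # znode name -> owning session id
        self.expiry = {}  # session id -> time at which that session expires
        self.seq = 0

    def create_ephemeral(self, session_id):
        name = f"latch-{self.seq:04d}"
        self.seq += 1
        self.znodes[name] = session_id
        return name

    def advance_to(self, now):
        """Expire overdue sessions and drop their ephemeral znodes."""
        dead = {s for s, t in self.expiry.items() if t <= now}
        self.znodes = {n: s for n, s in self.znodes.items() if s not in dead}
        for s in dead:
            del self.expiry[s]


class ToyLatch:
    def __init__(self, name, zk, session_id):
        self.name, self.zk, self.session_id = name, zk, session_id
        self.path = zk.create_ephemeral(session_id)
        self.is_leader = False

    def check_leadership(self):
        """Keep our znode if listed, else recreate it; lowest name leads."""
        if self.path not in self.zk.znodes:
            self.path = self.zk.create_ephemeral(self.session_id)
        self.is_leader = min(self.zk.znodes) == self.path


def simulate():
    zk = ToyEnsemble()
    ol1, ol2 = ToyLatch("OL 1", zk, 1), ToyLatch("OL 2", zk, 2)
    ol1.check_leadership()
    ol2.check_leadership()  # OL 1 leads, OL 2 waits

    # t=0: connection loss. Old sessions 1 and 2 linger until they time out.
    zk.expiry = {1: SESSION_TIMEOUT_S, 2: SESSION_TIMEOUT_S}

    # t=1: both reconnect with NEW sessions, but the old znodes still exist.
    ol1.session_id, ol2.session_id = 3, 4
    ol1.check_leadership()  # sees "its" znode first in line: stays leader
    ol2.check_leadership()  # sees OL 1's znode ahead of it: keeps waiting

    # t=30: old sessions expire and both old znodes vanish.
    zk.advance_to(SESSION_TIMEOUT_S)
    ol2.check_leadership()  # sees no children, creates a znode, leads
    return [latch.name for latch in (ol1, ol2) if latch.is_leader]


print(simulate())  # ['OL 1', 'OL 2']
```

Under these assumptions both latches report leadership at the end, which matches the double-leader state described in the bullets. OL 1 never re-evaluates because, in this model, nothing prompts it to look at the latch path again after it decided it was leader.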

@gianm
Contributor

gianm commented May 9, 2024

I commented on CURATOR-696 linking back here.

@razinbouzar
Contributor Author

@cryptoe can we re-open this issue since #16425 was reverted in #16445?

@kfaraz kfaraz reopened this May 16, 2024
@kfaraz kfaraz self-assigned this May 16, 2024
@razinbouzar
Contributor Author

@gianm Curator 5.7.0 includes the fix for https://issues.apache.org/jira/browse/CURATOR-696. I'm unsure when this version will be made available, but have asked here.

razinbouzar pushed a commit to razinbouzar/druid that referenced this issue May 31, 2024
Added listener method that tracks ZK leader state
razinbouzar pushed a commit to razinbouzar/druid that referenced this issue Jun 17, 2024
Added listener method that tracks ZK leader state