I have found that occasionally (roughly 1 in 20 attempts) some nodes end up out of sync during the startup of a multi-node StrimziKafkaCluster. In the broker logs one can then see:
doesn't match the local key (OptionalInt[1], BKan2iyJb8cHcg-clBpk7A); rejecting the vote (org.apache.kafka.raft.KafkaRaftClient)
[2024-11-25 09:22:31,800] INFO [RaftManager id=1] Voter key for VOTE or BEGIN_QUORUM_EPOCH request didn't match the receiver's replica key: broker-2:9094 (id: 2 rack: null) (org.apache.kafka.raft.KafkaRaftClient)
[2024-11-25 09:22:31,800] INFO [RaftManager id=1] Candidate sent a voter key (Optional[ReplicaKey(id=2, directoryId=Optional.empty)]) in the VOTE request that doesn't match the local key (OptionalInt[1], BKan2iyJb8cHcg-clBpk7A); rejecting the vote (org.apache.kafka.raft.KafkaRaftClient)
[2024-11-25 09:22:31,800] INFO [RaftManager id=1] Voter key for VOTE or BEGIN_QUORUM_EPOCH request didn't match the receiver's replica key: broker-2:9094 (id: 2 rack: null) (org.apache.kafka.raft.KafkaRaftClient)
[2024-11-25 09:22:31,800] INFO [RaftManager id=1] Candidate sent a voter key (Optional[ReplicaKey(id=2, directoryId=Optional.empty)]) in the VOTE request that doesn't match the local key (OptionalInt[1], BKan2iyJb8cHcg-clBpk7A); rejecting the vote (org.apache.kafka.raft.KafkaRaftClient)
[2024-11-25 09:22:31,800] INFO [RaftManager id=1] Voter key for VOTE or BEGIN_QUORUM_EPOCH request didn't match the receiver's replica key: broker-2:9094 (id: 2 rack: null) (org.apache.kafka.raft.KafkaRaftClient)
[2024-11-25 09:22:31,800] INFO [RaftManager id=1] Candidate sent a voter key (Optional[ReplicaKey(id=2, directoryId=Optional.empty)]) in the VOTE request that doesn't match the local key (OptionalInt[1], BKan2iyJb8cHcg-clBpk7A); rejecting the vote (org.apache.kafka.raft.KafkaRaftClient)
[2024-11-25 09:22:31,800] INFO [RaftManager id=1] Voter key for VOTE or BEGIN_QUORUM_EPOCH request didn't match the receiver's replica key: broker-2:9094 (id: 2 rack: null) (org.apache.kafka.raft.KafkaRaftClient)
[2024-11-25 09:22:31,800] INFO [RaftManager id=1] Candidate sent a voter key (Optional[ReplicaKey(id=2, directoryId=Optional.empty)]) in the VOTE request that doesn't match the local key (OptionalInt[1], BKan2iyJb8cHcg-clBpk7A); rejecting the vote (org.apache.kafka.raft.KafkaRaftClient)
[2024-11-25 09:22:31,800] INFO [RaftManager id=1] Voter key for VOTE or BEGIN_QUORUM_EPOCH request didn't match the receiver's replica key: broker-2:9094 (id: 2 rack: null) (org.apache.kafka.raft.KafkaRaftClient)
[
This eventually leads to delayed tests: instead of about 14 s, the test runs for approximately 65 s.
This is the same test case with 0.108.0 (which uses Kafka 3.8.0) and it seems fine.
The error means that one node thinks "node 1" is "node 2", so it rejects the vote. The log seems to be truncated and is missing the earlier part. Do we have the complete logs from the controller nodes?
Also, this failure started after adopting Kafka v3.9.0, right?
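To make the mismatch described above easier to spot in a long broker log, here is a minimal sketch that extracts the candidate's voter id and the receiver's local id from the "rejecting the vote" lines. The function name `find_rejected_votes` and the dictionary keys are hypothetical, and the pattern is derived only from the log lines quoted in this issue, so it is an assumption that other Kafka versions word the message the same way. Whitespace is stripped before matching so that copy-pasted logs with lost spaces still match.

```python
import re

# Pattern built from the "rejecting the vote" line quoted in this issue.
# It is matched against the line with ALL whitespace removed, so logs that
# lost their spaces during copy/paste are handled as well.
_REJECTED_VOTE = re.compile(
    r"\[RaftManagerid=(?P<receiver_id>\d+)\]"
    r"Candidatesentavoterkey"
    r"\(Optional\[ReplicaKey\(id=(?P<voter_id>\d+),directoryId=(?P<voter_dir>[^)]*)\)\]\)"
    r"intheVOTErequestthatdoesn'tmatchthelocalkey"
    r"\(OptionalInt\[(?P<local_id>\d+)\],(?P<local_dir>[^)]+)\)"
)


def find_rejected_votes(log_text):
    """Return one dict per rejected-vote line found in log_text."""
    hits = []
    for line in log_text.splitlines():
        compact = re.sub(r"\s+", "", line)  # drop every whitespace character
        match = _REJECTED_VOTE.search(compact)
        if match is None:
            continue
        voter_id = int(match.group("voter_id"))
        local_id = int(match.group("local_id"))
        hits.append({
            "receiver_id": int(match.group("receiver_id")),
            "voter_id": voter_id,          # id the candidate put in the request
            "voter_dir": match.group("voter_dir"),
            "local_id": local_id,          # id the receiving node has locally
            "local_dir": match.group("local_dir"),
            "id_mismatch": voter_id != local_id,
        })
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_rejected_votes.py < broker.log
    for hit in find_rejected_votes(sys.stdin.read()):
        print(hit)
```

Running this over the complete controller logs would show which node ids are being confused, which is the information asked for above.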
For clarity, this is the same test case but with Kafka 3.8.1 (which is fine). Each test run takes about 8-10 s, but with Kafka 3.9.0 it takes 15 s (and, when the problem occurs, about 60 s).
Test log:
[1] - https://gist.github.com/see-quick/e707c1b7e7d6da9ee4ed2ac32ad02df4
[2] - https://gist.github.com/see-quick/43a56c1683babad905c002708a674d49