Optimizing Kafka Stretch Clusters: Tackling Leader Election Instability with Submariner, Istio, and Cilium #14
In a Kafka stretch cluster deployed across multiple Kubernetes clusters, with Submariner handling cross-cluster communication, leader election becomes unstable because of high network latency between the clusters. The Raft-based controller quorum struggles to reach consensus, which leads to frequent leadership changes. What are the possible root causes of this instability?
I hope the following answers your question.
Root Causes
Optimizations for Kafka & Submariner
Istio-based Solution
Istio is still under investigation, so I don't have an answer on it yet.
Cilium-based Solution
I think you should use Cilium's ClusterMesh for more efficient inter-cluster service discovery and load balancing. Enable BPF-based policies so that packets are processed efficiently instead of relying on traditional tunneling. Monitor XDP (eXpress Data Path) stats to analyze and reduce packet processing delays.
Root Causes
If controller.quorum.voters is not optimally distributed across clusters, it may lead to uneven voting power and split-brain scenarios.
Optimizations for Kafka & Submariner
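One way to sanity-check the voter-distribution concern is to parse the controller.quorum.voters string and ask, for each Kubernetes cluster, whether the remaining voters would still form a majority if that cluster became unreachable. The sketch below is only an illustration under stated assumptions: the helper names (parse_voters, cluster_of, survives_cluster_loss) are hypothetical, and it assumes controller hostnames follow a "<pod>.<cluster-id>.<service>..." pattern such as Submariner's clusterset.local names for headless services. Adjust cluster_of if your naming differs.

```python
from collections import Counter


def parse_voters(voters: str) -> list[tuple[int, str, int]]:
    """Parse a controller.quorum.voters value of the form 'id@host:port,...'."""
    parsed = []
    for entry in voters.split(","):
        node_id, endpoint = entry.strip().split("@", 1)
        host, port = endpoint.rsplit(":", 1)
        parsed.append((int(node_id), host, int(port)))
    return parsed


def cluster_of(host: str) -> str:
    # ASSUMPTION: the second DNS label is the cluster ID,
    # e.g. controller-0.cluster-a.kafka.kafka.svc.clusterset.local
    return host.split(".")[1]


def survives_cluster_loss(voters: str) -> dict[str, bool]:
    """For each cluster, report whether a majority of voters remains if it is lost."""
    nodes = parse_voters(voters)
    majority = len(nodes) // 2 + 1
    per_cluster = Counter(cluster_of(host) for _, host, _ in nodes)
    return {c: len(nodes) - n >= majority for c, n in per_cluster.items()}


# Hypothetical service/namespace names and port, for illustration only.
SUFFIX = "kafka.kafka.svc.clusterset.local:9093"

# Two of three voters sit in cluster-a, so losing cluster-a loses the quorum.
skewed = ",".join([
    f"1@controller-0.cluster-a.{SUFFIX}",
    f"2@controller-1.cluster-a.{SUFFIX}",
    f"3@controller-0.cluster-b.{SUFFIX}",
])

# One voter per cluster: any single cluster can fail without losing the majority.
balanced = ",".join([
    f"1@controller-0.cluster-a.{SUFFIX}",
    f"2@controller-0.cluster-b.{SUFFIX}",
    f"3@controller-0.cluster-c.{SUFFIX}",
])

print(survives_cluster_loss(skewed))    # {'cluster-a': False, 'cluster-b': True}
print(survives_cluster_loss(balanced))  # {'cluster-a': True, 'cluster-b': True, 'cluster-c': True}
```

The check only covers voter placement. It says nothing about latency, so a layout that passes here can still see frequent elections if cross-cluster round trips are slow relative to the quorum timeouts.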