Allow configuration of kubernetes leader election parameters #663
Comments
Another possibility is to split the leader election into a separate Deployment, apart from the worker DaemonSet:
This brings up an interesting point. Leader election was probably never built for applications running in a DaemonSet, especially in such a large cluster. I do see how this is a problem for you. Using leader election is probably not the best long-term solution for Spegel. It was chosen as the best solution to two problems that had to be solved for bootstrapping. The first is that for a peer to connect to another, it needs that peer's ID, which includes a randomly generated public key. The second is that all peers need to agree on the same set of peer(s) to initially connect to; if a random peer were selected, a split cluster could in theory be created. I am happy to discuss alternatives to solve this problem. The two main things any solution needs to provide are that the public key is shared and that the same peers are selected. One option is, for example, to choose the oldest Spegel instances, but that does not solve the public key sharing part.
Happy new year! How do you feel about having a small "leader-election" group of pods, and then a DaemonSet that actually performs caching? I think this lets us keep the simple-to-understand leader election without spamming the Kubernetes API.
I think it is a good solution for users with large clusters. We could make this an opt-in feature that is disabled by default. When enabled, a Deployment with n (3?) replicas is created whose pods do leader election among themselves, and all other DaemonSet pods use the leader to bootstrap. This way there is a known number of pods doing leader election even in very large clusters. There may be a better solution to this problem that I am not seeing yet, but striving for perfection won't solve things today, so it's better to revisit this at a later time.
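If the split were implemented, the DaemonSet pods would only need read access to the election Lease to find the current leader. Below is a minimal sketch of that lookup, assuming hypothetical names for the namespace and Lease, and assuming the holder identity carries something the workers can bootstrap from; Spegel's actual lease name and how it encodes the leader's peer ID may differ.

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// lookupLeader reads the coordination.k8s.io Lease and returns the holder
// identity, which the DaemonSet pods could use to locate their bootstrap peer.
// Namespace and lease name are illustrative, not Spegel's actual values.
func lookupLeader(ctx context.Context, cs kubernetes.Interface) (string, error) {
	lease, err := cs.CoordinationV1().Leases("spegel").Get(ctx, "spegel-leader-election", metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	if lease.Spec.HolderIdentity == nil || *lease.Spec.HolderIdentity == "" {
		return "", fmt.Errorf("no leader elected yet")
	}
	return *lease.Spec.HolderIdentity, nil
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(cfg)
	leader, err := lookupLeader(context.Background(), cs)
	if err != nil {
		panic(err)
	}
	fmt.Println("bootstrap peer:", leader)
}
```

Reading the Lease is a single GET per worker at startup, which is much cheaper than every node participating in the election loop.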
Describe the problem to be solved
Hello, thanks again for this great library. Some of our clusters have a large number of nodes (>1000), and the leader election has become a significant portion of the requests made to the k8s API server. As a stopgap, we're considering tweaking the leader election parameters to reduce the number of lease calls made to the k8s API server.
Currently those values are hard-coded here.
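For context, these are the client-go knobs that would need to become configurable. The sketch below shows how LeaseDuration, RenewDeadline, and RetryPeriod fit together; the flag names, default durations, namespace, and lease name are illustrative assumptions, not Spegel's actual configuration. Longer durations mean fewer Lease API calls at the cost of slower leader failover.

```go
package main

import (
	"context"
	"flag"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	// Hypothetical flags; client-go requires LeaseDuration > RenewDeadline > RetryPeriod.
	leaseDuration := flag.Duration("lease-duration", 60*time.Second, "how long a lease is valid")
	renewDeadline := flag.Duration("renew-deadline", 40*time.Second, "deadline for the leader to renew the lease")
	retryPeriod := flag.Duration("retry-period", 10*time.Second, "interval between lease acquisition attempts")
	flag.Parse()

	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(cfg)
	hostname, _ := os.Hostname()

	lock := &resourcelock.LeaseLock{
		LeaseMeta:  metav1.ObjectMeta{Namespace: "spegel", Name: "spegel-leader-election"},
		Client:     cs.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: hostname},
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:            lock,
		LeaseDuration:   *leaseDuration,
		RenewDeadline:   *renewDeadline,
		RetryPeriod:     *retryPeriod,
		ReleaseOnCancel: true,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) { /* advertise self as bootstrap peer */ },
			OnStoppedLeading: func() { /* stop advertising */ },
		},
	})
}
```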
Proposed solution to the problem
A couple of questions/notes:
Another option we're considering is using the Kubernetes Endpoints API to discover peers (the equivalent of `kubectl get endpoints spegel`, but for port 5001), which might circumvent the need for a leader to discover peers. We're not sure whether this would cause odd behavior and would appreciate feedback on it.
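For reference, here is a hedged sketch of what that Endpoints-based discovery could look like, assuming the Service is named spegel in a spegel namespace and the router port is 5001 (all names are assumptions). Note that, per the discussion above, this only yields addresses; it does not by itself solve sharing the peer's public key or agreeing on the same bootstrap peers.

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// discoverPeers lists the addresses behind the spegel Service for port 5001,
// roughly what `kubectl get endpoints spegel` shows. Service name, namespace,
// and port are illustrative assumptions.
func discoverPeers(ctx context.Context, cs kubernetes.Interface) ([]string, error) {
	eps, err := cs.CoreV1().Endpoints("spegel").Get(ctx, "spegel", metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	var peers []string
	for _, subset := range eps.Subsets {
		for _, port := range subset.Ports {
			if port.Port != 5001 {
				continue
			}
			for _, addr := range subset.Addresses {
				peers = append(peers, fmt.Sprintf("%s:%d", addr.IP, port.Port))
			}
		}
	}
	return peers, nil
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(cfg)
	peers, err := discoverPeers(context.Background(), cs)
	if err != nil {
		panic(err)
	}
	fmt.Println(peers)
}
```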