Feature: Faster cluster provisioning and autoscaling #1231
Comments
Sounds cool.
Maybe an alternative for networking could be Kilo -> https://kilo.squat.ai/
Are there any updates on this feature?
Hi folks, would you please indicate what's more important for you to speed up at this stage? Is it the cluster provisioning duration or the scale-up/scale-down event? There are currently two options we're looking into:
The answer to the initial question would help us prioritize better.
It's definitely the scale-up/scale-down event (autoscaling) for me, as the initial provisioning is not so critical.
I've measured the autoscaler speed on GCP. Adding a node takes around 8m30s from the first pod in a `Pending` state. The scale-up performance would likely be slower on Hetzner, as I don't expect the Hetzner Ubuntu images to come pre-configured with Hetzner-hosted APT repository mirrors the way GCP images do, so the impact might be more significant there. But we won't get to a scale-up faster than 8m0s. So if this is not sufficient, we need to look for other alternatives.
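For reference, a minimal sketch of how such a measurement could be scripted with client-go. This is an assumption about the methodology, not the script actually used for the numbers above; the kubeconfig path and the simple polling approach are illustrative choices:

```go
package main

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// readyNodes returns the number of nodes currently reporting the Ready condition.
func readyNodes(ctx context.Context, cs *kubernetes.Clientset) (int, error) {
	nodes, err := cs.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
	if err != nil {
		return 0, err
	}
	n := 0
	for _, node := range nodes.Items {
		for _, c := range node.Status.Conditions {
			if c.Type == corev1.NodeReady && c.Status == corev1.ConditionTrue {
				n++
			}
		}
	}
	return n, nil
}

func main() {
	// Assumes a kubeconfig at the default location (~/.kube/config).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	ctx := context.Background()

	before, err := readyNodes(ctx, cs)
	if err != nil {
		panic(err)
	}
	start := time.Now() // start this right after the first pod goes Pending

	// Poll until a new node reports Ready, then print the elapsed time.
	for {
		now, err := readyNodes(ctx, cs)
		if err != nil {
			panic(err)
		}
		if now > before {
			fmt.Printf("scale-up took %s\n", time.Since(start).Round(time.Second))
			return
		}
		time.Sleep(10 * time.Second)
	}
}
```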
Motivation
A significant portion of the cluster provisioning and autoscaler execution time is spent downloading various packages and binaries. There is room to optimize this part.
Description
We could speed this up by utilizing our pre-populated images on providers that allow this. This would result in faster `ansibler` and `kube-eleven` execution, as the binaries would already be present. For the cases when they are not (e.g. on providers where we can't deploy our images, or on static nodes), the usual `ansibler` and `kube-eleven` flows will take care of the download.
Note 1: We should assess this approach against custom pre-baked Flatcar, Fedora CoreOS, or openSUSE MicroOS images.
Note 2: It would help tremendously with this task if we knew whether we could get rid of WireGuard and utilize just Cilium for bridging nodes across various networks.
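As an illustration of the fallback described above, a minimal sketch of the skip-if-present check such a flow could perform. The `ensureBinary` helper and the install callback are hypothetical, not Claudie's actual code:

```go
package main

import (
	"fmt"
	"os/exec"
)

// ensureBinary runs the install step only when the binary is not already on
// the PATH, so nodes booted from pre-populated images skip the download
// entirely while other nodes fall through to the usual flow.
// installFn is a hypothetical placeholder for the real download/install step.
func ensureBinary(name string, installFn func() error) error {
	if path, err := exec.LookPath(name); err == nil {
		fmt.Printf("%s already present at %s, skipping download\n", name, path)
		return nil
	}
	fmt.Printf("%s not found, falling back to the usual download flow\n", name)
	return installFn()
}

func main() {
	// Example: kubelet would already exist on a pre-populated image.
	_ = ensureBinary("kubelet", func() error {
		// ... download and install the binary here ...
		return nil
	})
}
```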
Exit criteria
FYI @MiroslavRepka