[EKS] [aws-efs-csi-driver]: Update DaemonSet rollout strategy to 10% unavailable #2484

Open
iandrewt opened this issue Dec 4, 2024 · 0 comments
Labels
EKS Add-Ons EKS Amazon Elastic Kubernetes Service Proposed Community submitted issue

iandrewt commented Dec 4, 2024

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request
Other DaemonSet-based add-ons, such as aws-ebs-csi-driver, vpc-cni and kube-proxy, currently roll out updates with this updateStrategy:

updateStrategy:
  rollingUpdate:
    maxSurge: 0
    maxUnavailable: 10%
  type: RollingUpdate

Whereas aws-efs-csi-driver has this policy:

updateStrategy:
  rollingUpdate:
    maxSurge: 0
    maxUnavailable: 1
  type: RollingUpdate

With maxUnavailable: 10%, the rollout would finish much faster: in a cluster with more than 100 nodes, at least 10 pods can be updated at a time instead of 1.
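
To make the difference concrete, here is a rough sketch of the arithmetic. It assumes the documented DaemonSet behaviour of rounding a percentage maxUnavailable up to a whole pod count, and it treats the rollout as strictly sequential batches, which is a simplification (the real controller replaces pods as slots free up). The function names and the 120-node figure are made up for illustration.

```python
def max_unavailable_pods(desired: int, max_unavailable) -> int:
    """Resolve maxUnavailable (an int or a string like "10%") to a pod count."""
    if isinstance(max_unavailable, str) and max_unavailable.endswith("%"):
        percent = int(max_unavailable[:-1])
        # Integer ceiling division: percentages are assumed to round up.
        resolved = -(-desired * percent // 100)
    else:
        resolved = int(max_unavailable)
    # With maxSurge: 0, at least one pod must be allowed to go unavailable.
    return max(resolved, 1)


def rollout_batches(desired: int, max_unavailable) -> int:
    """Lower bound on the number of sequential batches needed to update every pod."""
    per_batch = max_unavailable_pods(desired, max_unavailable)
    return -(-desired // per_batch)


if __name__ == "__main__":
    nodes = 120  # hypothetical cluster size
    for setting in (1, "10%"):
        print(setting, max_unavailable_pods(nodes, setting), rollout_batches(nodes, setting))
```

Under these assumptions, a 120-node cluster needs 120 batches with maxUnavailable: 1 but only 10 batches with maxUnavailable: 10%.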

Can we get the EFS driver policy to match other add-ons, or have it be configurable in the add-on configuration?

Which service(s) is this request for?
EKS aws-efs-csi-driver managed add-on

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
In a large cluster (>100 nodes), updating one pod at a time can make the add-on update time out while it waits for the desired replicas.

Are you currently working around this issue?
Re-running the add-on update succeeds as the image is cached on the nodes by this point, so the rollout is faster.

We could take advantage of Managed Fields to take ownership of this one piece of the DaemonSet, but this would be a more complex solution that could break if EKS adds management of this field in future.
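
For reference, a minimal and untested sketch of what that partial object might look like. It only builds the document that a server-side apply would send; the DaemonSet name efs-csi-node and the kube-system namespace are assumptions about how the add-on is installed, and update_strategy_patch is a hypothetical helper.

```python
import json


def update_strategy_patch(name: str, namespace: str, max_unavailable="10%") -> dict:
    """Build a partial DaemonSet that declares only the updateStrategy field."""
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "updateStrategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxSurge": 0, "maxUnavailable": max_unavailable},
            }
        },
    }


if __name__ == "__main__":
    # Assumed name and namespace; check them against the actual cluster.
    print(json.dumps(update_strategy_patch("efs-csi-node", "kube-system"), indent=2))
```

The printed JSON could then be piped to something like `kubectl apply --server-side --force-conflicts --field-manager=<our-manager> -f -`, where the field manager name is ours to choose. Forcing conflicts is what takes ownership of the field away from the EKS add-on manager, which is exactly the part that could break later.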

We could also do something similar to Cilium's pre-flight check, where we make each node download the image before we update the add-on, but again, this is overkill.

@iandrewt iandrewt added the Proposed Community submitted issue label Dec 4, 2024
@georgejohnis georgejohnis added EKS Add-Ons EKS Amazon Elastic Kubernetes Service labels Dec 9, 2024