Change default EKS storageclass to use EFS, not GP2 #918
Isn't there some issue with this approach? I have some notes from Andrew/Vlad saying you can work around it with a custom configuration. But that feels more like a hack than a solution, so I don't think it should go into the docs as the recommended configuration.
IIUC, this workaround is currently required when using EFS, but GP2 will not scale. Need details from @colstrom as to why this is so, but it has to do with the number of mounts allowed per node(?) when using GP2.
So there are two different issues at play here, and I think that's where the confusion comes from.

The first issue is with EFS-backed volumes. There are (at least) two hints we can use to detect this scenario. EBS Volumes have a maximum size of 16 TiB, and EFS presents as petabytes, so if the volume size is above some threshold, that suggests we may be in this scenario. The other hint is that EFS mounts as type `nfs4` (see the sketch below).

The other issue is that EC2 Instances have a hard upper bound on the number of EBS Volumes that can be attached to a given instance. This varies by instance type, depending on the "instance storage" included with the instance, but on most of the current generation, the cap is ~25 volumes (ish). Things get a bit dodgy when you try to attach another volume beyond that: the EBS Volume provisions successfully, and then sits in an attaching state indefinitely.
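For what it's worth, both hints can be checked from a shell. This is a rough sketch, not from the original thread; `/data` is a placeholder mount path.

```sh
# Hint 1: reported filesystem size. EBS volumes cap out at 16 TiB, while an
# EFS mount reports a vastly larger size. Run inside a pod that mounts the
# volume in question.
df -h /data

# Hint 2: mount type. EFS appears as an nfs4 mount; EBS volumes show up as
# locally attached block devices (ext4/xfs) instead.
mount -t nfs4
```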
As far as detecting goes, the limit for a given node can be found with:

```sh
kubectl get node NODE_NAME -ojsonpath='{.status.allocatable.attachable-volumes-aws-ebs}'
```

And the current number of attached volumes can be found with:

```sh
kubectl get node NODE_NAME -ojson \
  | jq -r '.status.volumesAttached[].name' \
  | awk -F : 'BEGIN { volumes = 0 } $1 == "kubernetes.io/aws-ebs/aws" { volumes++ } END { print volumes }'
```

Using this information (likely obtained another way), we may be able to detect when this scenario is present, or likely to occur, and either advise the user accordingly, or bias the scheduling somehow to minimize the issue.

If there's anything I've missed here, please let me know, and I'd be happy to fill in any gaps.
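Combining the two commands above, a small (untested) sketch could flag nodes that are at or near their attachment limit:

```sh
#!/bin/sh
# Report how close a node is to its EBS attachment limit.
# Usage: ./check-ebs-limit.sh NODE_NAME
NODE_NAME="$1"

# Maximum attachable EBS volumes for this node.
limit=$(kubectl get node "$NODE_NAME" \
  -ojsonpath='{.status.allocatable.attachable-volumes-aws-ebs}')

# Count currently attached EBS volumes (names look like
# "kubernetes.io/aws-ebs/aws://ZONE/vol-XXXX", so split on ":").
attached=$(kubectl get node "$NODE_NAME" -ojson \
  | jq -r '.status.volumesAttached[]?.name' \
  | awk -F : '$1 == "kubernetes.io/aws-ebs/aws" { n++ } END { print n+0 }')

echo "node=${NODE_NAME} attached=${attached} limit=${limit}"
if [ "$attached" -ge "$limit" ]; then
  echo "WARNING: EBS attachment limit reached; new volumes will stall" >&2
fi
```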
Based on some recent observations of EKS deployments with CAP 2, we need to encourage admins to deploy with EFS as the backing storage. We've mentioned GP2 in the documentation before, but there is a limit to how many GP2-backed PVCs can be attached to a given node. GP2 also cannot span multiple availability zones the way EFS does, though we should be mindful that EFS has its own limits on write operations per second.
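As a sketch of what the recommended configuration might look like, assuming the aws-efs-csi-driver is installed (the StorageClass name is illustrative and `fs-12345678` is a placeholder filesystem ID; this thread may have had a different provisioner in mind):

```sh
# 1. Stop gp2 from being the default StorageClass.
kubectl patch storageclass gp2 -p \
  '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'

# 2. Create an EFS-backed StorageClass and mark it as the default.
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-12345678
  directoryPerms: "700"
EOF
```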
@troytop can provide more details.