
Design architecture #16

Open
IamEddie opened this issue Nov 15, 2024 · 6 comments

@IamEddie
Contributor

Design a couple of architectures for model deployments

@IamEddie IamEddie self-assigned this Nov 15, 2024
@IamEddie IamEddie moved this to In Progress in AI2050 Nov 15, 2024
@IamEddie
Contributor Author

[Image: EKS-cluster]

@IamEddie
Contributor Author

[Image: eos-EKS-cluster]

@GemmaTuron
Member

Hi @IamEddie

Thanks for this. As we discussed today, please prepare a head-to-head comparison of the deployment strategies: Terraform with Amazon EKS versus containers on EC2.

Provide the comparison in terms of:

  • Cloud maintenance: is one system easier to maintain than the other?
  • Cost: what is the cost of keeping 20 models that are used weekly? Please use the AWS cost calculator to that end so we can have numbers as close to reality as possible
  • Long-term cost: is one option better for keeping models dormant so that they only activate upon request?
  • Latency: are both systems equally fast?

@IamEddie
Contributor Author

Hi @GemmaTuron/Team

  • Yes, from a cloud-maintenance perspective I would say EKS is more organised than running EC2 instances directly

  • Cost: take it as one model = one EC2 instance, so we will need 20 EC2 instances for 20 models. Please see the AWS cost calculator estimate screenshots and the attached CSV files.

    [Image: 1EKS4EC2 — estimate for an EKS cluster and 4 EC2 instances]

    [Image: 20EC2Instances — estimate for 20 models deployed to 20 EC2 instances]

  • Long-term cost: yes, with EKS it is easier to keep models dormant/inactive and activate them only upon request, and it takes a couple of commands, whereas with EC2 we would have to manage one instance per model

  • Latency: microservices are lightweight in themselves, but this also depends on the actual application

    Attachments: 1EKS3EC2Estimate.csv, 20EC2Instances.csv
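As a sanity check on the attached estimates, the two scenarios can be compared with back-of-the-envelope arithmetic. The hourly rates below are hypothetical placeholders; the real numbers should come from the AWS cost calculator exports attached above.

```python
# Rough monthly cost comparison of the two architectures.
# All rates are HYPOTHETICAL placeholders, not quoted AWS prices.

HOURS_PER_MONTH = 730  # AWS's usual monthly-hours convention

def monthly_cost(instance_count, hourly_rate, fixed_hourly=0.0):
    """Monthly cost for N always-on instances plus a fixed hourly
    charge (e.g. the EKS control-plane fee)."""
    return (instance_count * hourly_rate + fixed_hourly) * HOURS_PER_MONTH

# Scenario A: 20 models, one EC2 instance each (placeholder rate).
ec2_only = monthly_cost(20, hourly_rate=0.0416)

# Scenario B: EKS control plane (placeholder fixed fee) + 4 worker
# nodes packing roughly 5 models per node.
eks = monthly_cost(4, hourly_rate=0.0416, fixed_hourly=0.10)

print(f"20 x EC2:        ${ec2_only:,.2f}/month")
print(f"EKS + 4 workers: ${eks:,.2f}/month")
```

The point of the sketch is the shape of the comparison: the EC2-only cost scales linearly with the number of models, while the EKS cost scales with the number of nodes plus a fixed control-plane charge.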

@GemmaTuron
Member

Hi @IamEddie

Can you clarify why with EKS we need only 4 EC2 instances? How are you calculating how many models will run on each instance? And how would you implement the dormant strategy? Can you expand on that for both architectures?
I think it is best if you prepare a report with detailed answers to all of those questions. Please add as much detail and comparison as possible so that this can be discussed at the next team meeting. Thanks!

@IamEddie
Contributor Author

EC2 Deployment and more on why EKS

EC2 Deployment
[Image: EC2-deployment]

Considering the choices available, it is a selection between EC2, ECS, or EKS. I have decided to go with EKS as the deployment method because it:

  • Is a much better option in terms of management (AWS sets up and manages the control plane, and we focus on managing the worker nodes)
  • Makes deployment and scaling easier
  • Gives us observability of our infrastructure
  • Lets us leverage open-source community support
  • Offers Kubernetes extensibility
  • Avoids vendor lock-in

How many models can run in each container, and when can models be switched on/off?

  • One container = one model
  • One node can host N models, e.g. one node containing 5 models
  • Models can be switched on/off as and when they are needed or not needed
  • If we choose EC2 instances for 20 models, the logic is that we will require 20 instances, with each EC2 instance hosting one model
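The one-container-per-model layout above could be expressed as one Kubernetes Deployment per model. This is a minimal sketch: the deployment name, image, and resource numbers are illustrative placeholders, not our actual configuration.

```yaml
# Illustrative only: name, image, and resource numbers are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-a              # one Deployment per model
spec:
  replicas: 1                # scale to 0 to make the model dormant
  selector:
    matchLabels:
      app: model-a
  template:
    metadata:
      labels:
        app: model-a
    spec:
      containers:
        - name: model-a      # one container = one model
          image: example.com/models/model-a:latest   # placeholder image
          resources:
            requests:
              cpu: "500m"    # sized so the scheduler can pack ~5 models per node
              memory: "1Gi"
```

With resource requests set like this, "one node hosting 5 models" is simply the scheduler packing five such pods onto a node with enough CPU and memory.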

Can you clarify why with EKS we need only 4 EC2 instances?

  • I suppose with EKS we can start with one instance as the minimum and set 4 as the maximum node count.
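One way to express that min-1/max-4 node range is in an eksctl cluster config, which the cluster autoscaler then respects. A sketch, with cluster name, region, and instance type as placeholder assumptions:

```yaml
# Illustrative eksctl config: names, region, and instance type are placeholders.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: eos-eks-cluster
  region: eu-west-1           # placeholder region
managedNodeGroups:
  - name: model-workers
    instanceType: t3.medium   # placeholder instance type
    minSize: 1                # start with a single node...
    maxSize: 4                # ...and allow growth to 4 as models are added
    desiredCapacity: 1
```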

How are you calculating how many models will run in each instance?

  • This number is generic and can increase as resources are required for additional models. We provided it as a starting point and will review it as we make deployments.

And how would you do the dormant strategy? Can you expand on that for both architectures?

  • With EKS it is easier to provision and manage deployments. If we are not using the models, we scale them down to a minimum and eventually shut them down if they are not in use at all. Vice versa, when we need them we can launch and scale them back up. With EC2 the same effect would require stopping and starting whole instances per model.
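The scale-down-when-idle idea above can be sketched as pure decision logic. The thresholds, replica counts, and model timestamps are made-up assumptions; in practice the resulting counts would be applied with `kubectl scale` or the Kubernetes API, which is left out here.

```python
# Sketch of the dormant strategy: pick a replica count per model based on
# how recently it was requested. Thresholds below are ASSUMED, not agreed.
from datetime import datetime, timedelta

IDLE_AFTER = timedelta(days=7)       # scale to the minimum after a quiet week
SHUTDOWN_AFTER = timedelta(days=30)  # shut down entirely after a quiet month

def desired_replicas(last_request: datetime, now: datetime,
                     active: int = 2, minimum: int = 1) -> int:
    """Replica count for one model deployment based on recent usage."""
    idle_for = now - last_request
    if idle_for >= SHUTDOWN_AFTER:
        return 0          # fully dormant: no pods, no node capacity used
    if idle_for >= IDLE_AFTER:
        return minimum    # keep one warm pod for quick reactivation
    return active         # recently used: keep it fully scaled

now = datetime(2024, 11, 15)
print(desired_replicas(datetime(2024, 11, 14), now))  # recently used -> 2
print(desired_replicas(datetime(2024, 11, 1), now))   # idle a week+ -> 1
print(desired_replicas(datetime(2024, 9, 1), now))    # idle a month+ -> 0
```

Under EC2, the equivalent would be stopping and starting whole instances, which is coarser and slower than adjusting a replica count.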
