terraform: add AWS/EKS deployment for ChatQnA
Signed-off-by: Sakari Poussa <[email protected]>
Showing 7 changed files with 367 additions and 0 deletions.
@@ -0,0 +1,87 @@
# OPEA applications AWS EKS deployment guide

This guide shows how to deploy OPEA applications on Amazon Web Services (AWS) Elastic Kubernetes Service (EKS) using Terraform.

## Prerequisites

- Access to AWS EKS
- [Terraform](https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli), [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) and [Helm](https://helm.sh/docs/helm/helm_install/) installed on your local machine.

## Setup

The setup uses Terraform to create an EKS cluster with the following properties:

- 1-node EKS cluster with a 50 GB disk and an `m7i.8xlarge` SPOT instance (32 vCPU and 128 GB memory)
- Cluster autoscaling up to 10 nodes
- Storage Class (SC) `efs-sc` and Persistent Volume Claim (PVC) `model-volume` for storing the model data
- `LoadBalancer` address type for the service for external consumption
- An updated kubeconfig file for `kubectl` access

Initialize the Terraform environment.

```bash
terraform init
```

Add the OPEA Helm chart repository.

```bash
helm repo add opea https://opea-project.github.io/GenAIInfra
helm repo update
```

## Create EKS cluster

By default, a 1-node EKS cluster is created, which is suitable for running an OPEA application. See `variables.tf` and `opea-<application-name>.tfvars` if you want to tune the EKS cluster properties, e.g., the number of nodes, instance types or disk size.
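As an illustration, a hypothetical tfvars file overriding some of these properties could look like the following (the variable names come from `variables.tf` in this commit; the values themselves are examples only):

```hcl
# Hypothetical overrides; see variables.tf for the defaults.
cluster_name   = "opea-chatqna"
instance_types = ["m7i.8xlarge"]
capacity_type  = "ON_DEMAND" # or "SPOT" for lower cost
disk_size      = 100         # node disk size in GiB
min_size       = 1
max_size       = 10
desired_size   = 2
```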
## Create Persistent Volume Claim

OPEA needs a volume in which to store the model, so we create a Kubernetes Persistent Volume Claim (PVC). OPEA requires the `ReadWriteMany` access mode, since multiple pods need access to the storage and they can be on different nodes. On EKS, only EFS supports `ReadWriteMany`. Ensure `<namespace>` below matches the namespace you deploy the application to; it is best to create the PVC after you have deployed the OPEA application.

```bash
kubectl apply -f eks-efs-csi-pvc.yaml -n <namespace>
```
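If you would rather keep everything in Terraform, the same PVC defined in `eks-efs-csi-pvc.yaml` could be sketched as a resource for the hashicorp/kubernetes provider already used in this setup (a hypothetical alternative, not part of this commit):

```hcl
# Hypothetical Terraform equivalent of eks-efs-csi-pvc.yaml.
resource "kubernetes_persistent_volume_claim_v1" "model_volume" {
  metadata {
    name      = "model-volume"
    namespace = "chatqna" # adjust to your application namespace
  }
  spec {
    access_modes       = ["ReadWriteMany"] # pods on different nodes share the volume
    storage_class_name = "efs-sc"          # the EFS StorageClass created by this setup
    resources {
      requests = {
        storage = "50Gi"
      }
    }
  }
}
```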
### ChatQnA

Use the commands below to create the EKS cluster.

```bash
terraform plan --var-file opea-chatqna.tfvars -out opea-chatqna.plan
terraform apply "opea-chatqna.plan"
```

Once the cluster is ready, the kubeconfig file to access the new cluster is updated automatically. By default, the file is `~/.kube/config`.

Now you should have access to the cluster via the `kubectl` command.

Deploy the ChatQnA application with Helm.

```bash
helm install -n chatqna --create-namespace chatqna opea/chatqna --set service.type=LoadBalancer --set global.modelUsePVC=model-volume --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}
```
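If you prefer to drive the Helm release from Terraform as well, the same deployment could be sketched with the `helm_release` resource from the hashicorp/helm provider (an assumption: that provider is not part of this commit, and `var.hf_token` is a hypothetical variable for the Hugging Face token):

```hcl
# Hypothetical: requires the hashicorp/helm provider, not included in versions.tf.
provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}

resource "helm_release" "chatqna" {
  name             = "chatqna"
  repository       = "https://opea-project.github.io/GenAIInfra"
  chart            = "chatqna"
  namespace        = "chatqna"
  create_namespace = true

  set {
    name  = "service.type"
    value = "LoadBalancer"
  }
  set {
    name  = "global.modelUsePVC"
    value = "model-volume"
  }
  set {
    name  = "global.HUGGINGFACEHUB_API_TOKEN"
    value = var.hf_token # hypothetical variable holding the token
  }
}
```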
After a while, the OPEA application should be running. You can check the status via `kubectl`.

```bash
kubectl get pod -n chatqna
```

You can now start using the OPEA application.

```bash
OPEA_SERVICE=$(kubectl get svc -n chatqna chatqna -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
curl http://${OPEA_SERVICE}:8888/v1/chatqna \
  -H "Content-Type: application/json" \
  -d '{"messages": "What is the revenue of Nike in 2023?"}'
```

## Cleanup

Delete the cluster via the following commands.

```bash
helm uninstall -n chatqna chatqna
terraform destroy -var-file opea-chatqna.tfvars
```
@@ -0,0 +1,11 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-volume
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 50Gi
@@ -0,0 +1,167 @@
provider "aws" {
  region = var.region
}

provider "kubernetes" {
  config_path = "~/.kube/config"
}

data "aws_availability_zones" "available" {
  filter {
    name   = "opt-in-status"
    values = ["opt-in-not-required"]
  }
}

data "aws_caller_identity" "current" {}

locals {
  vpc_cidr = "10.0.0.0/16"
  azs      = slice(data.aws_availability_zones.available.names, 0, 3)
}

module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "${var.cluster_name}-vpc"
  cidr = local.vpc_cidr
  azs  = local.azs

  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]

  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true

  public_subnet_tags = {
    "kubernetes.io/role/elb" = 1
  }

  private_subnet_tags = {
    "kubernetes.io/role/internal-elb" = 1
  }
}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "20.8.5"

  cluster_name    = var.cluster_name
  cluster_version = var.cluster_version

  cluster_endpoint_public_access           = true
  enable_cluster_creator_admin_permissions = true

  cluster_addons = {
    coredns            = { most_recent = true }
    kube-proxy         = { most_recent = true }
    vpc-cni            = { most_recent = true }
    aws-efs-csi-driver = { most_recent = true }
  }

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  eks_managed_node_group_defaults = {
    instance_types = var.instance_types
    capacity_type  = var.capacity_type
    min_size       = var.min_size
    max_size       = var.max_size
    desired_size   = var.desired_size
    block_device_mappings = {
      xvda = {
        device_name = "/dev/xvda"
        ebs = {
          volume_size           = var.disk_size
          volume_type           = "gp3"
          iops                  = 3000
          throughput            = 125
          encrypted             = true
          delete_on_termination = true
        }
      }
    }
  }

  eks_managed_node_groups = {
    default = {
      iam_role_additional_policies = {
        AmazonEFSCSIDriverPolicy = "arn:aws:iam::aws:policy/service-role/AmazonEFSCSIDriverPolicy"
      }
    }
  }
}

module "efs" {
  source = "terraform-aws-modules/efs/aws"

  # File system
  name           = var.cluster_name
  creation_token = var.cluster_name
  encrypted      = true
  kms_key_arn    = module.kms.key_arn

  # File system policy
  policy_statements = [
    {
      sid     = "Example"
      actions = ["elasticfilesystem:ClientMount"]
      principals = [
        {
          type        = "AWS"
          identifiers = [data.aws_caller_identity.current.arn]
        }
      ]
    }
  ]

  # Mount targets / security group
  mount_targets              = { for k, v in zipmap(local.azs, module.vpc.private_subnets) : k => { subnet_id = v } }
  security_group_description = "EFS security group"
  security_group_vpc_id      = module.vpc.vpc_id
  security_group_rules = {
    vpc = {
      description = "NFS ingress from VPC private subnets"
      cidr_blocks = module.vpc.private_subnets_cidr_blocks
    }
  }

  tags = {
    Terraform   = "true"
    Environment = "dev"
  }
}

module "kms" {
  source = "terraform-aws-modules/kms/aws"

  aliases               = ["efs/${var.cluster_name}"]
  description           = "EFS customer managed key"
  enable_default_policy = true

  # For example use only
  deletion_window_in_days = 7
}

resource "null_resource" "kubectl" {
  provisioner "local-exec" {
    command = "aws eks --region ${var.region} update-kubeconfig --name ${var.cluster_name}"
  }
  depends_on = [module.eks]
}

resource "kubernetes_storage_class_v1" "eks_efs" {
  metadata {
    name = "efs-sc"
  }
  storage_provisioner = "efs.csi.aws.com"
  reclaim_policy      = "Retain"
  parameters = {
    provisioningMode = "efs-ap"
    fileSystemId     = module.efs.id
    directoryPerms   = "700"
  }
  depends_on = [
    null_resource.kubectl
  ]
}
@@ -0,0 +1,4 @@
cluster_name   = "opea-chatqna"
instance_types = ["m7i.8xlarge"]
capacity_type  = "SPOT" # cheaper than ON_DEMAND
disk_size      = 50     # node disk size in GiB
@@ -0,0 +1,27 @@
# Copyright (c) HashiCorp, Inc.
# SPDX-License-Identifier: MPL-2.0

output "cluster_endpoint" {
  description = "Endpoint for EKS control plane"
  value       = module.eks.cluster_endpoint
}

output "cluster_oidc_issuer_url" {
  description = "The URL on the EKS cluster for the OpenID Connect identity provider"
  value       = module.eks.cluster_oidc_issuer_url
}

output "cluster_security_group_id" {
  description = "Security group ids attached to the cluster control plane"
  value       = module.eks.cluster_security_group_id
}

output "region" {
  description = "AWS region"
  value       = var.region
}

output "cluster_name" {
  description = "Kubernetes Cluster Name"
  value       = module.eks.cluster_name
}
@@ -0,0 +1,12 @@
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.49.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.33.0"
    }
  }
}
@@ -0,0 +1,59 @@
variable "region" {
  description = "AWS region"
  type        = string
  default     = "eu-west-1"
}

variable "cluster_name" {
  description = "EKS cluster name"
  type        = string
  default     = null
}

variable "cluster_version" {
  description = "EKS cluster version"
  type        = string
  default     = "1.31"
}

variable "instance_types" {
  description = "EC2 instance types"
  type        = list(string)
  default     = ["t3.medium"]
}

variable "use_custom_launch_template" {
  description = "Use a custom launch template for the managed node groups."
  type        = bool
  default     = true
}

variable "disk_size" {
  description = "Disk size in GiB for nodes."
  type        = number
  default     = 20
}

variable "capacity_type" {
  description = "EC2 capacity type: SPOT or ON_DEMAND"
  type        = string
  default     = "ON_DEMAND"
}

variable "min_size" {
  description = "Minimum number of nodes in the node group"
  type        = number
  default     = 1
}

variable "max_size" {
  description = "Maximum number of nodes in the node group"
  type        = number
  default     = 10
}

variable "desired_size" {
  description = "Desired number of nodes in the node group"
  type        = number
  default     = 1
}