Update docsum according to docker compose changes #590

Merged
merged 1 commit into from
Nov 20, 2024
2 changes: 1 addition & 1 deletion .github/workflows/scripts/e2e/gmc_gaudi_test.sh
@@ -637,7 +637,7 @@ function validate_docsum() {
export CLIENT_POD=$(kubectl get pod -n $DOCSUM_NAMESPACE -l app=client-test -o jsonpath={.items..metadata.name})
echo "$CLIENT_POD"
accessUrl=$(kubectl get gmc -n $DOCSUM_NAMESPACE -o jsonpath="{.items[?(@.metadata.name=='docsum')].status.accessUrl}")
kubectl exec "$CLIENT_POD" -n $DOCSUM_NAMESPACE -- curl $accessUrl -X POST -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' -H 'Content-Type: application/json' > $LOG_PATH/gmc_docsum.log
kubectl exec "$CLIENT_POD" -n $DOCSUM_NAMESPACE -- curl $accessUrl -X POST -d '{"type": "text", "messages":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' -H 'Content-Type: application/json' > $LOG_PATH/gmc_docsum.log
exit_code=$?
if [ $exit_code -ne 0 ]; then
echo "docsum failed, please check the logs in ${LOG_PATH}!"
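The e2e change above replaces the old single-field `{"query": ...}` body with a typed payload. A minimal sketch of the migration (field names taken from the diff; the text value is illustrative):

```python
import json

# Old request body used a single "query" field.
old_body = {"query": "Text Embeddings Inference (TEI) is a toolkit for serving text embeddings."}

# The new body adds an input "type" and renames "query" to "messages",
# matching the docker compose / GMC changes in this PR.
new_body = {"type": "text", "messages": old_body["query"]}

payload = json.dumps(new_body)
print(payload)
```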
2 changes: 1 addition & 1 deletion .github/workflows/scripts/e2e/gmc_xeon_test.sh
@@ -658,7 +658,7 @@ function validate_docsum() {
export CLIENT_POD=$(kubectl get pod -n $DOCSUM_NAMESPACE -l app=client-test -o jsonpath={.items..metadata.name})
echo "$CLIENT_POD"
accessUrl=$(kubectl get gmc -n $DOCSUM_NAMESPACE -o jsonpath="{.items[?(@.metadata.name=='docsum')].status.accessUrl}")
kubectl exec "$CLIENT_POD" -n $DOCSUM_NAMESPACE -- curl $accessUrl -X POST -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' -H 'Content-Type: application/json' > $LOG_PATH/gmc_docsum.log
kubectl exec "$CLIENT_POD" -n $DOCSUM_NAMESPACE -- curl $accessUrl -X POST -d '{"type": "text", "messages":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' -H 'Content-Type: application/json' > $LOG_PATH/gmc_docsum.log
exit_code=$?
if [ $exit_code -ne 0 ]; then
echo "docsum failed, please check the logs in ${LOG_PATH}!"
2 changes: 1 addition & 1 deletion .github/workflows/scripts/e2e/manifest_gaudi_test.sh
@@ -86,7 +86,7 @@ function validate_docsum() {
# Curl the DocSum LLM Service
curl http://${ip_address}:${port}/v1/chat/docsum \
-X POST \
-d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \
-d '{"type": "text", "messages":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \
-H 'Content-Type: application/json' > $LOG_PATH/curl_docsum.log
exit_code=$?
if [ $exit_code -ne 0 ]; then
2 changes: 1 addition & 1 deletion .github/workflows/scripts/e2e/manifest_xeon_test.sh
@@ -86,7 +86,7 @@ function validate_docsum() {
# Curl the DocSum LLM Service
curl http://${ip_address}:${port}/v1/chat/docsum \
-X POST \
-d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \
-d '{"type": "text", "messages":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \
-H 'Content-Type: application/json' > $LOG_PATH/curl_docsum.log
exit_code=$?
if [ $exit_code -ne 0 ]; then
1 change: 1 addition & 0 deletions helm-charts/common/ui/templates/configmap.yaml
@@ -16,6 +16,7 @@ data:
BASE_URL: {{ .Values.BACKEND_SERVICE_ENDPOINT | quote }}
{{- else if (contains "docsum-ui" .Values.image.repository) }}
DOC_BASE_URL: {{ .Values.BACKEND_SERVICE_ENDPOINT | quote }}
BACKEND_SERVICE_ENDPOINT: {{ .Values.BACKEND_SERVICE_ENDPOINT | quote }}
{{- else if (contains "docsum-react-ui" .Values.image.repository) }}
VITE_DOC_SUM_URL: {{ .Values.BACKEND_SERVICE_ENDPOINT | quote }}
{{- else if contains "chatqna-ui" .Values.image.repository }}
3 changes: 3 additions & 0 deletions helm-charts/docsum/Chart.yaml
@@ -12,6 +12,9 @@ dependencies:
- name: llm-uservice
version: 1.0.0
repository: "file://../common/llm-uservice"
- name: whisper
version: 1.0.0
repository: "file://../common/whisper"
- name: ui
version: 1.0.0
repository: "file://../common/ui"
6 changes: 4 additions & 2 deletions helm-charts/docsum/README.md
@@ -34,8 +34,10 @@ Open another terminal and run the following command to verify the service is working properly

```console
curl http://localhost:8888/v1/docsum \
-H 'Content-Type: application/json' \
-d '{"messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}'
-H 'Content-Type: multipart/form-data' \
-F "type=text" \
-F "messages=Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." \
-F "max_tokens=32"
```
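For clients not using curl, the same multipart request can be sketched in Python. The endpoint, form fields, and `localhost:8888` host mirror the curl example above; the `requests` library is an assumed dependency:

```python
import requests

def build_docsum_request(base_url: str, text: str, max_tokens: int = 32):
    """Prepare a multipart/form-data POST matching the updated DocSum API."""
    # (None, value) tuples send plain form fields rather than file uploads.
    req = requests.Request(
        "POST",
        f"{base_url}/v1/docsum",
        files={
            "type": (None, "text"),
            "messages": (None, text),
            "max_tokens": (None, str(max_tokens)),
        },
    )
    return req.prepare()

prepared = build_docsum_request(
    "http://localhost:8888",
    "TEI is a toolkit for serving text embeddings.",
)
# Send with requests.Session().send(prepared) once the service is reachable.
```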

### Verify the workload through UI
4 changes: 2 additions & 2 deletions helm-charts/docsum/templates/deployment.yaml
@@ -39,8 +39,8 @@ spec:
{{- else }}
value: {{ .Release.Name }}-llm-uservice
{{- end }}
#- name: MEGA_SERVICE_PORT
# value: {{ .Values.port }}
- name: DATA_SERVICE_HOST_IP
value: {{ .Release.Name }}-m2t
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
94 changes: 94 additions & 0 deletions helm-charts/docsum/templates/m2t.yaml
@@ -0,0 +1,94 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ .Release.Name }}-m2t
labels:
{{- include "docsum.labels" . | nindent 4 }}
app: {{ .Release.Name }}-m2t
spec:
replicas: {{ .Values.replicaCount }}
selector:
matchLabels:
{{- include "docsum.selectorLabels" . | nindent 6 }}
app: {{ .Release.Name }}-m2t
template:
metadata:
{{- with .Values.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "docsum.selectorLabels" . | nindent 8 }}
app: {{ .Release.Name }}-m2t
spec:
{{- with .Values.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
containers:
- name: {{ .Release.Name }}-m2t
env:
- name: V2A_ENDPOINT
value: {{ .Release.Name }}-v2a:{{ .Values.v2a.service.port }}
- name: A2T_ENDPOINT
value: {{ .Release.Name }}-whisper:{{ .Values.whisper.service.port }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
image: "{{ .Values.m2t.image.repository }}:{{ .Values.m2t.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
volumeMounts:
- mountPath: /tmp
name: tmp
ports:
- name: m2t
containerPort: {{ .Values.m2t.port }}
protocol: TCP
resources:
{{- toYaml .Values.resources | nindent 12 }}
volumes:
- name: tmp
emptyDir: {}
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if .Values.evenly_distributed }}
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
{{- include "docsum.selectorLabels" . | nindent 14 }}
app: {{ .Release.Name }}-m2t
{{- end }}
---
apiVersion: v1
kind: Service
metadata:
name: {{ .Release.Name }}-m2t
labels:
{{- include "docsum.labels" . | nindent 4 }}
spec:
type: {{ .Values.m2t.service.type }}
ports:
- port: {{ .Values.m2t.service.port }}
targetPort: {{ .Values.m2t.port }}
protocol: TCP
name: m2t
selector:
{{- include "docsum.selectorLabels" . | nindent 4 }}
app: {{ .Release.Name }}-m2t
1 change: 1 addition & 0 deletions helm-charts/docsum/templates/tests/test-pod.yaml
@@ -21,6 +21,7 @@ spec:
for ((i=1; i<=max_retry; i++)); do
curl http://{{ include "docsum.fullname" . }}:{{ .Values.service.port }}/v1/docsum -sS --fail-with-body \
-H 'Content-Type: multipart/form-data' \
-F "type=text" \
-F "messages=Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." \
-F "max_tokens=32" && break;
curlcode=$?
89 changes: 89 additions & 0 deletions helm-charts/docsum/templates/v2a.yaml
@@ -0,0 +1,89 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ .Release.Name }}-v2a
labels:
{{- include "docsum.labels" . | nindent 4 }}
app: {{ .Release.Name }}-v2a
spec:
replicas: {{ .Values.replicaCount }}
selector:
matchLabels:
{{- include "docsum.selectorLabels" . | nindent 6 }}
app: {{ .Release.Name }}-v2a
template:
metadata:
{{- with .Values.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "docsum.selectorLabels" . | nindent 8 }}
app: {{ .Release.Name }}-v2a
spec:
{{- with .Values.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
containers:
- name: {{ .Release.Name }}-v2a
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
image: "{{ .Values.v2a.image.repository }}:{{ .Values.v2a.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
volumeMounts:
- mountPath: /tmp
name: tmp
ports:
- name: v2a
containerPort: {{ .Values.v2a.port }}
protocol: TCP
resources:
{{- toYaml .Values.resources | nindent 12 }}
volumes:
- name: tmp
emptyDir: {}
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if .Values.evenly_distributed }}
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
{{- include "docsum.selectorLabels" . | nindent 14 }}
app: {{ .Release.Name }}-v2a
{{- end }}
---
apiVersion: v1
kind: Service
metadata:
name: {{ .Release.Name }}-v2a
labels:
{{- include "docsum.labels" . | nindent 4 }}
spec:
type: {{ .Values.v2a.service.type }}
ports:
- port: {{ .Values.v2a.service.port }}
targetPort: {{ .Values.v2a.port }}
protocol: TCP
name: v2a
selector:
{{- include "docsum.selectorLabels" . | nindent 4 }}
app: {{ .Release.Name }}-v2a
18 changes: 18 additions & 0 deletions helm-charts/docsum/values.yaml
@@ -13,6 +13,24 @@ image:
pullPolicy: IfNotPresent
# Overrides the image tag whose default is the chart appVersion.
tag: "latest"
v2a:
image:
repository: opea/dataprep-video2audio
# Overrides the image tag whose default is the chart appVersion.
tag: "latest"
port: 7078
service:
type: ClusterIP
port: 7078
m2t:
image:
repository: opea/dataprep-multimedia2text
# Overrides the image tag whose default is the chart appVersion.
tag: "latest"
port: 7079
service:
type: ClusterIP
port: 7079

port: 8888
service:
@@ -15,6 +15,7 @@ metadata:
app.kubernetes.io/managed-by: Helm
data:
DOC_BASE_URL: "/v1/docsum"
BACKEND_SERVICE_ENDPOINT: "/v1/docsum"
---
# Source: ui/templates/service.yaml
# Copyright (C) 2024 Intel Corporation