Install requirements (Python 3.8)
pip install -r requirements.txt
Load the models into the local BentoML model store
python load-model.py
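load-model.py is part of the repo; as a rough sketch of what a script like this does, assuming the two models pushed to Yatai further below ("emotion" and "blazeface_back") are ONNX files (the file paths are placeholders):

import bentoml
import onnx

# Save the face detector and the emotion classifier into the local
# BentoML model store; the names match the Yatai push output below.
bentoml.onnx.save_model("blazeface_back", onnx.load("models/blazeface_back.onnx"))
bentoml.onnx.save_model("emotion", onnx.load("models/emotion.onnx"))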
Serve the bento on port 3000
BENTOML_CONFIG=./cfg.yml bentoml serve --production
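cfg.yml itself is not listed here; a minimal sketch of it, assuming it carries the same runner options that are pasted into the Yatai deployment further below:

# BentoML configuration: runner timeout and adaptive batching limits
runners:
  timeout: 1
  batching:
    enabled: true
    max_batch_size: 100
    max_latency_ms: 60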
Build the bento
$ bentoml build
Successfully built Bento(tag="emotion_recognition:zd2vyafvicgucaav").
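bentoml build reads a bentofile.yaml from the working directory; a minimal sketch, assuming the service object is defined as svc in service.py (file and variable names are assumptions):

service: "service:svc"
include:
  - "*.py"
python:
  requirements_txt: "./requirements.txt"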
Generate a Docker image
$ bentoml containerize emotion_recognition:zd2vyafvicgucaav
Successfully built Bento container for "emotion_recognition:zd2vyafvicgucaav" with tag(s) "emotion_recognition:zd2vyafvicgucaav"
$ docker run -p 3000:3000 --rm -v $(pwd)/cfg.yml:/home/bentoml/configuration.yml \
-e BENTOML_CONFIG=/home/bentoml/configuration.yml \
emotion_recognition:zd2vyafvicgucaav serve --production
Starting production BentoServer from "emotion_recognition:zd2vyafvicgucaav" running on http://0.0.0.0:3000
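The exact request format depends on the service definition (see tests/request.py for the real payload); assuming the endpoint takes a JSON body, a quick smoke test against the container could look like this (sample_request.json is a placeholder):

$ curl -X POST -H "Content-Type: application/json" -d @sample_request.json http://localhost:3000/predict_async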
Install the Yatai components following their instructions, then push the bento to Yatai
$ bentoml yatai login --api-token {YOUR_TOKEN} --endpoint http://127.0.0.1:8080
$ bentoml push emotion_recognition:zd2vyafvicgucaav
│ Successfully pushed model "emotion:uouhv7vvg2ka6aav"
│ Successfully pushed model "blazeface_back:uprkd2fvg2ka6aav"
│ Successfully pushed bento "emotion_recognition:zd2vyafvicgucaav"
Go to the deployments page at http://127.0.0.1:8080/deployments, click the Create button, and follow the instructions in the UI. Paste the following into the BentoML configuration text field:
runners.timeout=1 runners.batching.enabled=true runners.batching.max_batch_size=100 runners.batching.max_latency_ms=60
Note: The yatai-deployment component script installs a metrics server, which is needed for auto-scaling. If the metrics-server pod fails during installation, try adding the following to spec.template.spec.containers[].args in the metrics-server deployment of components.yaml and apply it manually (or patch the live deployment, as sketched after the snippet).
- --kubelet-insecure-tls
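Alternatively, the same flag can be added to the already-deployed metrics server with a JSON patch (assuming the default kube-system namespace):

$ kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'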
Run a simple test script. Send many requests at once with python request.py {URL} {REQUEST_AMOUNT}, or send them one after another with python request.py {URL} {REQUEST_AMOUNT} --one-by-one.
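As a rough sketch of the kind of load test tests/request.py runs (the real payload format differs; build_payload is a hypothetical stand-in):

import sys
import requests
from concurrent.futures import ThreadPoolExecutor

def build_payload():
    # Hypothetical stand-in; the real script builds the request body
    # expected by the emotion_recognition service.
    return {}

def send(url):
    resp = requests.post(url, json=build_payload())
    print("Response code:", resp.status_code)

if __name__ == "__main__":
    url, amount = sys.argv[1], int(sys.argv[2])
    if "--one-by-one" in sys.argv:
        for _ in range(amount):
            send(url)
    else:
        # Fire requests concurrently to put the adaptive batching queue under load.
        with ThreadPoolExecutor(max_workers=min(amount, 200)) as pool:
            pool.map(send, [url] * amount)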
$ cd tests && python request.py http://emotion-yatai.127.0.0.1.sslip.io/predict_async 2000
...
Response content: b'{"userId":345786,"conferenceId":890678,"clientFaceDetection":true,"emotions":[{"raw":{"neutral":0.05500756949186325,"happy":0.1018536314368248,"sad":0.16155335307121277,"surprise":0.09337092190980911,"fear":0.31956708431243896,"disgust":0.2149820476770401,"anger":0.04546269401907921,"contempt":0.008202659897506237},"dominantEmotion":"fear"}],"date":"2023-02-25T19:39:19.216+00:00","duration":0.063389253616333}'
Response code: 500
Response code: 503
...
- Response code 500 usually means the request was cancelled because a runner was too busy (the timeout option in the config).
- Response code 503 usually means the server dropped the request from the emotion model's adaptive batching queue, because it estimated that inference time plus queue wait time would exceed max_latency_ms from the config.
- Response code 502 usually means a general resource issue; try adding more memory/CPU when using Yatai.
Starting the service without the timeout/max_latency_ms BentoML config options would greatly increase request latency when the system is under load, because none of the requests would ever be cancelled.