Set num_workers to 1, #801 has landed (#370)
marius-baseten authored Oct 29, 2024
1 parent cd73c6a commit b595d5f
Showing 7 changed files with 13 additions and 15 deletions.
2 changes: 1 addition & 1 deletion mistral/mistral-7b-instruct-chat-trt-llm-h100/config.yaml
@@ -35,7 +35,7 @@ resources:
 accelerator: H100
 use_gpu: true
 runtime:
-num_workers: 2
+num_workers: 1
 predict_concurrency: 256
 secrets: {}
 system_packages: []
@@ -34,7 +34,7 @@ resources:
 accelerator: A100
 use_gpu: true
 runtime:
-num_workers: 2
+num_workers: 1
 predict_concurrency: 256
 secrets: {}
 system_packages: []
@@ -35,7 +35,7 @@ resources:
 accelerator: H100
 use_gpu: true
 runtime:
-num_workers: 2
+num_workers: 1
 predict_concurrency: 256
 secrets: {}
 system_packages: []
@@ -34,7 +34,7 @@ resources:
 accelerator: A100
 use_gpu: true
 runtime:
-num_workers: 2
+num_workers: 1
 predict_concurrency: 256
 secrets: {}
 system_packages: []
@@ -35,7 +35,7 @@ resources:
 accelerator: A100
 use_gpu: true
 runtime:
-num_workers: 2
+num_workers: 1
 predict_concurrency: 256
 secrets:
 hf_access_token: "ENTER HF ACCESS TOKEN HERE"
2 changes: 1 addition & 1 deletion mistral/mixtral-8x7b-instruct-trt-llm/config.yaml
@@ -34,7 +34,7 @@ resources:
 accelerator: A100:2
 use_gpu: true
 runtime:
-num_workers: 2
+num_workers: 1
 predict_concurrency: 256
 secrets: {}
 system_packages: []
16 changes: 7 additions & 9 deletions templates/generate.yaml
@@ -25,7 +25,7 @@ mistral/mistral-7b-instruct-chat-trt-llm:
 - transformers==4.34.1
 runtime:
 predict_concurrency: 256
-num_workers: 2
+num_workers: 1
 ignore:
 - README.md
 template:
@@ -59,8 +59,7 @@ mistral/mistral-7b-instruct-chat-trt-llm-h100:
 - jinja2==3.1.3
 runtime:
 predict_concurrency: 256
-# Drop num_workers once basetenlabs/truss#801 lands
-num_workers: 2
+num_workers: 1
 resources:
 accelerator: H100
 ignore:
@@ -96,8 +95,7 @@ mistral/mistral-7b-instruct-chat-trt-llm-weights-only-quant-h100:
 - jinja2==3.1.3
 runtime:
 predict_concurrency: 256
-# Drop num_workers once basetenlabs/truss#801 lands
-num_workers: 2
+num_workers: 1
 resources:
 accelerator: H100
 ignore:
@@ -132,7 +130,7 @@ mistral/mistral-7b-instruct-chat-trt-llm-weights-only-quant:
 - transformers==4.34.1
 runtime:
 predict_concurrency: 256
-num_workers: 2
+num_workers: 1
 ignore:
 - README.md
 template:
@@ -165,7 +163,7 @@ mistral/mistral-7b-instruct-chat-trt-llm-smooth-quant:
 - transformers==4.34.1
 runtime:
 predict_concurrency: 256
-num_workers: 2
+num_workers: 1
 ignore:
 - README.md
 template:
@@ -198,7 +196,7 @@ mistral/mixtral-8x7b-instruct-trt-llm-weights-only-quant:
 - transformers==4.36.0
 runtime:
 predict_concurrency: 256
-num_workers: 2
+num_workers: 1
 ignore:
 - README.md
 template:
@@ -270,7 +268,7 @@ mistral/mixtral-8x7b-instruct-trt-llm:
 - transformers==4.36.0
 runtime:
 predict_concurrency: 256
-num_workers: 2
+num_workers: 1
 resources:
 accelerator: A100:2
 ignore:
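
The same one-line change repeats across every config in this commit, so it could plausibly be applied mechanically. The following is a minimal sketch, not the method the author used: `pin_num_workers` is a hypothetical helper, and the indentation in the sample text is an assumption, since the hunks above do not preserve it. It rewrites only the `num_workers` value and leaves comments and key order untouched, so the two "Drop num_workers once basetenlabs/truss#801 lands" comments would still need to be removed by hand.

```python
import re

# Hypothetical helper, not part of the repository. It matches a whole
# "num_workers: <int>" line and captures everything before the number so the
# original indentation and spacing are kept as-is.
NUM_WORKERS_RE = re.compile(
    r"^([ \t]*num_workers:[ \t]*)\d+[ \t]*$", re.MULTILINE
)


def pin_num_workers(config_text: str, value: int = 1) -> tuple[str, int]:
    """Return the updated text and the number of lines that were rewritten."""
    return NUM_WORKERS_RE.subn(lambda m: f"{m.group(1)}{value}", config_text)


# Sample runtime block; the two-space indentation is assumed for illustration.
sample = (
    "runtime:\n"
    "  num_workers: 2\n"
    "  predict_concurrency: 256\n"
)

updated, changed = pin_num_workers(sample)
print(updated)
print(f"lines rewritten: {changed}")
```

A regex is used here instead of a YAML parser so that comments and formatting in the files survive the rewrite; a parser-based round trip would typically drop or reflow them.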