
Update pyproject.toml + tools Falcon 3 addition #402

Merged: 5 commits merged into main from falcon-3-addition on Jan 23, 2025

Conversation

@michaelfeil (Contributor) commented:

  • The tools are severely outdated.
  • poetry pins truss==0.9.49 via git, and that git dependency still pins revisions.

Adds Falcon:

  • Falcon 3, in favor of Falcon-40B

@philipkiely-baseten (Member) left a comment:

Looks good. Which config would you recommend for the library?

checkpoint_repository:
  repo: tiiuae/Falcon3-10B-Instruct
  source: HF
max_seq_len: 8192
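
For context, this fragment would typically sit under the trt_llm.build section of a truss config.yaml. A minimal sketch, assuming the standard truss engine-builder layout (the exact nesting and the base_model value are assumptions, not taken from this PR):

trt_llm:
  build:
    base_model: falcon   # illustrative value, not confirmed by this PR
    checkpoint_repository:
      repo: tiiuae/Falcon3-10B-Instruct
      source: HF
    max_seq_len: 8192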

@philipkiely-baseten:

Should this match the runtime tokens?

@michaelfeil (Contributor, Author):

enable_chunked_context: true
kv_cache_free_gpu_mem_fraction: 0.62
request_default_max_tokens: 1000
total_token_limit: 500000

@philipkiely-baseten:

Should this be a round number?

@michaelfeil (Contributor, Author):

total_token_limit defines a Briton setting for how many tokens may be queued inside the C++ runtime. If we queue too many requests, we overload the runtime. It is not critical, as long as it is in a reasonable range.

total_token_limit = 500000 is the default.
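
As a rough illustration of the idea (a minimal Python sketch, not Briton's actual C++ implementation; all names below are hypothetical), a queue with a token budget works like this:

from collections import deque

TOTAL_TOKEN_LIMIT = 500_000  # the default mentioned above

class TokenBudgetQueue:
    def __init__(self, limit: int = TOTAL_TOKEN_LIMIT):
        self.limit = limit
        self.queued_tokens = 0
        self.queue = deque()

    def try_enqueue(self, request_id: str, num_tokens: int) -> bool:
        # Admit a request only if its tokens fit within the remaining budget;
        # otherwise the caller sheds load instead of overloading the runtime.
        if self.queued_tokens + num_tokens > self.limit:
            return False
        self.queue.append((request_id, num_tokens))
        self.queued_tokens += num_tokens
        return True

    def dequeue(self) -> str:
        # Release a request's tokens back to the budget when it is scheduled.
        request_id, num_tokens = self.queue.popleft()
        self.queued_tokens -= num_tokens
        return request_id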

kv_cache_free_gpu_mem_fraction: 0.62 is a hard setting.
Falcon-3-10B needs around 20 GB of VRAM for its weights, which leaves 20 GB of VRAM free.
Of that remaining 20 GB, we allocate 62% (12.4 GB) to Falcon-3-10B's KV cache. The other 7.6 GB goes to the 1B model: 2 GB for its weights, and 5.6 GB for activations and KV cache.
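
A back-of-the-envelope check of that split (the 40 GB GPU total is our assumption, inferred from the numbers above; variable names are ours):

# Memory split implied by kv_cache_free_gpu_mem_fraction = 0.62
gpu_vram_gb = 40.0
falcon_10b_weights_gb = 20.0                    # ~2 bytes/param x 10B params
free_gb = gpu_vram_gb - falcon_10b_weights_gb   # 20.0 GB left after weights

kv_cache_free_gpu_mem_fraction = 0.62
falcon_10b_kv_cache_gb = kv_cache_free_gpu_mem_fraction * free_gb  # 12.4 GB
one_b_model_gb = free_gb - falcon_10b_kv_cache_gb                  # 7.6 GB

one_b_weights_gb = 2.0
one_b_activations_and_kv_gb = one_b_model_gb - one_b_weights_gb    # 5.6 GB

print(falcon_10b_kv_cache_gb, one_b_model_gb, one_b_activations_and_kv_gb)
# -> 12.4 7.6 5.6 (approximately, up to floating-point rounding)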

@michaelfeil merged commit 22d8ada into main on Jan 23, 2025
2 checks passed
@michaelfeil deleted the falcon-3-addition branch on January 23, 2025 at 22:07