Add HF model support inc. DS-R1-Distill, Qwen needs yarn support #17421
Conversation
bravo sir
models/demos/llama3/tests/reference_outputs/Qwen2.5-7B-Instruct.refpt
@yieldthought Re-generated all Llama3 cache files in CI for N150 / N300 / T3K. The TG caches will need to be regenerated at a later date through CI. Re-running all pipelines. The old T3K cache was not even building correctly.
Ready to merge when tests pass.
All passing locally. Running the latest CI pipelines here. If they pass we're good to merge.
This reverts commit bd491a2.
Updated the issues in the description. Investigating the remaining ones that consistently fail.
Problem description
The existing codebase loads the Meta checkpoint format, but many derivative models are only available on HuggingFace.
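The format gap is largely a naming one: HuggingFace Llama checkpoints use keys like `model.layers.0.self_attn.q_proj.weight`, while the Meta format expects `layers.0.attention.wq.weight`. A minimal sketch of the kind of key remapping a HF loader needs (this is an illustration of the standard HF/Meta Llama layouts, not the loader actually added in this PR):

```python
import re

# Top-level keys: HuggingFace name -> Meta name
HF_TO_META = {
    "model.embed_tokens.weight": "tok_embeddings.weight",
    "model.norm.weight": "norm.weight",
    "lm_head.weight": "output.weight",
}

# Per-layer submodule names: HuggingFace -> Meta
LAYER_MAP = {
    "self_attn.q_proj": "attention.wq",
    "self_attn.k_proj": "attention.wk",
    "self_attn.v_proj": "attention.wv",
    "self_attn.o_proj": "attention.wo",
    "mlp.gate_proj": "feed_forward.w1",
    "mlp.down_proj": "feed_forward.w2",
    "mlp.up_proj": "feed_forward.w3",
    "input_layernorm": "attention_norm",
    "post_attention_layernorm": "ffn_norm",
}

def remap_key(hf_key: str) -> str:
    """Translate one HuggingFace Llama state-dict key to Meta naming."""
    if hf_key in HF_TO_META:
        return HF_TO_META[hf_key]
    m = re.match(r"model\.layers\.(\d+)\.(.+)\.weight$", hf_key)
    if m:
        idx, sub = m.groups()
        return f"layers.{idx}.{LAYER_MAP[sub]}.weight"
    return hf_key  # pass through anything unrecognized
```

Note that a real loader also has to handle layout differences beyond names, e.g. HF's permuted Q/K projection weights for rotary embeddings.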
What's changed
Add support for loading HuggingFace model formats, paving the way for full Qwen support (pending a YaRN RoPE implementation) and adding DeepSeek-R1-Distill-Llama-70B support.
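For context on the pending YaRN work: YaRN extends RoPE to longer contexts by interpolating the low-frequency rotary dimensions (dividing their frequencies by the context-scaling factor) while leaving high-frequency dimensions untouched, with a linear ramp in between. A hedged sketch of that frequency blending, assuming illustrative hyperparameters (base 10000, original context 4096, scale 4, ramp bounds beta_slow=1 and beta_fast=32) rather than Qwen's actual config:

```python
import math

def yarn_inv_freq(dim, base=10000.0, scale=4.0, orig_ctx=4096,
                  beta_fast=32.0, beta_slow=1.0):
    """Blend original and interpolated RoPE inverse frequencies, YaRN-style.

    Returns one inverse frequency per rotary dimension pair.
    """
    inv_freq = [base ** (-2 * i / dim) for i in range(dim // 2)]
    out = []
    for f in inv_freq:
        # How many full rotations this dimension makes over the original context.
        rotations = orig_ctx * f / (2 * math.pi)
        # ramp = 1: keep original frequency (extrapolate);
        # ramp = 0: divide by scale (interpolate).
        ramp = (rotations - beta_slow) / (beta_fast - beta_slow)
        ramp = min(max(ramp, 0.0), 1.0)
        out.append(f * ramp + (f / scale) * (1.0 - ramp))
    return out
```

High-frequency dimensions (many rotations per context window) keep their original frequencies, so local positional resolution is preserved; low-frequency dimensions are stretched to cover the longer context.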
Checklist
All passing locally.