[Inference] Fix auth token and add models starcoder and llama2 #39

Merged: 33 commits into intel:main on Feb 7, 2024

Conversation

@Deegue (Contributor) commented Jan 8, 2024

No description provided.

carsonwang pushed a commit to carsonwang/llm-on-ray that referenced this pull request Jan 9, 2024
* add num_to_keep for pretrainers

Signed-off-by: Zhi Lin <[email protected]>

* add num_to_keep to config

Signed-off-by: Zhi Lin <[email protected]>

---------

Signed-off-by: Zhi Lin <[email protected]>
@KepingYan (Contributor)

Why is env HF_ACCESS_TOKEN removed? Is it no longer needed?

@KepingYan (Contributor)

Please also help add **config.dict() at PeftModel.from_pretrained(model, model_desc.peft_model_id_or_path) and DeltaTunerModel.from_pretrained(model, model_desc.peft_model_id_or_path) in both transformer_predictor.py and deepspeed_predictor.py.
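
A minimal sketch of the change being requested, assuming config is the pydantic-style model config whose dict() carries loading kwargs such as use_auth_token; PeftModel is the real peft API, while model_desc and the helper name come from this thread and are illustrative only:

from peft import PeftModel

def load_peft_adapter(model, model_desc, config):
    # Forward the same loading kwargs (e.g. use_auth_token) that the base model receives.
    return PeftModel.from_pretrained(
        model,
        model_desc.peft_model_id_or_path,
        **config.dict(),
    )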

@Deegue (Contributor, Author) commented Jan 10, 2024

Please also help add **config.dict() at PeftModel.from_pretrained(model, model_desc.peft_model_id_or_path) and DeltaTunerModel.from_pretrained(model, model_desc.peft_model_id_or_path) in both transformer_predictor.py and deepspeed_predictor.py.

IIUC, maybe we should add it like L34, instead of dict(), right?

@KepingYan (Contributor)

IIUC, maybe we should add like L34 instead of dict() right?

OK. Thanks.

@Deegue (Contributor, Author) commented Jan 17, 2024

Gentle ping @KepingYan for another review since all CI passed.
Will remove models "starcoder" and "llama-2-7b-chat-hf" before it is ready to merge.

bot_id: ''
stop_words: []
config:
  use_auth_token: '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'
Contributor

use_auth_token cannot be written directly in the config yaml; it needs to be set in the CI file. @jiafuzha please help confirm this.

Contributor

Yes, strictly speaking we cannot. But it's only a read-only key. If it can pass the GitHub security check, I think we can leave it for now.

Contributor (Author)

If our CI nodes have env.HF_ACCESS_TOKEN configured, I think I can try to get use_auth_token at runtime from the environment instead of passing it in the yaml directly.
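
A minimal sketch of that runtime fallback, assuming the loader sees a plain dict-like config and that HF_ACCESS_TOKEN is the variable configured on the CI nodes (the function name is illustrative):

import os

def resolve_auth_token(config):
    # Prefer an explicit use_auth_token from the yaml; otherwise fall back to the CI environment.
    return config.get("use_auth_token") or os.environ.get("HF_ACCESS_TOKEN")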

Contributor

Yes, it's better.

Maybe we can have a unit test later to verify that use_auth_token is passed correctly. This ticket exposed and fixed the token bug in several places, which shows the real value of CI.
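
A sketch of the kind of unit test suggested here, written against the yaml-rewrite approach used later in this thread; the file layout, keys, and test name mirror the CI snippet below and are assumptions, not the project's actual test suite:

import os
import yaml

def test_use_auth_token_comes_from_env(tmp_path, monkeypatch):
    monkeypatch.setenv("HF_ACCESS_TOKEN", "dummy-token")
    conf_path = tmp_path / "starcoder.yaml"
    conf_path.write_text("model_description:\n  config:\n    use_auth_token: ''\n")

    # Re-apply the same rewrite the CI step performs.
    with open(conf_path, encoding="utf-8") as reader:
        result = yaml.load(reader, Loader=yaml.FullLoader)
    result["model_description"]["config"]["use_auth_token"] = os.environ["HF_ACCESS_TOKEN"]
    with open(conf_path, "w") as output:
        yaml.dump(result, output, sort_keys=False)

    # The reloaded config should now carry the token from the environment.
    with open(conf_path, encoding="utf-8") as reader:
        loaded = yaml.safe_load(reader)
    assert loaded["model_description"]["config"]["use_auth_token"] == "dummy-token"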

Contributor (Author)

OK, removed them from CI. Let's merge this first and I will create a follow-up PR to hide the auth token.

@Deegue changed the title from "[Inference] Add model starcoder and enable llama2" to "[Inference] Fix auth token and add models starcoder and llama2" on Jan 17, 2024
@Deegue (Contributor, Author) commented Jan 19, 2024

All tests passed, auth token is removed from yaml. Removed starcoder and llama2 from CI. Ping @KepingYan @jiafuzha for review, thanks!

@jiafuzha (Contributor)

All tests passed, auth token is removed from yaml. Removed starcoder and llama2 from CI. Ping @KepingYan @jiafuzha for review, thanks!

how is auth_token passed to CI? huggingface-cli login?

@Deegue (Contributor, Author) commented Jan 19, 2024

All tests passed, auth token is removed from yaml. Removed starcoder and llama2 from CI. Ping @KepingYan @jiafuzha for review, thanks!

how is auth_token passed to CI? huggingface-cli login?

I used the environment variable env.HF_ACCESS_TOKEN to rewrite the yaml and pass the auth token:

CMD=$(cat << EOF
import yaml

# Rewrite the model yaml so use_auth_token comes from the CI environment variable.
if ("${{ matrix.model }}" == "starcoder"):
    conf_path = "inference/models/starcoder.yaml"
    with open(conf_path, encoding="utf-8") as reader:
        result = yaml.load(reader, Loader=yaml.FullLoader)
    result['model_description']["config"]["use_auth_token"] = "${{ env.HF_ACCESS_TOKEN }}"
    with open(conf_path, 'w') as output:
        yaml.dump(result, output, sort_keys=False)
if ("${{ matrix.model }}" == "llama-2-7b-chat-hf"):
    conf_path = "inference/models/llama-2-7b-chat-hf.yaml"
    with open(conf_path, encoding="utf-8") as reader:
        result = yaml.load(reader, Loader=yaml.FullLoader)
    result['model_description']["config"]["use_auth_token"] = "${{ env.HF_ACCESS_TOKEN }}"
    with open(conf_path, 'w') as output:
        yaml.dump(result, output, sort_keys=False)
EOF
)
docker exec "${TARGET}" python -c "$CMD"

@Deegue (Contributor, Author) commented Jan 23, 2024

Hi @KepingYan @jiafuzha, could you take a second look and see whether there are further comments?

@KepingYan (Contributor)

[screenshot of the failing CI log]
The environment variable doesn't seem to take effect; please add this step and try again.

- name: Load environment variables
  run: cat /root/actions-runner-config/.env >> $GITHUB_ENV

@Deegue (Contributor, Author) commented Jan 23, 2024

[screenshot of the failing CI log] The environment variable doesn't seem to take effect; please add this step and try again.

- name: Load environment variables
  run: cat /root/actions-runner-config/.env >> $GITHUB_ENV

Thanks @KepingYan. It seems there is no env file on our CI nodes.

@Deegue (Contributor, Author) commented Feb 5, 2024

All tests passed, and starcoder and llama-2-7b-chat-hf are removed from CI. Ping @KepingYan for review.

@KepingYan (Contributor) left a comment

LGTM

@Deegue merged commit 6d72097 into intel:main on Feb 7, 2024
10 checks passed