[Inference] Fix auth token and add models starcoder and llama2 #39
Merged
Changes from 18 commits
33 commits
dc4895f  add starcoder and enable llama2 (Deegue)
cd1a0ef  nit (Deegue)
dacbab3  nit (Deegue)
35e4288  revert (Deegue)
e73cf55  add token (Deegue)
f809782  dedup (Deegue)
5a55e87  add token to from_pretrained (Deegue)
7c2f004  pass auth token to from_pretrained (Deegue)
1c48886  nit (Deegue)
d2651f8  add auth tokens (Deegue)
9ee82ff  Merge branch 'main' into add_starcoder (Deegue)
9f552ba  lint (Deegue)
462164e  Merge branch 'add_starcoder' of https://github.com/Deegue/llm-on-ray … (Deegue)
562913e  fix lint (Deegue)
2d3f7c6  Merge branch 'main' into add_starcoder (Deegue)
836b7f4  nit (Deegue)
0cb47c7  deepspeed not support starcoder (Deegue)
23e8b63  nit (Deegue)
77cedb1  remove from ci (Deegue)
85cb34d  remove direct auth token (Deegue)
8ebbcad  add back ci workflow temporarily (Deegue)
fee7f30  Merge branch 'main' into add_starcoder (Deegue)
ea9e0cc  Merge branch 'add_starcoder' of https://github.com/Deegue/llm-on-ray … (Deegue)
a3be1cd  remove from ci (Deegue)
9b7bff6  add load environment and enable 2 models again (Deegue)
1ede3bb  add dir (Deegue)
469acb5  add load environment and enable 2 models again (Deegue)
2e3b4e2  Merge branch 'add_starcoder' of https://github.com/Deegue/llm-on-ray … (Deegue)
7356b39  change proxy (Deegue)
f171099  revert proxy (Deegue)
5ff00ee  change proxy (Deegue)
c8a53e2  revert proxy (Deegue)
b1e78ad  remove 2 models from ci (Deegue)
@@ -0,0 +1,22 @@
+port: 8000
+name: starcoder
+route_prefix: /starcoder
+cpus_per_worker: 24
+gpus_per_worker: 0
+deepspeed: false
+workers_per_group: 2
+ipex:
+  enabled: false
+  precision: bf16
+device: "cpu"
+model_description:
+  model_id_or_path: bigcode/starcoder
+  tokenizer_name_or_path: bigcode/starcoder
+  chat_processor: ChatModelGptJ
+  prompt:
+    intro: ''
+    human_id: ''
+    bot_id: ''
+    stop_words: []
+  config:
+    use_auth_token: '<redacted>'
`use_auth_token` cannot be written directly in the config YAML; it needs to be set in the CI file. @jiafuzha please help confirm this.
Yes, strictly speaking we cannot. But it's only a read-only key. If it passes the GitHub security check, I think we can leave it for now.
If the environment of our CI nodes has `env.HF_ACCESS_TOKEN` configured, I think I can try to get `use_auth_token` at runtime from the environment instead of passing it in the YAML directly.
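A minimal sketch of that idea, assuming the token is exported as `HF_ACCESS_TOKEN` on the CI nodes and that `model_description` has already been parsed from the YAML into a plain dict. The helper name `resolve_auth_token` and its parameters are hypothetical and are not taken from this PR.

```python
import os
from typing import Any, Dict, Mapping, Optional


def resolve_auth_token(
    model_description: Dict[str, Any],
    env: Optional[Mapping[str, str]] = None,
    env_var: str = "HF_ACCESS_TOKEN",  # assumed variable name, from the comment above
) -> Dict[str, Any]:
    """Return a copy of the `config` section with the token read from the environment.

    If the environment variable is set, it overrides any `use_auth_token`
    value found in the YAML. If it is unset or empty, the config is
    returned unchanged, so nothing has to be committed to the repository.
    """
    env = os.environ if env is None else env
    # Copy so the parsed YAML dict is never mutated in place.
    config = dict(model_description.get("config") or {})
    token = env.get(env_var)
    if token:
        config["use_auth_token"] = token
    return config


# Usage with a stand-in environment instead of the real os.environ.
description = {"model_id_or_path": "bigcode/starcoder", "config": {}}
hf_config = resolve_auth_token(description, env={"HF_ACCESS_TOKEN": "dummy-token"})
print(hf_config)  # {'use_auth_token': 'dummy-token'}
```

The returned dict could then be forwarded as keyword arguments to `from_pretrained`, which is what the commit messages in this PR describe. Note that newer `transformers` releases prefer the `token` argument over `use_auth_token`.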
Yes, that's better.
Maybe we can add a unit test later to verify that `use_auth_token` is passed correctly. This ticket exposed and fixed the token bug in several places, which shows the value of CI.
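Such a test might look roughly like the sketch below. It assumes a loader that forwards the `config` section of the model description to `from_pretrained` as keyword arguments. `load_model` is a hypothetical stand-in for illustration, not the project's actual loading code, and `from_pretrained` is replaced by a mock so no network access or real token is needed.

```python
import unittest
from unittest import mock


def load_model(model_description, from_pretrained):
    """Hypothetical loader: forwards the `config` section as keyword arguments."""
    config = model_description.get("config") or {}
    return from_pretrained(model_description["model_id_or_path"], **config)


class UseAuthTokenTest(unittest.TestCase):
    def test_token_is_forwarded(self):
        # The mock records the call instead of contacting the model hub.
        fake_from_pretrained = mock.Mock(return_value="model")
        description = {
            "model_id_or_path": "bigcode/starcoder",
            "config": {"use_auth_token": "dummy-token"},
        }
        result = load_model(description, fake_from_pretrained)
        self.assertEqual(result, "model")
        fake_from_pretrained.assert_called_once_with(
            "bigcode/starcoder", use_auth_token="dummy-token"
        )

    def test_no_token_when_config_is_missing(self):
        # Without a `config` section, no token keyword should be passed.
        fake_from_pretrained = mock.Mock()
        load_model({"model_id_or_path": "bigcode/starcoder"}, fake_from_pretrained)
        fake_from_pretrained.assert_called_once_with("bigcode/starcoder")
```

The tests can be run with `python -m unittest` against the file that contains them. In the real code base the mock would more likely patch the actual `from_pretrained` call sites.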
OK, removed them from CI. Let's merge this first, and I will create a follow-up PR to hide the auth token.