[Finetune] use base model mpt-7b instead of mpt-7b-chat (#181)
* use base model mpt-7b instead of mpt-7b-chat

Signed-off-by: minmingzhu <[email protected]>

* allow manually specifying the tokenizer

Signed-off-by: minmingzhu <[email protected]>

* update

Signed-off-by: minmingzhu <[email protected]>

* update doc/finetune_parameters.md

Signed-off-by: minmingzhu <[email protected]>

---------

Signed-off-by: minmingzhu <[email protected]>
minmingzhu authored Apr 10, 2024
1 parent bb79869 commit 9182907
Showing 6 changed files with 13 additions and 6 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/night_build_memo.txt
@@ -1 +1 @@
-finetune: gpt2, bigscience/bloom-560m, facebook/opt-125m, mosaicml/mpt-7b-chat, huggyllama/llama-7b
+finetune: gpt2, bigscience/bloom-560m, facebook/opt-125m, mosaicml/mpt-7b, huggyllama/llama-7b
6 changes: 3 additions & 3 deletions .github/workflows/workflow_finetune.yml
@@ -34,7 +34,7 @@ jobs:
 name: finetune
 strategy:
   matrix:
-    model: [ EleutherAI/gpt-j-6b, meta-llama/Llama-2-7b-chat-hf, gpt2, bigscience/bloom-560m, facebook/opt-125m, mosaicml/mpt-7b-chat, meta-llama/Llama-2-7b-hf, mistralai/Mistral-7B-v0.1, google/gemma-2b]
+    model: [ EleutherAI/gpt-j-6b, meta-llama/Llama-2-7b-chat-hf, gpt2, bigscience/bloom-560m, facebook/opt-125m, mosaicml/mpt-7b, meta-llama/Llama-2-7b-hf, mistralai/Mistral-7B-v0.1, google/gemma-2b]
     isPR:
       - ${{inputs.ci_type == 'pr'}}

@@ -92,7 +92,7 @@ jobs:
 with open(conf_path, encoding="utf-8") as reader:
     result = yaml.load(reader, Loader=yaml.FullLoader)
 result['General']['base_model'] = "${{ matrix.model }}"
-if "${{ matrix.model }}" == "mosaicml/mpt-7b-chat":
+if "${{ matrix.model }}" == "mosaicml/mpt-7b":
     result['General']['config']['trust_remote_code'] = True
 else:
     result['General']['config']['trust_remote_code'] = False
@@ -147,7 +147,7 @@ jobs:
 - name: Run Deltatuner Test on DENAS-LoRA Model
   run: |
-    if [[ ${{ matrix.model }} =~ ^(mosaicml\/mpt-7b-chat|huggyllama\/llama-7b|meta-llama\/Llama-2-7b-chat-hf|mistralai\/Mistral-7B-v0.1|google\/gemma-2b)$ ]]; then
+    if [[ ${{ matrix.model }} =~ ^(mosaicml\/mpt-7b|huggyllama\/llama-7b|meta-llama\/Llama-2-7b-chat-hf|mistralai\/Mistral-7B-v0.1|google\/gemma-2b)$ ]]; then
       echo ${{ matrix.model }} is not supported!
     else
       docker exec "finetune" bash -c "rm -rf /tmp/llm-ray/*"
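The patching step above rewrites each matrix run's finetune config in place before the job starts. A minimal standalone sketch of that logic, assuming a local `finetune.yaml` with the same `General` layout (`conf_path` and `model` here are stand-ins for the workflow's `conf_path` and `${{ matrix.model }}`):

```python
import yaml

conf_path = "finetune.yaml"  # stand-in for the workflow's conf_path
model = "mosaicml/mpt-7b"    # stand-in for ${{ matrix.model }}

with open(conf_path, encoding="utf-8") as reader:
    result = yaml.load(reader, Loader=yaml.FullLoader)

result["General"]["base_model"] = model
# Only MPT ships custom modeling code on the Hub, so only it needs trust_remote_code.
result["General"]["config"]["trust_remote_code"] = model == "mosaicml/mpt-7b"

with open(conf_path, "w", encoding="utf-8") as writer:
    yaml.dump(result, writer)
```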
1 change: 1 addition & 0 deletions docs/finetune_parameters.md
@@ -7,6 +7,7 @@ The following are the parameters supported in the finetuning workflow.
 |Configuration Name|Default|Meaning|
 |-|-|-|
 |base_model|EleutherAI/gpt-j-6b|Path to pretrained model or model identifier from huggingface.co/models|
+|tokenizer_name|None|Path to pretrained tokenizer from huggingface.co/models. If not provided, the tokenizer will be loaded from the `base_model`.|
 |gpt_base_model|True|This parameter is for [Transformers#22482](https://github.com/huggingface/transformers/issues/22482). It needs to be set to True when the pretrained model is related to GPT; otherwise it is False.|
 |output_dir|/tmp/llm-ray/output|The output directory to store the finetuned model|
 |checkpoint_dir|/tmp/llm-ray/checkpoint|The directory to store checkpoint|
6 changes: 5 additions & 1 deletion llm_on_ray/finetune/finetune.py
@@ -155,6 +155,10 @@ def train_func(config: Dict[str, Any]):

     gradient_accumulation_steps = config["Training"].get("gradient_accumulation_steps", 1)
     base_model = config["General"]["base_model"]
+    if config["General"].get("tokenizer_name") is not None:
+        tokenizer_name = config["General"].get("tokenizer_name")
+    else:
+        tokenizer_name = base_model
     dataset_file = config["Dataset"]["train_file"]

     seed = config["Training"].get("seed")
@@ -171,7 +175,7 @@

     tokenizer = common.tokenizer.Tokenizer.registory.get("HuggingFaceTokenizer")()(
         config={
-            "name": base_model,
+            "name": tokenizer_name,
             "config": config["General"]["config"],
         }
     )
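Taken together, the two hunks above make tokenizer resolution prefer the pinned `tokenizer_name` and fall back to `base_model`. A simplified sketch of the same behavior, using `transformers.AutoTokenizer` directly rather than the project's `HuggingFaceTokenizer` wrapper (the `config` dict is a hypothetical stand-in for the parsed finetune YAML):

```python
from transformers import AutoTokenizer

# Hypothetical parsed config mirroring the General section of the finetune YAML.
config = {
    "General": {
        "base_model": "mosaicml/mpt-7b",
        "tokenizer_name": "EleutherAI/gpt-neox-20b",
        "config": {"trust_remote_code": True},
    }
}

base_model = config["General"]["base_model"]
if config["General"].get("tokenizer_name") is not None:
    tokenizer_name = config["General"].get("tokenizer_name")
else:
    # No explicit tokenizer: load the one bundled with the base model.
    tokenizer_name = base_model

tokenizer = AutoTokenizer.from_pretrained(tokenizer_name, **config["General"]["config"])
```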
1 change: 1 addition & 0 deletions llm_on_ray/finetune/finetune_config.py
@@ -52,6 +52,7 @@ class DeltatunerConfig(BaseModel):

 class General(BaseModel):
     base_model: str
+    tokenizer_name: Optional[str] = None
     gpt_base_model: bool
     output_dir: str
     checkpoint_dir: Optional[str]
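Because the new field is declared `Optional[str] = None`, existing configs without `tokenizer_name` keep validating unchanged. A minimal pydantic sketch (the `General` class is trimmed here to the fields this hunk touches):

```python
from typing import Optional
from pydantic import BaseModel

class General(BaseModel):
    base_model: str
    tokenizer_name: Optional[str] = None  # new optional field; defaults to None

# Omitting tokenizer_name still validates and yields None,
# which finetune.py then resolves to base_model.
cfg = General(base_model="mosaicml/mpt-7b")
assert cfg.tokenizer_name is None

# Pinning it explicitly overrides the fallback.
cfg = General(base_model="mosaicml/mpt-7b", tokenizer_name="EleutherAI/gpt-neox-20b")
assert cfg.tokenizer_name == "EleutherAI/gpt-neox-20b"
```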
@@ -1,5 +1,6 @@
 General:
-  base_model: mosaicml/mpt-7b-chat
+  base_model: mosaicml/mpt-7b
+  tokenizer_name: EleutherAI/gpt-neox-20b
   gpt_base_model: false
   output_dir: /tmp/llm-ray/output
   checkpoint_dir: /tmp/llm-ray/checkpoint
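Pinning the tokenizer matters here because the mosaicml/mpt-7b model card describes its tokenizer as the EleutherAI/gpt-neox-20b tokenizer, while the MPT checkpoint itself requires `trust_remote_code` for its custom modeling code. A quick sketch of loading the pinned tokenizer on its own (assuming Hugging Face Hub access):

```python
from transformers import AutoTokenizer

# The GPT-NeoX-20B tokenizer needs no remote code, unlike the MPT model itself.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
print(tokenizer("Hello, MPT!").input_ids)
```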
