Thanks for your awesome project. I reproduced FinGPT v3.1.2 (4-bit QLoRA). It works with the default LLM "chatglm2" on Colab, but it comes to a halt when I want to get better results with Llama2.
I changed the model as per your instructions, modifying model_name = "THUDM/chatglm2-6b" to model_name = "daryl149/llama-2-7b-chat-hf".
Then I removed the device setting because of a runtime error.
I also changed target_modules to the llama mapping: target_modules = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING['llama']
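For reference, the changes look roughly like this (a sketch; the LoRA hyperparameters are placeholders, not the exact values from the notebook):

from peft import LoraConfig, TaskType
from peft.utils import TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING

model_name = "daryl149/llama-2-7b-chat-hf"   # was "THUDM/chatglm2-6b"
target_modules = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING['llama']

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                            # placeholder rank
    lora_alpha=32,                  # placeholder scaling
    lora_dropout=0.1,               # placeholder dropout
    target_modules=target_modules,
)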
Unfortunately, the final step raised TypeError: 'NoneType' object cannot be interpreted as an integer. The failing cell is:
writer = SummaryWriter()
trainer = ModifiedTrainer(
model=model,
args=training_args, # Trainer args
train_dataset=dataset["train"], # Training set
eval_dataset=dataset["test"], # Testing set
data_collator=data_collator, # Data Collator
callbacks=[TensorBoardCallback(writer)],
)
trainer.train()
writer.close()
# save model
model.save_pretrained(training_args.output_dir)
The detailed error is as follows:
You are adding a <class 'transformers.integrations.TensorBoardCallback'> to the callbacks of this Trainer, but there is already one. The currentlist of callbacks is
:DefaultFlowCallback
TensorBoardCallback
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-27-d05cf508c134> in <cell line: 11>()
9 callbacks=[TensorBoardCallback(writer)],
10 )
---> 11 trainer.train()
12 writer.close()
13 # save model
6 frames
<ipython-input-25-26476d7038e4> in data_collator(features)
37 ids = ids + [tokenizer.pad_token_id] * (longest - ids_l)
38 _ids = torch.LongTensor(ids)
---> 39 labels_list.append(torch.LongTensor(labels))
40 input_ids.append(_ids)
41 input_ids = torch.stack(input_ids)
TypeError: 'NoneType' object cannot be interpreted as an integer
Could you please help me resolve this issue? Looking forward to your reply!
(Platform: A100 on Google Colab)
Since LlamaTokenizer does not define tokenizer.pad_token_id, its value is None. When this tokenizer is used to build the 'labels' tensors, it raises TypeError: 'NoneType' object cannot be interpreted as an integer.
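This is easy to reproduce in isolation (a minimal sketch of what happens inside the data_collator above; the token ids are arbitrary placeholders):

import torch

pad_token_id = None                 # what LlamaTokenizer reports by default
longest, ids_l = 5, 3
labels = [101, 2023, 2003] + [pad_token_id] * (longest - ids_l)  # padding appends None
torch.LongTensor(labels)            # TypeError: 'NoneType' object cannot be interpreted as an integer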
I assigned the value 0 to tokenizer.pad_token_id and the above error was resolved.
But then I got another error: Llama.forward() got an unexpected keyword argument 'labels'.
To resolve this, I replaced
model = AutoModel.from_pretrained(model_name, quantization_config=q_config, trust_remote_code=True)
with
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=q_config, trust_remote_code=True, device_map="auto")
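Roughly like this, assuming model_name and the 4-bit q_config are the ones defined earlier in the notebook (the BitsAndBytesConfig values below are placeholders):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_name = "daryl149/llama-2-7b-chat-hf"

q_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # placeholder 4-bit QLoRA settings
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# AutoModelForCausalLM attaches the LM head, so forward() accepts `labels`
# and returns a loss, which is what the Trainer expects.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    device_map="auto",
)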
TL;DR: add tokenizer.pad_token_id = 0 in your code
The main problem is that the code, written for ChatGLM, relies on pad_token_id, which the Meta Llama tokenizer does not define. But if we look a little closer at special_tokens_map.json, we can see the following:
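A quick way to inspect it (a sketch; the commented output is what I would expect for this checkpoint, not copied from it):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("daryl149/llama-2-7b-chat-hf")
print(tokenizer.special_tokens_map)
# e.g. {'bos_token': '<s>', 'eos_token': '</s>', 'unk_token': '<unk>'}  -- no pad_token entry
print(tokenizer.unk_token_id)
# 0 for the standard Llama vocabulary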
So we can use the id of unk_token, which is equal to 0, as the pad token. To solve the problem with None, initialise pad_token_id in the tokenizer with the value 0: tokenizer.pad_token_id = 0