
It did not work when I tried to convert the default model "chatglm2" to "llama2" #64

Open
maywind23 opened this issue Aug 29, 2023 · 3 comments
Labels: bug Something isn't working

@maywind23

Thanks for your awesome project. I reproduced FinGPT v3.1.2 (4-bit QLoRA). It works with the default LLM "chatglm2" on Colab, but it comes to a halt when I try to get better results with Llama2.

  • I changed the model as per your instructions, modifying model_name = "THUDM/chatglm2-6b" to model_name = "daryl149/llama-2-7b-chat-hf"

  • Then removed the device argument due to a runtime error:

model = AutoModel.from_pretrained(
        model_name,
        quantization_config=q_config,
        trust_remote_code=True,
        token = access_token,
        # device='cuda'
    )
  • Changed the target_modules to llama:
    target_modules = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING['llama']

  • Unfortunately, the final step raised a TypeError: 'NoneType' object cannot be interpreted as an integer

writer = SummaryWriter()
trainer = ModifiedTrainer(
    model=model,
    args=training_args,             # Trainer args
    train_dataset=dataset["train"], # Training set
    eval_dataset=dataset["test"],   # Testing set
    data_collator=data_collator,    # Data Collator
    callbacks=[TensorBoardCallback(writer)],
)
trainer.train()
writer.close()
# save model
model.save_pretrained(training_args.output_dir)
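For reference, the mapping lookup in the target_modules bullet above resolves to a plain list of module names. The values below mirror peft's defaults around the time of this thread and should be treated as an assumption if your installed peft version differs:

```python
# Pure-data sketch of what the peft mapping lookup above returns.
# Values mirror peft's TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING
# defaults circa 2023 (an assumption; check your installed peft version).
LORA_TARGET_MODULES = {
    "chatglm": ["query_key_value"],  # ChatGLM fuses q/k/v into one projection
    "llama": ["q_proj", "v_proj"],   # Llama exposes separate q/v projections
}

target_modules = LORA_TARGET_MODULES["llama"]
print(target_modules)
```

Because the module names differ per architecture, reusing the chatglm list against a Llama checkpoint would attach LoRA adapters to nothing; switching the mapping key is the right move here.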

The detailed error is as follows:

You are adding a <class 'transformers.integrations.TensorBoardCallback'> to the callbacks of this Trainer, but there is already one. The current list of callbacks is
:DefaultFlowCallback
TensorBoardCallback
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-27-d05cf508c134> in <cell line: 11>()
      9     callbacks=[TensorBoardCallback(writer)],
     10 )
---> 11 trainer.train()
     12 writer.close()
     13 # save model

6 frames
<ipython-input-25-26476d7038e4> in data_collator(features)
     37         ids = ids + [tokenizer.pad_token_id] * (longest - ids_l)
     38         _ids = torch.LongTensor(ids)
---> 39         labels_list.append(torch.LongTensor(labels))
     40         input_ids.append(_ids)
     41     input_ids = torch.stack(input_ids)

TypeError: 'NoneType' object cannot be interpreted as an integer

Could you please help me resolve this issue? Looking forward to your reply!
(Platform: A100 on Google Colab)
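The failure can be reproduced in isolation. A minimal, torch-free sketch (the pad id of None mirrors LlamaTokenizer's default, and the variable names are illustrative):

```python
# Minimal sketch of the failure: with no pad token defined, pad_token_id is
# None, so padding fills `labels` with None values, which cannot become
# elements of an integer tensor.
pad_token_id = None           # LlamaTokenizer's default: no pad token defined
labels = [101, 102, 103]
longest = 5
labels = labels + [pad_token_id] * (longest - len(labels))
# labels is now [101, 102, 103, None, None]

try:
    # Like torch.LongTensor, bytes() requires every element to be an integer
    # and raises the identical error message when it meets a None.
    bytes(labels)
except TypeError as err:
    message = str(err)
    print(message)
```

This is why the traceback points at torch.LongTensor(labels) inside the data collator rather than at the tokenizer itself: the None values are created silently during padding and only blow up at tensor construction.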

@YangletLiu YangletLiu added the bug Something isn't working label Aug 29, 2023
@rajendrac3

I am also facing the same issue. I am running on AWS g5.8xlarge.
Were you able to solve it?

@rajendrac3

Since LlamaTokenizer does not define a pad token, tokenizer.pad_token_id is None.
When this tokenizer is used to calculate the value of 'labels', padding inserts None values and gives the error
TypeError: 'NoneType' object cannot be interpreted as an integer

I assigned the value 0 to tokenizer.pad_token_id and that error was resolved.
But then I got another error: Llama.forward() got an unexpected keyword argument 'labels'

To resolve this I replaced

model = AutoModel.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
)

with

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    device_map="auto",
)

and it worked fine. (AutoModel loads the bare transformer, whose forward() does not accept labels; AutoModelForCausalLM adds the language-modeling head, whose forward() takes labels and computes the loss.)

@IshchenkoRoman commented Feb 23, 2024

TL;DR: add tokenizer.pad_token_id = 0 in your code

The main problem is that the code written for ChatGLM relies on pad_token_id, which the Meta Llama tokenizer does not set. But if we look a little closer at special_tokens_map.json, we can see the following lines:

  "pad_token": "<unk>",
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }

So as the pad_token we can use the id of the unk_token, which is equal to 0. To solve the problem with None, initialise the pad_token_id field of the tokenizer with the value 0: tokenizer.pad_token_id = 0
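That fix can be wrapped in a small helper. A sketch with a stand-in tokenizer class so it runs without downloading any weights (the fallback of 0 assumes Llama's <unk> id, per the special_tokens_map entry above):

```python
def ensure_pad_token_id(tokenizer):
    # Fall back to the <unk> id when no pad token is defined; for Llama,
    # special_tokens_map.json maps pad_token to "<unk>", whose id is 0.
    if tokenizer.pad_token_id is None:
        unk = getattr(tokenizer, "unk_token_id", None)
        tokenizer.pad_token_id = unk if unk is not None else 0
    return tokenizer

class FakeTokenizer:
    # Stand-in mimicking LlamaTokenizer's defaults (illustrative only;
    # a real tokenizer from AutoTokenizer.from_pretrained works the same way).
    pad_token_id = None
    unk_token_id = 0

tok = ensure_pad_token_id(FakeTokenizer())
print(tok.pad_token_id)  # 0
```

Calling this once, right after loading the tokenizer and before building the data collator, makes the padding step in the notebook produce integer labels instead of None values.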
