Thanks for your awesome project. I reproduced FinGPT v3.1.2 (4-bit QLoRA). It works with the default LLM "chatglm2" on Colab, but it comes to a halt when I want to get better results with Llama2.
I changed the model as per your instructions, modifying model_name = "THUDM/chatglm2-6b" to model_name = "daryl149/llama-2-7b-chat-hf".
Then I removed the device setting because of a runtime error.
I also changed target_modules to the llama mapping: target_modules = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING['llama']
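For reference, the changes look roughly like this (a sketch; the LoRA hyperparameters are placeholders, not the exact values from the notebook):

from peft import LoraConfig, TaskType
from peft.utils import TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING

model_name = "daryl149/llama-2-7b-chat-hf"   # was "THUDM/chatglm2-6b"
target_modules = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING['llama']

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                            # placeholder rank
    lora_alpha=32,                  # placeholder scaling
    lora_dropout=0.1,               # placeholder dropout
    target_modules=target_modules,
)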
Unfortunately, the final step raised TypeError: 'NoneType' object cannot be interpreted as an integer. The failing cell is:
writer = SummaryWriter()
trainer = ModifiedTrainer(
model=model,
args=training_args, # Trainer args
train_dataset=dataset["train"], # Training set
eval_dataset=dataset["test"], # Testing set
data_collator=data_collator, # Data Collator
callbacks=[TensorBoardCallback(writer)],
)
trainer.train()
writer.close()
# save model
model.save_pretrained(training_args.output_dir)
The detailed error is as follows:
You are adding a <class 'transformers.integrations.TensorBoardCallback'> to the callbacks of this Trainer, but there is already one. The currentlist of callbacks is
:DefaultFlowCallback
TensorBoardCallback
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-27-d05cf508c134> in <cell line: 11>()
9 callbacks=[TensorBoardCallback(writer)],
10 )
---> 11 trainer.train()
12 writer.close()
13 # save model
6 frames
<ipython-input-25-26476d7038e4> in data_collator(features)
37 ids = ids + [tokenizer.pad_token_id] * (longest - ids_l)
38 _ids = torch.LongTensor(ids)
---> 39 labels_list.append(torch.LongTensor(labels))
40 input_ids.append(_ids)
41 input_ids = torch.stack(input_ids)
TypeError: 'NoneType' object cannot be interpreted as an integer
Could you please help me resolve this issue? Looking forward to your reply!
(Platform: A100 on Google Colab)
Since LlamaTokenizer does not define tokenizer.pad_token_id, its value is None. When this tokenizer is used to build the 'labels' tensors, it raises TypeError: 'NoneType' object cannot be interpreted as an integer.
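This is easy to reproduce in isolation (a minimal sketch of what happens inside the data_collator above; the token ids are arbitrary placeholders):

import torch

pad_token_id = None                 # what LlamaTokenizer reports by default
longest, ids_l = 5, 3
labels = [101, 2023, 2003] + [pad_token_id] * (longest - ids_l)  # padding appends None
torch.LongTensor(labels)            # TypeError: 'NoneType' object cannot be interpreted as an integer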
I assigned the value 0 to tokenizer.pad_token_id and the above error was resolved.
But then I got another error: Llama.forward() got an unexpected keyword argument 'labels'.
To resolve this, I replaced
model = AutoModel.from_pretrained(model_name, quantization_config=q_config, trust_remote_code=True)
with
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=q_config, trust_remote_code=True, device_map="auto")
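Roughly like this, assuming model_name and the 4-bit q_config are the ones defined earlier in the notebook (the BitsAndBytesConfig values below are placeholders):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_name = "daryl149/llama-2-7b-chat-hf"

q_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # placeholder 4-bit QLoRA settings
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# AutoModelForCausalLM attaches the LM head, so forward() accepts `labels`
# and returns a loss, which is what the Trainer expects.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    device_map="auto",
)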
TL;DR: add tokenizer.pad_token_id = 0 in your code
The main problem is that the code, written for ChatGLM, relies on pad_token_id, which the Meta Llama tokenizer does not define. But if we look a little closer at special_tokens_map.json, we can see the following:
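A quick way to inspect it (a sketch; the commented output is what I would expect for this checkpoint, not copied from it):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("daryl149/llama-2-7b-chat-hf")
print(tokenizer.special_tokens_map)
# e.g. {'bos_token': '<s>', 'eos_token': '</s>', 'unk_token': '<unk>'}  -- no pad_token entry
print(tokenizer.unk_token_id)
# 0 for the standard Llama vocabulary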
So we can use the id of unk_token, which is equal to 0, as the pad token. To solve the problem with None, initialise pad_token_id in the tokenizer with the value 0: tokenizer.pad_token_id = 0