
[BUG] size mismatch for lm_head.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint #834

Open
Chitran-0249f opened this issue Dec 24, 2024 · 0 comments
Labels
bug Something isn't working

Comments

Chitran-0249f commented Dec 24, 2024

Prerequisites

  • I have read the documentation.
  • I have checked other issues for similar problems.

Backend

Local

Interface Used

CLI

CLI Command

No response

UI Screenshots & Parameters

Error while fine-tuning
While fine-tuning a Llama 2 7B model with these libraries on a local A6000 GPU setup, I encountered this error:
ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (input_ids in this case) have excessive nesting (inputs type list where type int is expected).

To solve it, I added a special pad_token:

tokenizer.add_special_tokens({"pad_token": "<pad>"})
tokenizer.padding_side = 'left'

Error during inference
Adding the pad_token causes a size mismatch when loading the fine-tuned checkpoint:
size mismatch for lm_head.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
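The mismatch arises because `tokenizer.add_special_tokens` grows the vocabulary from 32000 to 32001 entries, while the freshly loaded base model's `lm_head` still has 32000 rows. A minimal PyTorch sketch (toy dimensions standing in for the real 32000×4096 head, not the actual model) that reproduces the failure and the usual fix of resizing the head before loading:

```python
import torch
import torch.nn as nn

# Toy stand-in for lm_head, with small dims in place of the real
# vocab size 32000 and hidden size 4096.
base_vocab, hidden = 32, 8
lm_head = nn.Linear(hidden, base_vocab, bias=False)  # weight shape: [32, 8]

# Adding "<pad>" to the tokenizer grows the vocab by one, so the
# checkpoint saved after fine-tuning carries a [base_vocab + 1, hidden] weight.
finetuned_state = {"weight": torch.randn(base_vocab + 1, hidden)}

# Loading that checkpoint into the unresized base model fails with the
# same "size mismatch for ... weight" RuntimeError as in the report.
try:
    lm_head.load_state_dict(finetuned_state)
    print("loaded without resize")
except RuntimeError:
    print("size mismatch reproduced")

# Fix: resize the head to the new vocab size before loading the checkpoint.
lm_head = nn.Linear(hidden, base_vocab + 1, bias=False)
lm_head.load_state_dict(finetuned_state)
print("loaded after resize")
```

With transformers, the equivalent step is calling `model.resize_token_embeddings(len(tokenizer))` after `tokenizer.add_special_tokens(...)`, both during training and before loading the fine-tuned checkpoint for inference. Alternatively, setting `tokenizer.pad_token = tokenizer.eos_token` (as the error log suggests) avoids growing the vocabulary at all.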

Error Logs

ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as pad_token (tokenizer.pad_token = tokenizer.eos_token e.g.) or add a new pad token via tokenizer.add_special_tokens({'pad_token': '[PAD]'}).

Additional Information

No response

Chitran-0249f added the bug label on Dec 24, 2024