
[BUG] size mismatch for lm_head.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint #834

Open
Chitran-0249f opened this issue Dec 24, 2024 · 0 comments
Labels
bug Something isn't working

Comments

Chitran-0249f commented Dec 24, 2024

Prerequisites

  • I have read the documentation.
  • I have checked other issues for similar problems.

Backend

Local

Interface Used

CLI

CLI Command

No response

UI Screenshots & Parameters

Error while fine-tuning
While fine-tuning a Llama 2 7B model with these libraries on a local A6000 GPU setup, I encountered this error:
ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (input_ids in this case) have excessive nesting (inputs type list where type int is expected).

To solve it, I added a special pad_token:

tokenizer.add_special_tokens({"pad_token": "<pad>"})
tokenizer.padding_side = 'left'

Error during inference
Adding the pad_token causes a size mismatch when loading the fine-tuned checkpoint:
size mismatch for lm_head.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
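The mismatch arises because `tokenizer.add_special_tokens` grows the vocabulary from 32000 to 32001 entries, while the freshly loaded base model's `lm_head` still has 32000 rows. A minimal PyTorch sketch (toy dimensions standing in for the real 32000×4096 head, not the actual model) that reproduces the failure and the usual fix of resizing the head before loading:

```python
import torch
import torch.nn as nn

# Toy stand-in for lm_head, with small dims in place of the real
# vocab size 32000 and hidden size 4096.
base_vocab, hidden = 32, 8
lm_head = nn.Linear(hidden, base_vocab, bias=False)  # weight shape: [32, 8]

# Adding "<pad>" to the tokenizer grows the vocab by one, so the
# checkpoint saved after fine-tuning carries a [base_vocab + 1, hidden] weight.
finetuned_state = {"weight": torch.randn(base_vocab + 1, hidden)}

# Loading that checkpoint into the unresized base model fails with the
# same "size mismatch for ... weight" RuntimeError as in the report.
try:
    lm_head.load_state_dict(finetuned_state)
    print("loaded without resize")
except RuntimeError:
    print("size mismatch reproduced")

# Fix: resize the head to the new vocab size before loading the checkpoint.
lm_head = nn.Linear(hidden, base_vocab + 1, bias=False)
lm_head.load_state_dict(finetuned_state)
print("loaded after resize")
```

With transformers, the equivalent step is calling `model.resize_token_embeddings(len(tokenizer))` after `tokenizer.add_special_tokens(...)`, both during training and before loading the fine-tuned checkpoint for inference. Alternatively, setting `tokenizer.pad_token = tokenizer.eos_token` (as the error log suggests) avoids growing the vocabulary at all.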

Error Logs

ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as pad_token (tokenizer.pad_token = tokenizer.eos_token e.g.) or add a new pad token via tokenizer.add_special_tokens({'pad_token': '[PAD]'}).

Additional Information

No response

Chitran-0249f added the bug label on Dec 24, 2024