
Fix padding token patching #279

Closed · wants to merge 1 commit

Conversation

taha-yassine
Currently, the condition that checks whether a model's tokenizer has a padding token is wrong and always evaluates to True, so every model gets patched whether it needs it or not.

from nnsight import LanguageModel

model = LanguageModel('google/gemma-2-2b')
# The tokenizer's pad token has been overwritten with the EOS token by the
# patching logic, even though no patching was needed here:
print(model.tokenizer.pad_token)
# <eos>

Normally, hasattr() is called on an object to check whether it has a given attribute, but here it was called on the attribute itself (hasattr(my_object, "my_attribute") vs hasattr(my_object.my_attribute, "my_attribute")). In fact, hasattr() shouldn't be used at all here: even models without a padding token still have a pad_token attribute, it's just set to None, which is exactly what this PR tests for instead.
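
For illustration, here is a minimal, self-contained sketch of the two checks. Tok is a stand-in class, not nnsight's tokenizer, and the exact form of the buggy condition is assumed from the description above rather than copied from the library:

class Tok:
    """Stand-in for a Hugging Face tokenizer; pad_token may be a string or None."""
    def __init__(self, pad_token):
        self.pad_token = pad_token
        self.eos_token = "<eos>"

for tok in (Tok("<pad>"), Tok(None)):
    # Buggy check (assumed form): hasattr() is applied to the attribute's value
    # rather than to the tokenizer, so it is False for both a string and None,
    # and the patch fires for every model.
    if not hasattr(tok.pad_token, "pad_token"):
        print("buggy check patches:", repr(tok.pad_token))

    # Check used by this PR: pad_token always exists as an attribute; it is
    # simply None when no padding token is defined, so test for None instead.
    if tok.pad_token is None:
        tok.pad_token = tok.eos_token
        print("fixed check patches:", repr(tok.pad_token))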

This may be related to #177, but I couldn't reproduce the error there.

@JadenFiotto-Kaufman
Member

@taha-yassine This is fixed on the 0.4 branch in 120691a. Thanks!
