
Fix padding token patching #279

Closed · wants to merge 1 commit

Conversation

taha-yassine
Currently, the condition that checks whether a model's tokenizer has a padding token is wrong and always evaluates to True, so every model gets patched whether it needs it or not.

from nnsight import LanguageModel

model = LanguageModel('google/gemma-2-2b')
# The tokenizer's pad token has been overwritten with the EOS token by the
# patching logic, even though no patching was needed here:
print(model.tokenizer.pad_token)
# <eos>

Normally, hasattr() is called on an object to check whether it has a given attribute, but here it was called on the attribute itself (hasattr(my_object, "my_attribute") vs hasattr(my_object.my_attribute, "my_attribute")). In fact, hasattr() shouldn't be used at all here: even models without a padding token still have a pad_token attribute, it's just set to None, which is exactly what this PR tests for instead.
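
For illustration, here is a minimal, self-contained sketch of the two checks. Tok is a stand-in class, not nnsight's tokenizer, and the exact form of the buggy condition is assumed from the description above rather than copied from the library:

class Tok:
    """Stand-in for a Hugging Face tokenizer; pad_token may be a string or None."""
    def __init__(self, pad_token):
        self.pad_token = pad_token
        self.eos_token = "<eos>"

for tok in (Tok("<pad>"), Tok(None)):
    # Buggy check (assumed form): hasattr() is applied to the attribute's value
    # rather than to the tokenizer, so it is False for both a string and None,
    # and the patch fires for every model.
    if not hasattr(tok.pad_token, "pad_token"):
        print("buggy check patches:", repr(tok.pad_token))

    # Check used by this PR: pad_token always exists as an attribute; it is
    # simply None when no padding token is defined, so test for None instead.
    if tok.pad_token is None:
        tok.pad_token = tok.eos_token
        print("fixed check patches:", repr(tok.pad_token))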

This may be related to #177, but I couldn't reproduce the error there.

@JadenFiotto-Kaufman
Member

@taha-yassine This is fixed on the 0.4 branch in 120691a. Thanks!
