Upgrade Transformers to v4.38.x #654

Merged (12 commits) on Apr 6, 2024
Conversation

lenglaender (Member)

Changes:

@lenglaender requested a review from @calpt on February 23, 2024, 19:09
@@ -149,3 +152,14 @@ def prepare_inputs_for_generation(
}
)
return model_inputs

def _load_pretrained_model(cls, model, state_dict, loaded_keys, *args, **kwargs):
Member
Why is this rewriting of the state dict required for us but not for HF?

Member Author

I believe this is the case because, with plain Hugging Face Transformers, the question answering Llama is loaded through the AutoModelForQuestionAnswering class, which uses the models listed in MODEL_FOR_QUESTION_ANSWERING_MAPPING_NAMES, and this mapping contains LlamaForQuestionAnswering. LlamaForQuestionAnswering uses "transformer" as its base model prefix (https://github.com/huggingface/transformers/blob/2209b7afa04b3a6366350065f541e9248d6663c2/src/transformers/models/llama/modeling_llama.py#L1460).

But we load all models using the same AutoAdapterModel class: in LlamaAdapterModel we set self.model = LlamaModel(config), so the state_dict and model keys must use "model" as the base model prefix.
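To make the mismatch concrete, here is a minimal sketch of the kind of key remapping such an override performs; the helper name and exact logic are illustrative, not the actual patch:

```python
# Illustrative helper (not the PR code): rename checkpoint keys saved with
# HF's LlamaForQuestionAnswering base model prefix ("transformer") so they
# match LlamaAdapterModel, which stores the backbone as `self.model`.
def rename_base_model_prefix(state_dict, old_prefix="transformer", new_prefix="model"):
    renamed = {}
    for key, value in state_dict.items():
        if key.startswith(old_prefix + "."):
            # e.g. "transformer.embed_tokens.weight" -> "model.embed_tokens.weight"
            key = new_prefix + key[len(old_prefix):]
        renamed[key] = value
    return renamed
```

The overridden _load_pretrained_model would apply a remapping of this kind to the incoming state dict (and the corresponding loaded keys) before delegating to the parent implementation.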

@lenglaender (Member Author)

@calpt, as you pointed out, as soon as we load a sharded Llama model via AutoAdapterModel, the head is not added.
But if someone saved a smaller, non-sharded LlamaForQuestionAnswering (e.g. one based on TinyLlama), we can load it. So, for this use case, having the _load_pretrained_model function overridden is useful.
I believe we can merge the sync PR now, right?

We should also add to the docs that sharded models cannot be loaded as flex head models together with their heads.
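For illustration, a rough sketch of the use case being discussed, assuming the _load_pretrained_model override from this PR is in place; the checkpoint name is a placeholder and this is not code from the PR:

```python
from transformers import AutoModelForQuestionAnswering
from adapters import AutoAdapterModel

# Save a small, non-sharded LlamaForQuestionAnswering checkpoint
# (placeholder model name; any small Llama checkpoint would do).
qa_model = AutoModelForQuestionAnswering.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
qa_model.save_pretrained("./tiny-llama-qa")

# Load it as a flex head model: for a non-sharded checkpoint the QA head
# can be converted, whereas for sharded checkpoints the head is not added.
model = AutoAdapterModel.from_pretrained("./tiny-llama-qa")
```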

@calpt (Member) commented Mar 12, 2024

> @calpt, as you pointed out, as soon as we load a sharded Llama model via AutoAdapterModel, the head is not added. But if someone saved a smaller, non-sharded LlamaForQuestionAnswering (e.g. one based on TinyLlama), we can load it. So, for this use case, having the _load_pretrained_model function overridden is useful. I believe we can merge the sync PR now, right?
>
> We should also add to the docs that sharded models cannot be loaded as flex head models together with their heads.

This seems to be a rather complex code patch for something that is likely not really used. I'd rather drop support for LlamaForQA altogether if it doesn't work for sharded checkpoints anyway. Or do you think this is an important use case to support?

@lenglaender (Member Author)

Removed _load_pretrained_model for the LlamaAdapterModel and updated the docs.

While updating the docs, I noticed that our Sphinx version ran into an issue that others have encountered too: sphinx-doc/sphinx#11890
Hence, I pinned the version to 5.0.2. This seems to change nothing; at least, I couldn't find any visual differences in the newly built docs.

I also noticed that since #641 the docs no longer include the add_XXX_head methods for every model; we have to fix this. I didn't find a quick fix, so I suppose we have to address it in a separate PR.

@calpt, I believe we can merge this PR once you approve it.

@calpt (Member) commented Mar 20, 2024

> I also noticed that since #641 the docs no longer include the add_XXX_head methods for every model; we have to fix this. I didn't find a quick fix, so I suppose we have to address it in a separate PR.

In the live version of our docs, we do have those, right? E.g. https://docs.adapterhub.ml/classes/models/llama.html#adapters.LlamaAdapterModel.add_classification_head or https://docs.adapterhub.ml/classes/models/bert.html#adapters.BertAdapterModel.add_dependency_parsing_head.

So is this issue introduced by the changes in this PR?
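For context, the add_XXX_head methods discussed here attach flexible prediction heads to an AutoAdapterModel; a minimal usage sketch, where the checkpoint and head names are placeholders:

```python
from adapters import AutoAdapterModel

# Placeholder checkpoint; any backbone supported by the adapters library works.
model = AutoAdapterModel.from_pretrained("bert-base-uncased")

# add_XXX_head methods attach a flexible prediction head to the backbone;
# these are the methods whose doc entries are being discussed above.
model.add_classification_head("sentiment", num_labels=2)
```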

@calpt (Member) commented Apr 6, 2024

@lenglaender I've fixed the docs build issues.

@calpt changed the title from "Upgrade Transformers to v4.38.1" to "Upgrade Transformers to v4.38.x" on Apr 6, 2024
@calpt added the sync label on Apr 6, 2024
@calpt merged commit a9152e7 into adapter-hub:main on Apr 6, 2024
3 checks passed