Commit

Merge branch 'adapter-hub:main' into implement_vera

julian-fong authored Feb 4, 2025
2 parents f587a1b + 326d071 commit 4fc76aa
Showing 166 changed files with 3,208 additions and 2,227 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/adapter_docs_build.yml
@@ -18,7 +18,7 @@ jobs:
           fetch-depth: 0
       - uses: actions/setup-python@v2
         with:
-          python-version: 3.8
+          python-version: "3.10"
       - name: Install
         run: |
           pip install setuptools==57.4.0
16 changes: 8 additions & 8 deletions .github/workflows/tests_torch.yml
@@ -32,8 +32,8 @@ jobs:
           submodules: true
       - uses: actions/setup-python@v2
         with:
-          python-version: 3.8
-      - uses: actions/cache@v2
+          python-version: "3.10"
+      - uses: actions/cache@v4
         with:
           path: ~/.cache/pip
           key: ${{ runner.os }}-pip-${{ hashFiles('setup.py') }}
@@ -53,8 +53,8 @@ jobs:
           submodules: true
       - uses: actions/setup-python@v2
         with:
-          python-version: 3.8
-      - uses: actions/cache@v2
+          python-version: "3.10"
+      - uses: actions/cache@v4
         with:
           path: ~/.cache/pip
           key: ${{ runner.os }}-pip-${{ hashFiles('setup.py') }}
@@ -76,8 +76,8 @@ jobs:
           submodules: true
       - uses: actions/setup-python@v2
         with:
-          python-version: 3.8
-      - uses: actions/cache@v2
+          python-version: "3.10"
+      - uses: actions/cache@v4
         with:
           path: ~/.cache/pip
           key: ${{ runner.os }}-pip-${{ hashFiles('setup.py') }}
@@ -99,8 +99,8 @@ jobs:
           submodules: true
       - uses: actions/setup-python@v2
         with:
-          python-version: 3.8
-      - uses: actions/cache@v2
+          python-version: "3.10"
+      - uses: actions/cache@v4
         with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('setup.py') }}
19 changes: 15 additions & 4 deletions Makefile
@@ -28,18 +28,29 @@ style:
 	isort $(check_dirs)
 	${MAKE} extra_style_checks
 
-# Run tests for the library
+# Library Tests
+
+# run all tests in the library
 test:
 	python -m pytest -n auto --dist=loadfile -s -v ./tests/
 	python -c "import transformers; print(transformers.__version__)"
 
+# run all tests for the adapter methods for all adapter models
 test-adapter-methods:
-	python -m pytest --ignore ./tests/models -n auto --dist=loadfile -s -v ./tests/
+	python -m pytest -n auto --dist=loadfile -s -v ./tests/test_methods/
 
+# run a subset of the adapter method tests for all adapter models
+# list of all subsets: [core, heads, embeddings, composition, prefix_tuning, prompt_tuning, reft, unipelt, compacter, bottleneck, ia3, lora, config_union]
+subset ?=
+test-adapter-method-subset:
+	@echo "Running subset $(subset)"
+	python -m pytest -n auto --dist=loadfile -s -v ./tests/test_methods/ -m $(subset)
+
+
+# run the Hugging Face test suite for all adapter models
 test-adapter-models:
-	python -m pytest -n auto --dist=loadfile -s -v ./tests/models
+	python -m pytest -n auto --dist=loadfile -s -v ./tests/test_models/
 
 # Run tests for examples
 
 test-examples:
 	python -m pytest -n auto --dist=loadfile -s -v ./examples/pytorch/
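Usage note (editorial, not part of the commit): the new subset target is invoked by setting the `subset` variable to one of the markers listed above, for example:

```
make test-adapter-method-subset subset=lora
```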
6 changes: 3 additions & 3 deletions README.md
@@ -32,7 +32,7 @@ A Unified Library for Parameter-Efficient and Modular Transfer Learning
 <a href="https://arxiv.org/abs/2311.11077">Paper</a>
 </h3>
 
-![Tests](https://github.com/Adapter-Hub/adapters/workflows/Tests/badge.svg?branch=adapters)
+![Tests](https://github.com/Adapter-Hub/adapters/workflows/Tests/badge.svg)
 [![GitHub](https://img.shields.io/github/license/adapter-hub/adapters.svg?color=blue)](https://github.com/adapter-hub/adapters/blob/main/LICENSE)
 [![PyPI](https://img.shields.io/pypi/v/adapters)](https://pypi.org/project/adapters/)
 
@@ -45,7 +45,7 @@ _Adapters_ provides a unified interface for efficient fine-tuning and modular transfer learning
 
 ## Installation
 
-`adapters` currently supports **Python 3.8+** and **PyTorch 1.10+**.
+`adapters` currently supports **Python 3.9+** and **PyTorch 2.0+**.
 After [installing PyTorch](https://pytorch.org/get-started/locally/), you can install `adapters` from PyPI ...
 
 ```
@@ -147,7 +147,7 @@ Currently, adapters integrates all architectures and methods listed below:
 
 | Method | Paper(s) | Quick Links |
 | --- | --- | --- |
-| Bottleneck adapters | [Houlsby et al. (2019)](https://arxiv.org/pdf/1902.00751.pdf)<br> [Bapna and Firat (2019)](https://arxiv.org/pdf/1909.08478.pdf) | [Quickstart](https://docs.adapterhub.ml/quickstart.html), [Notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/01_Adapter_Training.ipynb) |
+| Bottleneck adapters | [Houlsby et al. (2019)](https://arxiv.org/pdf/1902.00751.pdf)<br> [Bapna and Firat (2019)](https://arxiv.org/pdf/1909.08478.pdf)<br> [Steitz and Roth (2024)](https://openaccess.thecvf.com/content/CVPR2024/papers/Steitz_Adapters_Strike_Back_CVPR_2024_paper.pdf) | [Quickstart](https://docs.adapterhub.ml/quickstart.html), [Notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/01_Adapter_Training.ipynb) |
 | AdapterFusion | [Pfeiffer et al. (2021)](https://aclanthology.org/2021.eacl-main.39.pdf) | [Docs: Training](https://docs.adapterhub.ml/training.html#train-adapterfusion), [Notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/03_Adapter_Fusion.ipynb) |
 | MAD-X,<br> Invertible adapters | [Pfeiffer et al. (2020)](https://aclanthology.org/2020.emnlp-main.617/) | [Notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/04_Cross_Lingual_Transfer.ipynb) |
 | AdapterDrop | [Rücklé et al. (2021)](https://arxiv.org/pdf/2010.11918.pdf) | [Notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/05_Adapter_Drop_Training.ipynb) |
5 changes: 5 additions & 0 deletions conftest.py
@@ -87,3 +87,8 @@ def check_output(self, want, got, optionflags):
 
 
 doctest.OutputChecker = CustomOutputChecker
+
+
+def pytest_collection_modifyitems(items):
+    # Exclude the 'test_class' group from the test collection since it's not a real test class but a byproduct of the generic test class generation.
+    items[:] = [item for item in items if 'test_class' not in item.nodeid]
2 changes: 2 additions & 0 deletions docs/adapter_composition.md
@@ -125,6 +125,8 @@ model.active_adapters = ac.Fuse("d", "e", "f")
 
 To learn how training an _AdapterFusion_ layer works, check out [this Colab notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/03_Adapter_Fusion.ipynb) from the `adapters` repo.
 
+To save and upload the full composition setup with adapters and fusion layer in one line of code, check out the docs on [saving and loading adapter compositions](loading.md#saving-and-loading-adapter-compositions).
+
 ### Retrieving AdapterFusion attentions
 
 Finally, it is possible to retrieve the attention scores computed by each fusion layer in a forward pass of the model.
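As a minimal sketch of the attention retrieval described in that last context line (editorial, not part of the commit; it assumes the `output_adapter_fusion_attentions` forward-pass argument from the adapters docs):

```python
from transformers import AutoTokenizer

# Minimal sketch: inspect fusion attention scores in a forward pass.
# Assumes `model` already holds adapters "d", "e", "f" with a fusion layer and
# `model.active_adapters = ac.Fuse("d", "e", "f")`, as in the surrounding docs.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
inputs = tokenizer("Adapters are modular.", return_tensors="pt")
outputs = model(**inputs, output_adapter_fusion_attentions=True)

# The scores come back as a nested dict of attention tensors per fusion layer.
print(outputs.adapter_fusion_attentions)
```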
2 changes: 1 addition & 1 deletion docs/installation.md
@@ -1,7 +1,7 @@
 # Installation
 
 The `adapters` package is designed as an add-on for Hugging Face's Transformers library.
-It currently supports Python 3.8+ and PyTorch 1.10+. You will have to [install PyTorch](https://pytorch.org/get-started/locally/) first.
+It currently supports Python 3.9+ and PyTorch 2.0+. You will have to [install PyTorch](https://pytorch.org/get-started/locally/) first.
 
 ```{eval-rst}
 .. important::
36 changes: 36 additions & 0 deletions docs/loading.md
@@ -94,3 +94,39 @@ We will go through the different arguments and their meaning one by one:
 To load the adapter using a custom name, we can use the `load_as` parameter.
 
 - Finally, `set_active` will directly activate the loaded adapter for usage in each model forward pass. Otherwise, you have to manually activate the adapter via `set_active_adapters()`.
+
+## Saving and loading adapter compositions
+
+In addition to saving and loading individual adapters, you can also save, load and share entire [compositions of adapters](adapter_composition.md) with a single line of code.
+_Adapters_ provides three methods for this purpose that work very similarly to those for single adapters:
+
+- [`save_adapter_setup()`](adapters.ModelWithHeadsAdaptersMixin.save_adapter_setup) to save an adapter composition along with prediction heads to the local file system.
+- [`load_adapter_setup()`](adapters.ModelWithHeadsAdaptersMixin.load_adapter_setup) to load a saved adapter composition from the local file system or the Model Hub.
+- [`push_adapter_setup_to_hub()`](adapters.hub_mixin.PushAdapterToHubMixin.push_adapter_setup_to_hub) to upload an adapter setup along with prediction heads to the Model Hub. See our [Hugging Face Model Hub guide](huggingface_hub.md) for more.
+
+As an example, this is how you would save and load an AdapterFusion setup of three adapters with a prediction head:
+
+```python
+# Create an AdapterFusion
+model = AutoAdapterModel.from_pretrained("bert-base-uncased")
+model.load_adapter("sentiment/sst-2@ukp", config=SeqBnConfig(), with_head=False)
+model.load_adapter("nli/multinli@ukp", config=SeqBnConfig(), with_head=False)
+model.load_adapter("sts/qqp@ukp", config=SeqBnConfig(), with_head=False)
+model.add_adapter_fusion(["sst-2", "mnli", "qqp"])
+model.add_classification_head("clf_head")
+adapter_setup = Fuse("sst-2", "mnli", "qqp")
+head_setup = "clf_head"
+model.set_active_adapters(adapter_setup)
+model.active_head = head_setup
+
+# Train AdapterFusion ...
+
+# Save
+model.save_adapter_setup("checkpoint", adapter_setup, head_setup=head_setup)
+
+# Push to Hub
+model.push_adapter_setup_to_hub("<user>/fusion_setup", adapter_setup, head_setup=head_setup)
+
+# Re-load
+# model.load_adapter_setup("checkpoint", set_active=True)
+```
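A brief aside on the commented-out re-load above (editorial, not part of the commit): loading the saved composition back into a fresh model instance could look roughly like this, assuming the `"checkpoint"` directory written by `save_adapter_setup()`:

```python
from adapters import AutoAdapterModel

# Illustrative sketch: re-load the saved composition into a fresh model
# instance and activate it, mirroring the commented line in the docs example.
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.load_adapter_setup("checkpoint", set_active=True)
```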
2 changes: 1 addition & 1 deletion docs/quickstart.md
@@ -105,7 +105,7 @@ model = AutoAdapterModel.from_pretrained(example_path)
 model.load_adapter(example_path)
 ```
 
-Similar to how the weights of the full model are saved, the `save_adapter()` will create a file for saving the adapter weights and a file for saving the adapter configuration in the specified directory.
+Similar to how the weights of the full model are saved, [`save_adapter()`](adapters.ModelWithHeadsAdaptersMixin.save_adapter) will create a file for saving the adapter weights and a file for saving the adapter configuration in the specified directory.
 
 Finally, if we have finished working with adapters, we can restore the base Transformer to its original form by deactivating and deleting the adapter:
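For context on the `save_adapter()` call referenced in that hunk, a minimal usage sketch (editorial, not part of the commit; the directory and adapter names are placeholders):

```python
from adapters import AutoAdapterModel

# Minimal sketch: saving an adapter writes a weights file plus a config file
# into the given directory, analogous to saving a full model checkpoint.
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("my_adapter")
model.save_adapter("./example_path", "my_adapter")
```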
4 changes: 4 additions & 0 deletions docs/training.md
@@ -223,3 +223,7 @@ trainer = AdapterTrainer(
 _Adapters_ supports fine-tuning of quantized language models similar to [QLoRA (Dettmers et al., 2023)](https://arxiv.org/pdf/2305.14314.pdf) via the `bitsandbytes` library integrated into Transformers.
 Quantized training is supported for LoRA-based adapters as well as bottleneck adapters and prefix tuning.
 Please refer to [this notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/QLoRA_Llama_Finetuning.ipynb) for a hands-on guide.
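A minimal sketch of the QLoRA-style setup described in those context lines (editorial, not part of the commit; it assumes a CUDA device, the `bitsandbytes` package, and an example model name — see the linked notebook for the full recipe):

```python
import torch
import adapters
from adapters import LoRAConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the base model in 4-bit NF4 quantization via bitsandbytes.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=quant_config,
    device_map="auto",
)
adapters.init(model)  # make the plain Transformers model adapter-aware
model.add_adapter("qlora_adapter", config=LoRAConfig())
model.train_adapter("qlora_adapter")  # freeze quantized base weights, train only the adapter
```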

+## Gradient Checkpointing
+Gradient checkpointing is supported for all models (e.g. Llama 1/2/3) except those not supported by Hugging Face Transformers (like ALBERT). Please refer to [this notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/Gradient_Checkpointing_Llama.ipynb) for a hands-on guide.
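As a minimal sketch of enabling gradient checkpointing for adapter training (editorial, not part of the commit; the model name is just an example):

```python
import adapters
from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("gpt2")
adapters.init(model)
model.add_adapter("my_adapter")
model.train_adapter("my_adapter")

# Option 1: enable gradient checkpointing directly on the model.
model.gradient_checkpointing_enable()

# Option 2: let the trainer enable it via the standard training arguments.
args = TrainingArguments(output_dir="./out", gradient_checkpointing=True)
```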

2 changes: 1 addition & 1 deletion examples/pytorch/language-modeling/run_clm.py
@@ -442,7 +442,7 @@ def main():
     else:
         model = AutoModelForCausalLM.from_config(config, trust_remote_code=model_args.trust_remote_code)
         n_params = sum({p.data_ptr(): p.numel() for p in model.parameters()}.values())
-        logger.info(f"Training new model from scratch - Total size={n_params/2**20:.2f}M params")
+        logger.info(f"Training new model from scratch - Total size={n_params / 2**20:.2f}M params")
 
     # Convert the model into an adapter model
     adapters.init(model)
2 changes: 1 addition & 1 deletion hf_transformers
Submodule hf_transformers updated 679 files