Commit c3d78cd
Author: Nathan Lambert
Date: Jul 13, 2022
Parent: 2a69c0b

* first pass at docs structure
* minor reformatting, add github actions for docs
* populate docs (primarily from README, some writing)

Showing 16 changed files with 527 additions and 0 deletions.
@@ -0,0 +1,17 @@
name: Build documentation

on:
  push:
    branches:
      - main
      - doc-builder*
      - v*-release

jobs:
  build:
    uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main
    with:
      commit_sha: ${{ github.sha }}
      package: diffusers
    secrets:
      token: ${{ secrets.HUGGINGFACE_PUSH }}
@@ -0,0 +1,16 @@
name: Build PR Documentation

on:
  pull_request:

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  build:
    uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@main
    with:
      commit_sha: ${{ github.event.pull_request.head.sha }}
      pr_number: ${{ github.event.number }}
      package: diffusers
@@ -0,0 +1,13 @@
name: Delete dev documentation

on:
  pull_request:
    types: [ closed ]


jobs:
  delete:
    uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@main
    with:
      pr_number: ${{ github.event.number }}
      package: diffusers
@@ -0,0 +1,40 @@
- sections:
  - local: index
    title: 🧨 Diffusers
  - local: quicktour
    title: Quicktour
  - local: philosophy
    title: Philosophy
  title: Get started
- sections:
  - sections:
    - local: examples/diffusers_for_vision
      title: Diffusers for Vision
    - local: examples/diffusers_for_audio
      title: Diffusers for Audio
    - local: examples/diffusers_for_other
      title: Diffusers for Other Modalities
    title: Examples
  title: Using Diffusers
- sections:
  - sections:
    - local: pipelines
      title: Pipelines
    - local: schedulers
      title: Schedulers
    - local: models
      title: Models
    title: Main Classes
  - sections:
    - local: pipelines/glide
      title: "Glide"
    title: Pipelines
  - sections:
    - local: schedulers/ddpm
      title: "DDPM"
    title: Schedulers
  - sections:
    - local: models/unet
      title: "Unet"
    title: Models
  title: API
@@ -0,0 +1,13 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Diffusers for audio
@@ -0,0 +1,20 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Diffusers for other modalities

Diffusers offers support for modalities other than vision and audio.
Currently, some examples include:
- [Diffuser](https://diffusion-planning.github.io/) for planning in reinforcement learning (currently inference only): [](https://colab.research.google.com/drive/1TmBmlYeKUZSkUZoJqfBmaicVTKx6nN1R?usp=sharing)

If you are interested in contributing to under-construction examples, you can explore:
- [GeoDiff](https://github.com/MinkaiXu/GeoDiff) for generating 3D configurations of molecule diagrams: [](https://colab.research.google.com/drive/1pLYYWQhdLuv1q-JtEHGZybxp2RBF8gPs?usp=sharing)
@@ -0,0 +1,149 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Diffusers for vision

## Direct image generation

#### **Example image generation with PNDM**
```python
from diffusers import PNDM, UNetModel, PNDMScheduler
import PIL.Image
import numpy as np
import torch

model_id = "fusing/ddim-celeba-hq"

# load model and scheduler
model = UNetModel.from_pretrained(model_id)
scheduler = PNDMScheduler()

# compose them into a pipeline
pndm = PNDM(unet=model, noise_scheduler=scheduler)

# run pipeline in inference (sample random noise and denoise)
with torch.no_grad():
    image = pndm()

# process image to PIL
image_processed = image.cpu().permute(0, 2, 3, 1)
image_processed = (image_processed + 1.0) / 2
image_processed = torch.clamp(image_processed, 0.0, 1.0)
image_processed = image_processed * 255
image_processed = image_processed.numpy().astype(np.uint8)
image_pil = PIL.Image.fromarray(image_processed[0])

# save image
image_pil.save("test.png")
```

#### **Example 1024x1024 image generation with SDE VE**

See [paper](https://arxiv.org/abs/2011.13456) for more information on SDE VE.

```python
from diffusers import DiffusionPipeline
import torch
import PIL.Image
import numpy as np

torch.manual_seed(32)

score_sde_ve = DiffusionPipeline.from_pretrained("fusing/ffhq_ncsnpp")

# Note this might take up to 3 minutes on a GPU
image = score_sde_ve(num_inference_steps=2000)

image = image.permute(0, 2, 3, 1).cpu().numpy()
image = np.clip(image * 255, 0, 255).astype(np.uint8)
image_pil = PIL.Image.fromarray(image[0])

# save image
image_pil.save("test.png")
```
#### **Example 32x32 image generation with SDE VP**

See [paper](https://arxiv.org/abs/2011.13456) for more information on SDE VP.

```python
from diffusers import DiffusionPipeline
import torch
import PIL.Image
import numpy as np

torch.manual_seed(32)

score_sde_vp = DiffusionPipeline.from_pretrained("fusing/cifar10-ddpmpp-deep-vp")

# Note this might take up to 3 minutes on a GPU
image = score_sde_vp(num_inference_steps=1000)

image = image.permute(0, 2, 3, 1).cpu().numpy()
image = np.clip(image * 255, 0, 255).astype(np.uint8)
image_pil = PIL.Image.fromarray(image[0])

# save image
image_pil.save("test.png")
```

#### **Text to Image generation with Latent Diffusion**

_Note: To use latent diffusion install transformers from [this branch](https://github.com/patil-suraj/transformers/tree/ldm-bert)._

```python
from diffusers import DiffusionPipeline
import torch
import PIL.Image
import numpy as np

ldm = DiffusionPipeline.from_pretrained("fusing/latent-diffusion-text2im-large")

generator = torch.manual_seed(42)

prompt = "A painting of a squirrel eating a burger"
image = ldm([prompt], generator=generator, eta=0.3, guidance_scale=6.0, num_inference_steps=50)

# process image to PIL
image_processed = image.cpu().permute(0, 2, 3, 1)
image_processed = image_processed * 255.0
image_processed = image_processed.numpy().astype(np.uint8)
image_pil = PIL.Image.fromarray(image_processed[0])

# save image
image_pil.save("test.png")
```

## Text to speech generation

```python
import torch
from diffusers import BDDMPipeline, GradTTSPipeline

torch_device = "cuda"

# load grad tts and bddm pipelines
grad_tts = GradTTSPipeline.from_pretrained("fusing/grad-tts-libri-tts")
bddm = BDDMPipeline.from_pretrained("fusing/diffwave-vocoder-ljspeech")

text = "Hello world, I missed you so much."

# generate mel spectrograms from text
mel_spec = grad_tts(text, torch_device=torch_device)

# generate the speech by passing the mel spectrograms to the BDDM pipeline
generator = torch.manual_seed(42)
audio = bddm(mel_spec, generator, torch_device=torch_device)

# save generated audio
from scipy.io.wavfile import write as wavwrite

sampling_rate = 22050
wavwrite("generated_audio.wav", sampling_rate, audio.squeeze().cpu().numpy())
```

@@ -0,0 +1,110 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

<p align="center">
  <br>
  <img src="https://raw.githubusercontent.com/huggingface/diffusers/77aadfee6a891ab9fcfb780f87c693f7a5beeb8e/docs/source/imgs/diffusers_library.jpg" width="400"/>
  <br>
</p>

# 🧨 Diffusers

🤗 Diffusers provides pretrained diffusion models across multiple modalities, such as vision and audio, and serves
as a modular toolbox for inference and training of diffusion models.

More precisely, 🤗 Diffusers offers:

- State-of-the-art diffusion pipelines that can be run in inference with just a couple of lines of code (see [src/diffusers/pipelines](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines)).
- Various noise schedulers that can be used interchangeably for the preferred speed vs. quality trade-off in inference (see [src/diffusers/schedulers](https://github.com/huggingface/diffusers/tree/main/src/diffusers/schedulers)).
- Multiple types of models, such as UNet, that can be used as building blocks in an end-to-end diffusion system (see [src/diffusers/models](https://github.com/huggingface/diffusers/tree/main/src/diffusers/models)).
- Training examples to show how to train the most popular diffusion models (see [examples](https://github.com/huggingface/diffusers/tree/main/examples)).
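
To get a feel for how these pieces fit together, here is a minimal sketch that mirrors the PNDM example in the vision docs; the class names and model id are taken from that example and may differ in your installed version of the library.

```python
# Minimal sketch of the model / scheduler / pipeline split, mirroring the PNDM
# example from the vision docs (class names and model id come from that example
# and are not guaranteed for every version of diffusers).
from diffusers import PNDM, UNetModel, PNDMScheduler

# a model is one building block of a diffusion system
model = UNetModel.from_pretrained("fusing/ddim-celeba-hq")

# a scheduler defines the denoising procedure and can be swapped independently
scheduler = PNDMScheduler()

# a pipeline wires the model and scheduler together for end-to-end inference
pndm = PNDM(unet=model, noise_scheduler=scheduler)
```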

# Installation

Install 🤗 Diffusers with PyTorch support. Support for other frameworks will come in the future.

🤗 Diffusers is tested on Python 3.6+ and PyTorch 1.4.0+.

## Install with pip

You should install 🤗 Diffusers in a [virtual environment](https://docs.python.org/3/library/venv.html).
If you're unfamiliar with Python virtual environments, take a look at this [guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).
A virtual environment makes it easier to manage different projects, and avoid compatibility issues between dependencies.

Start by creating a virtual environment in your project directory:

```bash
python -m venv .env
```

Activate the virtual environment:

```bash
source .env/bin/activate
```

Now you're ready to install 🤗 Diffusers with the following command:

```bash
pip install diffusers
```

## Install from source

Install 🤗 Diffusers from source with the following command:

```bash
pip install git+https://github.com/huggingface/diffusers
```

This command installs the bleeding edge `main` version rather than the latest `stable` version.
The `main` version is useful for staying up-to-date with the latest developments, for instance if a bug has been fixed since the last official release but a new release hasn't been rolled out yet.
However, this means the `main` version may not always be stable.
We strive to keep the `main` version operational, and most issues are usually resolved within a few hours or a day.
If you run into a problem, please open an [Issue](https://github.com/huggingface/diffusers/issues) so we can fix it even sooner!
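
As a quick check of which version you ended up with, you can print the package version; source installs typically carry a dev suffix (for example something like `0.1.0.dev0`), while PyPI releases report a plain version number. This snippet is illustrative and not part of the original instructions:

```python
# Illustrative check (not from the original docs): source installs usually
# report a ".dev" version string, PyPI releases do not.
import diffusers

print(diffusers.__version__)
```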

## Editable install

You will need an editable install if you'd like to:

* Use the `main` version of the source code.
* Contribute to 🤗 Diffusers and need to test changes in the code.

Clone the repository and install 🤗 Diffusers with the following commands:

```bash
git clone https://github.com/huggingface/diffusers.git
cd diffusers
pip install -e .
```

These commands link the cloned repository folder to your Python library paths.
Python will now look inside the folder you cloned to in addition to the normal library paths.
For example, if your Python packages are typically installed in `~/anaconda3/envs/main/lib/python3.7/site-packages/`, Python will also search the folder you cloned to: `~/diffusers/`.
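
To confirm the editable install resolves to your clone rather than a copy in `site-packages`, you can check where Python imports the package from. This is just a sanity check, not part of the official instructions:

```python
# Sanity check (not from the original docs): with an editable install, the
# reported path should point inside the folder you cloned (e.g. under
# ~/diffusers/), not inside site-packages.
import diffusers

print(diffusers.__file__)
```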

<Tip warning={true}>

You must keep the `diffusers` folder if you want to keep using the library.

</Tip>

Now you can easily update your clone to the latest version of 🤗 Diffusers with the following command:

```bash
cd ~/diffusers/
git pull
```

Your Python environment will find the `main` version of 🤗 Diffusers on the next run.