Commit

* first pass at docs structure

* minor reformatting, add github actions for docs

* populate docs (primarily from README, some writing)
Nathan Lambert authored Jul 13, 2022
1 parent 2a69c0b commit c3d78cd
Showing 16 changed files with 527 additions and 0 deletions.
17 changes: 17 additions & 0 deletions .github/workflows/build_documentation.yml
@@ -0,0 +1,17 @@
name: Build documentation

on:
  push:
    branches:
      - main
      - doc-builder*
      - v*-release

jobs:
  build:
    uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main
    with:
      commit_sha: ${{ github.sha }}
      package: diffusers
    secrets:
      token: ${{ secrets.HUGGINGFACE_PUSH }}
16 changes: 16 additions & 0 deletions .github/workflows/build_pr_documentation.yml
@@ -0,0 +1,16 @@
name: Build PR Documentation

on:
  pull_request:

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  build:
    uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@main
    with:
      commit_sha: ${{ github.event.pull_request.head.sha }}
      pr_number: ${{ github.event.number }}
      package: diffusers
13 changes: 13 additions & 0 deletions .github/workflows/delete_doc_comment.yml
@@ -0,0 +1,13 @@
name: Delete dev documentation

on:
  pull_request:
    types: [ closed ]

jobs:
  delete:
    uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@main
    with:
      pr_number: ${{ github.event.number }}
      package: diffusers
40 changes: 40 additions & 0 deletions docs/source/_toctree.yml
@@ -0,0 +1,40 @@
- sections:
  - local: index
    title: 🧨 Diffusers
  - local: quicktour
    title: Quicktour
  - local: philosophy
    title: Philosophy
  title: Get started
- sections:
  - sections:
    - local: examples/diffusers_for_vision
      title: Diffusers for Vision
    - local: examples/diffusers_for_audio
      title: Diffusers for Audio
    - local: examples/diffusers_for_other
      title: Diffusers for Other Modalities
    title: Examples
  title: Using Diffusers
- sections:
  - sections:
    - local: pipelines
      title: Pipelines
    - local: schedulers
      title: Schedulers
    - local: models
      title: Models
    title: Main Classes
  - sections:
    - local: pipelines/glide
      title: "GLIDE"
    title: Pipelines
  - sections:
    - local: schedulers/ddpm
      title: "DDPM"
    title: Schedulers
  - sections:
    - local: models/unet
      title: "UNet"
    title: Models
  title: API
13 changes: 13 additions & 0 deletions docs/source/examples/diffusers_for_audio.mdx
@@ -0,0 +1,13 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Diffusers for audio
20 changes: 20 additions & 0 deletions docs/source/examples/diffusers_for_other.mdx
@@ -0,0 +1,20 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Diffusers for other modalities

Diffusers supports modalities beyond vision and audio.
Currently, some examples include:
- [Diffuser](https://diffusion-planning.github.io/) for planning in reinforcement learning (currently inference only): [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1TmBmlYeKUZSkUZoJqfBmaicVTKx6nN1R?usp=sharing)

If you are interested in contributing to under-construction examples, you can explore:
- [GeoDiff](https://github.com/MinkaiXu/GeoDiff) for generating 3D conformations of molecules [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1pLYYWQhdLuv1q-JtEHGZybxp2RBF8gPs?usp=sharing).
149 changes: 149 additions & 0 deletions docs/source/examples/diffusers_for_vision.mdx
@@ -0,0 +1,149 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Diffusers for vision

## Direct image generation

#### **Example image generation with PNDM**

```python
from diffusers import PNDM, UNetModel, PNDMScheduler
import PIL.Image
import numpy as np
import torch

model_id = "fusing/ddim-celeba-hq"

# load model and scheduler
model = UNetModel.from_pretrained(model_id)
scheduler = PNDMScheduler()

# create the PNDM pipeline from the model and scheduler
pndm = PNDM(unet=model, noise_scheduler=scheduler)

# run pipeline in inference (sample random noise and denoise)
with torch.no_grad():
    image = pndm()

# process image to PIL
image_processed = image.cpu().permute(0, 2, 3, 1)
image_processed = (image_processed + 1.0) / 2
image_processed = torch.clamp(image_processed, 0.0, 1.0)
image_processed = image_processed * 255
image_processed = image_processed.numpy().astype(np.uint8)
image_pil = PIL.Image.fromarray(image_processed[0])

# save image
image_pil.save("test.png")
```

#### **Example 1024x1024 image generation with SDE VE**

See [paper](https://arxiv.org/abs/2011.13456) for more information on SDE VE.

```python
from diffusers import DiffusionPipeline
import torch
import PIL.Image
import numpy as np

torch.manual_seed(32)

score_sde_ve = DiffusionPipeline.from_pretrained("fusing/ffhq_ncsnpp")

# note: this might take up to 3 minutes on a GPU
image = score_sde_ve(num_inference_steps=2000)

image = image.permute(0, 2, 3, 1).cpu().numpy()
image = np.clip(image * 255, 0, 255).astype(np.uint8)
image_pil = PIL.Image.fromarray(image[0])

# save image
image_pil.save("test.png")
```

#### **Example 32x32 image generation with SDE VP**

See [paper](https://arxiv.org/abs/2011.13456) for more information on SDE VP.

```python
from diffusers import DiffusionPipeline
import torch
import PIL.Image
import numpy as np

torch.manual_seed(32)

score_sde_vp = DiffusionPipeline.from_pretrained("fusing/cifar10-ddpmpp-deep-vp")

# note: this might take up to 3 minutes on a GPU
image = score_sde_vp(num_inference_steps=1000)

image = image.permute(0, 2, 3, 1).cpu().numpy()
image = np.clip(image * 255, 0, 255).astype(np.uint8)
image_pil = PIL.Image.fromarray(image[0])

# save image
image_pil.save("test.png")
```


#### **Text to Image generation with Latent Diffusion**

_Note: To use latent diffusion, install transformers from [this branch](https://github.com/patil-suraj/transformers/tree/ldm-bert), e.g. `pip install git+https://github.com/patil-suraj/transformers.git@ldm-bert`._

```python
import torch
import PIL.Image
import numpy as np

from diffusers import DiffusionPipeline

ldm = DiffusionPipeline.from_pretrained("fusing/latent-diffusion-text2im-large")

generator = torch.manual_seed(42)

prompt = "A painting of a squirrel eating a burger"
image = ldm([prompt], generator=generator, eta=0.3, guidance_scale=6.0, num_inference_steps=50)

image_processed = image.cpu().permute(0, 2, 3, 1)
image_processed = image_processed * 255.
image_processed = image_processed.numpy().astype(np.uint8)
image_pil = PIL.Image.fromarray(image_processed[0])

# save image
image_pil.save("test.png")
```


## Text-to-speech generation

```python
import torch
from scipy.io.wavfile import write as wavwrite

from diffusers import BDDMPipeline, GradTTSPipeline

torch_device = "cuda"

# load grad tts and bddm pipelines
grad_tts = GradTTSPipeline.from_pretrained("fusing/grad-tts-libri-tts")
bddm = BDDMPipeline.from_pretrained("fusing/diffwave-vocoder-ljspeech")

text = "Hello world, I missed you so much."

# generate mel spectrograms from text
mel_spec = grad_tts(text, torch_device=torch_device)

# generate speech by passing the mel spectrograms to the BDDM pipeline
generator = torch.manual_seed(42)
audio = bddm(mel_spec, generator, torch_device=torch_device)

# save the generated audio as a WAV file
sampling_rate = 22050
wavwrite("generated_audio.wav", sampling_rate, audio.squeeze().cpu().numpy())
```

110 changes: 110 additions & 0 deletions docs/source/index.mdx
@@ -0,0 +1,110 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

<p align="center">
<br>
<img src="https://raw.githubusercontent.com/huggingface/diffusers/77aadfee6a891ab9fcfb780f87c693f7a5beeb8e/docs/source/imgs/diffusers_library.jpg" width="400"/>
<br>
</p>

# 🧨 Diffusers


🤗 Diffusers provides pretrained diffusion models across multiple modalities, such as vision and audio, and serves
as a modular toolbox for inference and training of diffusion models.

More precisely, 🤗 Diffusers offers:

- State-of-the-art diffusion pipelines that can be run in inference with just a couple of lines of code (see [src/diffusers/pipelines](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines)).
- Various noise schedulers that can be used interchangeably for the prefered speed vs. quality trade-off in inference (see [src/diffusers/schedulers](https://github.com/huggingface/diffusers/tree/main/src/diffusers/schedulers)).
- Multiple types of models, such as UNet, that can be used as building blocks in an end-to-end diffusion system (see [src/diffusers/models](https://github.com/huggingface/diffusers/tree/main/src/diffusers/models)).
- Training examples to show how to train the most popular diffusion models (see [examples](https://github.com/huggingface/diffusers/tree/main/examples)).
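For a quick feel for how these pieces fit together, here is a minimal sketch of the pipeline API (the model id `fusing/ddpm-lsun-church` is illustrative; substitute any compatible checkpoint from the Hub):

```python
import torch

from diffusers import DiffusionPipeline

# load a pretrained pipeline from the Hub (illustrative model id)
ddpm = DiffusionPipeline.from_pretrained("fusing/ddpm-lsun-church")

# sample an image tensor from random noise
with torch.no_grad():
    image = ddpm()
```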

# Installation

Install 🤗 Diffusers with PyTorch support. Support for other frameworks will come in the future.

🤗 Diffusers is tested on Python 3.6+ and PyTorch 1.4.0+.

## Install with pip

You should install 🤗 Diffusers in a [virtual environment](https://docs.python.org/3/library/venv.html).
If you're unfamiliar with Python virtual environments, take a look at this [guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).
A virtual environment makes it easier to manage different projects and avoid compatibility issues between dependencies.

Start by creating a virtual environment in your project directory:

```bash
python -m venv .env
```

Activate the virtual environment:

```bash
source .env/bin/activate
```

Now you're ready to install 🤗 Diffusers with the following command:

```bash
pip install diffusers
```
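As a quick sanity check, you can import the package and print its version:

```python
import diffusers

# prints the installed version string
print(diffusers.__version__)
```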

## Install from source

Install 🤗 Diffusers from source with the following command:

```bash
pip install git+https://github.com/huggingface/diffusers
```

This command installs the bleeding edge `main` version rather than the latest `stable` version.
The `main` version is useful for staying up to date with the latest developments, for instance when a bug has been fixed since the last official release but a new release hasn't been rolled out yet.
However, this means the `main` version may not always be stable.
We strive to keep the `main` version operational, and most issues are usually resolved within a few hours or a day.
If you run into a problem, please open an [Issue](https://github.com/huggingface/diffusers/issues) so we can fix it even sooner!

## Editable install

You will need an editable install if you'd like to:

* Use the `main` version of the source code.
* Contribute to 🤗 Diffusers and need to test changes in the code.

Clone the repository and install 🤗 Diffusers with the following commands:

```bash
git clone https://github.com/huggingface/diffusers.git
cd diffusers
pip install -e .
```

These commands link your Python library paths to the folder you cloned the repository into.
Python will now look inside the folder you cloned to in addition to the normal library paths.
For example, if your Python packages are typically installed in `~/anaconda3/envs/main/lib/python3.7/site-packages/`, Python will also search the folder you cloned to: `~/diffusers/`.
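To confirm that Python resolves the editable install, you can check the module path; it should point into your clone (for example `~/diffusers/src/diffusers/__init__.py`, assuming you cloned into `~/diffusers`):

```python
import diffusers

# with an editable install, this path points into the cloned repository
print(diffusers.__file__)
```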

<Tip warning={true}>

You must keep the `diffusers` folder if you want to keep using the library.

</Tip>

Now you can easily update your clone to the latest version of 🤗 Diffusers with the following command:

```bash
cd ~/diffusers/
git pull
```

Your Python environment will find the `main` version of 🤗 Diffusers on the next run.
