Merge pull request #129 from huggingface/main
Merge changes
Skquark authored Nov 27, 2023
2 parents 4156158 + 50a749e commit 39afd41
Showing 101 changed files with 6,772 additions and 414 deletions.
6 changes: 5 additions & 1 deletion .github/workflows/pr_test_fetcher.yml
@@ -1,4 +1,4 @@
name: Fast tests for PRs
name: Fast tests for PRs - Test Fetcher

on:
  pull_request:
@@ -14,6 +14,10 @@ env:
  MKL_NUM_THREADS: 4
  PYTEST_TIMEOUT: 60

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  setup_pr_tests:
    name: Setup PR Tests
4 changes: 4 additions & 0 deletions .github/workflows/push_tests_fast.yml
@@ -5,6 +5,10 @@ on:
    branches:
      - main

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

env:
  DIFFUSERS_IS_CI: yes
  HF_HOME: /mnt/cache
4 changes: 4 additions & 0 deletions .github/workflows/push_tests_mps.yml
@@ -13,6 +13,10 @@ env:
  PYTEST_TIMEOUT: 600
  RUN_SLOW: no

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  run_fast_tests_apple_m1:
    name: Fast PyTorch MPS tests on MacOS
2 changes: 2 additions & 0 deletions docs/source/en/_toctree.yml
@@ -278,6 +278,8 @@
        title: Kandinsky 2.1
      - local: api/pipelines/kandinsky_v22
        title: Kandinsky 2.2
      - local: api/pipelines/kandinsky3
        title: Kandinsky 3
      - local: api/pipelines/latent_consistency_models
        title: Latent Consistency Models
      - local: api/pipelines/latent_diffusion
24 changes: 24 additions & 0 deletions docs/source/en/api/pipelines/kandinsky3.md
@@ -0,0 +1,24 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Kandinsky 3

TODO
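
As a placeholder, here is a minimal text-to-image sketch. The checkpoint id `kandinsky-community/kandinsky-3` and the generation settings below are assumptions, not taken from this page:

```py
import torch
from diffusers import Kandinsky3Pipeline

# Checkpoint id is an assumption; verify it on the Hub before running.
pipe = Kandinsky3Pipeline.from_pretrained(
    "kandinsky-community/kandinsky-3", torch_dtype=torch.float16
)
pipe.enable_model_cpu_offload()  # the T5 text encoder is large; offloading saves VRAM

prompt = "A photograph of the inside of a subway train. There are raccoons sitting on the seats."
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("kandinsky3.png")
```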

## Kandinsky3Pipeline

[[autodoc]] Kandinsky3Pipeline
- all
- __call__

## Kandinsky3Img2ImgPipeline

[[autodoc]] Kandinsky3Img2ImgPipeline
- all
- __call__
106 changes: 106 additions & 0 deletions docs/source/en/api/pipelines/pixart.md
@@ -35,6 +35,112 @@ Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers)

</Tip>

## Inference with under 8GB GPU VRAM

Run the [`PixArtAlphaPipeline`] with under 8GB GPU VRAM by loading the text encoder in 8-bit precision. Let's walk through a full-fledged example.

First, install the [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) library:

```bash
pip install -U bitsandbytes
```

Then load the text encoder in 8-bit:

```python
from transformers import T5EncoderModel
from diffusers import PixArtAlphaPipeline
import torch

text_encoder = T5EncoderModel.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS",
    subfolder="text_encoder",
    load_in_8bit=True,
    device_map="auto",
)
pipe = PixArtAlphaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS",
    text_encoder=text_encoder,
    transformer=None,
    device_map="auto"
)
```

Now, use the `pipe` to encode a prompt:

```python
with torch.no_grad():
prompt = "cute cat"
prompt_embeds, prompt_attention_mask, negative_embeds, negative_prompt_attention_mask = pipe.encode_prompt(prompt)
```

Since the text embeddings have been computed, remove the `text_encoder` and `pipe` from memory and free up some GPU VRAM:

```python
import gc

def flush():
    gc.collect()
    torch.cuda.empty_cache()

del text_encoder
del pipe
flush()
```

Then compute the latents with the prompt embeddings as inputs:

```python
pipe = PixArtAlphaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS",
    text_encoder=None,
    torch_dtype=torch.float16,
).to("cuda")

latents = pipe(
    negative_prompt=None,
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_embeds,
    prompt_attention_mask=prompt_attention_mask,
    negative_prompt_attention_mask=negative_prompt_attention_mask,
    num_images_per_prompt=1,
    output_type="latent",
).images

del pipe.transformer
flush()
```

<Tip>

Notice that while initializing `pipe`, you're setting `text_encoder` to `None` so that it's not loaded.

</Tip>

Once the latents are computed, pass them to the VAE to decode into a real image:

```python
with torch.no_grad():
    image = pipe.vae.decode(latents / pipe.vae.config.scaling_factor, return_dict=False)[0]

image = pipe.image_processor.postprocess(image, output_type="pil")[0]
image.save("cat.png")
```

By deleting components you aren't using and flushing the GPU VRAM, you should be able to run [`PixArtAlphaPipeline`] with under 8GB GPU VRAM.

![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/pixart/8bits_cat.png)

If you want a report of your memory usage, run this [script](https://gist.github.com/sayakpaul/3ae0f847001d342af27018a96f467e4e).
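
For a quick inline check instead, you can read the peak-allocation counter that PyTorch tracks (a minimal sketch, not the linked script):

```python
# Peak GPU memory allocated by this process so far, in GB (CUDA only).
memory = torch.cuda.max_memory_allocated() / 1024**3
print(f"Max memory allocated: {memory:.3f} GB")
```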

<Tip warning={true}>

Text embeddings computed in 8-bit can impact the quality of the generated images because of the information loss in the representation space caused by the reduced precision. It's recommended to compare the outputs with and without 8-bit.

</Tip>

While loading the `text_encoder`, you set `load_in_8bit` to `True`. You could also specify `load_in_4bit` to bring your memory requirements down even further to under 7GB.
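
For reference, the text-encoder loading step from above with 4-bit quantization might look like this (a sketch; the rest of the walkthrough is unchanged):

```python
text_encoder = T5EncoderModel.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS",
    subfolder="text_encoder",
    load_in_4bit=True,  # instead of load_in_8bit=True
    device_map="auto",
)
```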

## PixArtAlphaPipeline

[[autodoc]] PixArtAlphaPipeline
2 changes: 1 addition & 1 deletion docs/source/en/api/schedulers/score_sde_vp.md
@@ -25,4 +25,4 @@ The abstract from the paper is:
</Tip>

## ScoreSdeVpScheduler
[[autodoc]] schedulers.scheduling_sde_vp.ScoreSdeVpScheduler
[[autodoc]] schedulers.deprecated.scheduling_sde_vp.ScoreSdeVpScheduler
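
This appears to be a pure module relocation; with the new path, an explicit import would look like the sketch below (the dotted path is inferred from the autodoc reference above):

```python
# Import path inferred from the autodoc reference; the class itself is unchanged.
from diffusers.schedulers.deprecated.scheduling_sde_vp import ScoreSdeVpScheduler
```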
2 changes: 1 addition & 1 deletion docs/source/en/api/schedulers/stochastic_karras_ve.md
@@ -18,4 +18,4 @@ specific language governing permissions and limitations under the License.
[[autodoc]] KarrasVeScheduler

## KarrasVeOutput
[[autodoc]] schedulers.scheduling_karras_ve.KarrasVeOutput
[[autodoc]] schedulers.deprecated.scheduling_karras_ve.KarrasVeOutput
49 changes: 18 additions & 31 deletions docs/source/en/using-diffusers/unconditional_image_generation.md
@@ -14,54 +14,41 @@ specific language governing permissions and limitations under the License.

[[open-in-colab]]

Unconditional image generation is a relatively straightforward task. The model only generates images - without any additional context like text or an image - resembling the training data it was trained on.
Unconditional image generation generates images that look like a random sample from the model's training data because the denoising process isn't guided by any additional context like text or an image.

The [`DiffusionPipeline`] is the easiest way to use a pre-trained diffusion system for inference.
To get started, use the [`DiffusionPipeline`] to load the [anton-l/ddpm-butterflies-128](https://huggingface.co/anton-l/ddpm-butterflies-128) checkpoint to generate images of butterflies. The [`DiffusionPipeline`] downloads and caches all the model components required to generate an image.

Start by creating an instance of [`DiffusionPipeline`] and specify which pipeline checkpoint you would like to download.
You can use any of the 🧨 Diffusers [checkpoints](https://huggingface.co/models?library=diffusers&sort=downloads) from the Hub (the checkpoint you'll use generates images of butterflies).
```py
from diffusers import DiffusionPipeline

generator = DiffusionPipeline.from_pretrained("anton-l/ddpm-butterflies-128").to("cuda")
image = generator().images[0]
image
```

<Tip>

💡 Want to train your own unconditional image generation model? Take a look at the training [guide](../training/unconditional_training) to learn how to generate your own images.
Want to generate images of something else? Take a look at the training [guide](../training/unconditional_training) to learn how to train a model to generate your own images.

</Tip>

In this guide, you'll use [`DiffusionPipeline`] for unconditional image generation with [DDPM](https://arxiv.org/abs/2006.11239):

```python
from diffusers import DiffusionPipeline

generator = DiffusionPipeline.from_pretrained("anton-l/ddpm-butterflies-128", use_safetensors=True)
```
The output image is a [`PIL.Image`](https://pillow.readthedocs.io/en/stable/reference/Image.html?highlight=image#the-image-class) object that can be saved:

The [`DiffusionPipeline`] downloads and caches all modeling, tokenization, and scheduling components.
Because the model consists of roughly 1.4 billion parameters, we strongly recommend running it on a GPU.
You can move the generator object to a GPU, just like you would in PyTorch:

```python
generator.to("cuda")
```py
image.save("generated_image.png")
```

Now you can use the `generator` to generate an image:
You can also try experimenting with the `num_inference_steps` parameter, which controls the number of denoising steps. More denoising steps typically produce a higher quality image, but generation takes longer.

```python
image = generator().images[0]
```py
image = generator(num_inference_steps=100).images[0]
image
```

The output is by default wrapped into a [`PIL.Image`](https://pillow.readthedocs.io/en/stable/reference/Image.html?highlight=image#the-image-class) object.

You can save the image by calling:

```python
image.save("generated_image.png")
```

Try out the Spaces below, and feel free to play around with the inference steps parameter to see how it affects the image quality!
Try out the Space below to generate an image of a butterfly!

<iframe
src="https://stevhliu-ddpm-butterflies-128.hf.space"
src="https://stevhliu-unconditional-image-generation.hf.space"
frameborder="0"
width="850"
height="500"