
Support ControlNet LoRA #1936

Closed · wants to merge 1 commit
Conversation

@huchenlei (Collaborator) commented Aug 21, 2023

This PR ports the following change from ComfyUI (comfyanonymous/ComfyUI@d6e4b34).

There are several things this PR aims to achieve:

  1. Correctly recognize ControlNet LoRA and convert its state_dict to acceptable format
  2. Use custom linear, conv_nd operations when a ControlNet LoRA is loaded
  3. Create yaml configs for ControlNet LoRA models
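Regarding points 1 and 2: a ControlNet LoRA ships low-rank up/down factors rather than full weights, and the custom ops apply those factors on the fly. Below is a minimal numpy sketch of the idea; the names `lora_down`/`lora_up` and the `alpha` scale are illustrative, not the actual state_dict keys.

```python
import numpy as np

def lora_linear(x, weight, lora_down, lora_up, alpha=1.0):
    """Linear layer with a low-rank LoRA delta applied on the fly.

    The effective weight is W + alpha * (up @ down), but we never
    materialize the full delta: we route x through the two small
    rank-r factors instead.
    """
    base = x @ weight.T
    delta = (x @ lora_down.T) @ lora_up.T  # rank-r path
    return base + alpha * delta

rng = np.random.default_rng(0)
d_in, d_out, rank = 8, 4, 2
x = rng.standard_normal((3, d_in))
W = rng.standard_normal((d_out, d_in))
down = rng.standard_normal((rank, d_in))  # projects down to rank
up = rng.standard_normal((d_out, rank))   # projects back up

y = lora_linear(x, W, down, up, alpha=0.5)

# Equivalent to merging the delta into W once, up front:
y_merged = x @ (W + 0.5 * (up @ down)).T
assert np.allclose(y, y_merged)
```

The equivalence check at the end shows why both strategies work: either apply the factors inside a custom op on every forward pass, or merge `alpha * up @ down` into the base weight once.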

Questions to be answered:

  1. Will this approach support composing multiple ControlNet LoRAs?
  2. Some code is ported from a GPL-licensed repo (Comfy). We probably need to either rewrite it or contact Comfy to resolve the license conflict.

scripts/hook.py Outdated
@@ -763,7 +769,7 @@ def hacked_group_norm_forward(self, *args, **kwargs):
gn_modules = [model.middle_block]
model.middle_block.gn_weight = 0

- input_block_indices = [4, 5, 7, 8, 10, 11]
+ input_block_indices = [4, 5, 7, 8, 10, 11] if not getattr(process.sd_model, 'is_sdxl', False) else [4, 5, 7, 8]
@huchenlei (Collaborator, Author):
SDXL only has 9 input blocks.

scripts/hook.py Outdated
@@ -600,8 +606,8 @@ def forward(self, x, timesteps=None, context=None, **kwargs):
h = aligned_adding(h, total_controlnet_embedding.pop(), require_inpaint_hijack)

# U-Net Decoder
- for i, module in enumerate(self.output_blocks):
-     h = th.cat([h, aligned_adding(hs.pop(), total_controlnet_embedding.pop(), require_inpaint_hijack)], dim=1)
+ for module, (hs_item, controlnet_embedding_item) in zip(self.output_blocks, reversed(list(zip(hs, total_controlnet_embedding)))):
@huchenlei (Collaborator, Author):
hs has 9 tensor elements, while total_controlnet_embedding has 10 tensor elements plus 2 zero paddings at the end.

This change aligns the two arrays, but I am not sure it is the correct fix here.
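As a toy illustration of the alignment (string stand-ins, not real tensors): `zip` stops at the shorter sequence, so the extra elements at the tail of `total_controlnet_embedding` are dropped, and `reversed` yields the pairs in decoder order.

```python
# Toy stand-ins: 9 skip connections vs. 10 embeddings + 2 zero pads.
hs = [f"h{i}" for i in range(9)]
total_controlnet_embedding = [f"c{i}" for i in range(10)] + [0, 0]

# zip() truncates to the 9 shared positions, discarding the extra
# embedding and the zero padding at the tail; reversed() then walks
# the pairs deepest-first, matching the decoder's iteration order.
pairs = list(reversed(list(zip(hs, total_controlnet_embedding))))

assert len(pairs) == 9
assert pairs[0] == ("h8", "c8")   # deepest skip connection first
assert pairs[-1] == ("h0", "c0")  # shallowest last
```

Whether this index-wise pairing (h_i with c_i) is the semantically correct alignment is exactly the uncertainty noted in the comment above.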

@huchenlei (Collaborator, Author):
That impl does not handle a trailing 0 for the middle block weights. Revised the impl to calculate the array length based on the model type.

@huchenlei (Collaborator, Author)

Currently blocked on a transformer arch issue. See details in #1933 (comment)

In Comfy's implementation, the transformer depth is different at each level.

label_emb is another issue, as there are weights for that field in the LoRA state dict.

@huchenlei (Collaborator, Author)

OK, I have verified that the LoRA can run on A1111; it just requires a change in ldm's SpatialTransformer. I will look for a workaround later that does not require modifying the ldm source.

[Screenshot: Screen Capture 022 - Stable Diffusion - localhost]

@huchenlei (Collaborator, Author) commented Aug 22, 2023

ComfyUI has the following lines doing some extra adjustment on the input emb:

        if self.num_classes is not None:
            assert y.shape[0] == x.shape[0]
            emb = emb + self.label_emb(y)

where label_emb is a network that converts adm to emb's shape:

self.label_emb = nn.Sequential(
    nn.Sequential(
        linear(adm_in_channels, time_embed_dim, dtype=self.dtype),
        nn.SiLU(),
        linear(time_embed_dim, time_embed_dim, dtype=self.dtype),
    )
)

And adm in comfy is calculated differently for refiner and base SDXL models:
https://github.com/comfyanonymous/ComfyUI/blob/763b0cf024c8fd462343ab0a8cfdab099714168b/comfy/model_base.py#L157C1-L207

I am pretty sure something similar must also exist in A1111, and we just need to perform some mapping to get it into a format that the label_emb network accepts.
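At the shape level, label_emb is just a two-layer MLP that maps the adm vector into the timestep-embedding space, after which it is added to emb. Here is a numpy stand-in for the nn.Sequential above; the 2816/1280 dimensions are what I believe SDXL base uses, so treat them as an assumption.

```python
import numpy as np

def silu(x):
    # SiLU activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def label_emb(y, w1, b1, w2, b2):
    """Two-layer MLP: adm_in_channels -> time_embed_dim -> time_embed_dim."""
    h = silu(y @ w1 + b1)
    return h @ w2 + b2

rng = np.random.default_rng(0)
batch, adm_in_channels, time_embed_dim = 2, 2816, 1280  # assumed SDXL base sizes
y = rng.standard_normal((batch, adm_in_channels))       # stand-in for adm
emb = rng.standard_normal((batch, time_embed_dim))      # timestep embedding

# Random small weights just to exercise the shapes.
w1 = rng.standard_normal((adm_in_channels, time_embed_dim)) * 0.01
b1 = np.zeros(time_embed_dim)
w2 = rng.standard_normal((time_embed_dim, time_embed_dim)) * 0.01
b2 = np.zeros(time_embed_dim)

emb = emb + label_emb(y, w1, b1, w2, b2)  # the `emb + self.label_emb(y)` step
assert emb.shape == (batch, time_embed_dim)
```

The key point is that the shapes line up: whatever adm A1111 computes for SDXL, it only needs to match adm_in_channels for label_emb to consume it.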

However, from an experiment commenting out the + self.label_emb(y) line, the results do not seem that different using the base model (left: no label_emb; right: with label_emb):

[Screenshot: Screen Capture 023 - Compare the difference between images - Diff Checker - www.diffchecker.com]

Both results follow the guidance map, so we can probably live without it for now.

@huchenlei (Collaborator, Author)

Some testing on rank-128 size-S models:

Canny:
[images: tmprsib75w5, 00012-1]

Depth:
[images: tmp9zngl_yq, 00013-1]

Re-color:
[images: tmpw4aoqgxs, 00015-1]

Using multiple units seems to be fine, but produces results with generally worse quality. (Kinda expected?)

@lllyasviel PTAL

@FurkanGozukara

thank you so much amazing work

@George0726 (Contributor)

thank you so much amazing and fast work

@lllyasviel (Collaborator) commented Aug 23, 2023

I need to check whether something is wrong in the impl before it gets merged; the saturation seems a bit weird, and I am not sure whether some of those artifacts are limitations of the released models. Also, we should merge this after webui 1.6.x, since there are many differences between 1.5.x and 1.6.x.

@huchenlei (Collaborator, Author)

> I need to take a look if there is something wrong in the impl before it is going to merge, the saturation seems a bit weird and I am not sure that some of those artifacts are limitations of the released models. Also, we should begin to merge those after webui 1.6.x since there are much differences between 1.5.x and 1.6.x

I tested the generations under the 1.5.1 release tag and the current HEAD of the dev branch; both produce similar results.

I am not sure about the saturation issue. You can check out the branch and do some testing yourself. My prompt is very simple here:
1man holding microphone
neg: abstract, black and white

The ComfyUI results I attached earlier are probably not comparable, as the workflow they offer is img2img. Here are some canny img2img generation results with A1111:
[image: grid-0000]

Commits:
- 🚧 handle lora statedict
- 🔧 Fix yaml config issue
- Workaround all obvious errors
- Correctly overwrite Linear/Conv2D
- wip
- 🐛 Fix transformer depth issue
- 🐛 Fix SDXL middle block missing issue
- rename config
- nits
@lllyasviel (Collaborator)

Where did you find control-lora-sdxl.yaml?

@lllyasviel (Collaborator)

Wait, it seems that they are using a LoRA to construct a ControlNet to control XL...
Why don't they directly use a LoRA to control XL?

@FurkanGozukara

> wait. it seems that they are using a lora to construct a controlnet to control xl ... why dont they directly use a lora to control xl?

I don't think they are as pro as you.

So many people right now are expecting ControlNet for auto1111.

@lllyasviel (Collaborator)

to be continued in #1952

@lllyasviel closed this Aug 23, 2023
@huchenlei (Collaborator, Author)

> where you find control-lora-sdxl.yaml?

I dumped the config out of Comfy and made that yaml.

Labels: enhancement (New feature or request)