
Question about Mask #31

Open
firekeepers opened this issue Aug 22, 2023 · 5 comments

Comments

@firekeepers

Thank you for sharing this work!
I wonder how the code achieves mask extraction and uses the mask to guide cross-attention maps.
I checked the demo code but couldn't find the related code.

@ljzycmd
Collaborator

ljzycmd commented Aug 22, 2023

Hi @firekeepers, the code of mask-guided mutual self-attention can be found

class MutualSelfAttentionControlMask(MutualSelfAttentionControl):
and
class MutualSelfAttentionControlMaskAuto(MutualSelfAttentionControl):

Note that the mask is used to restrict query regions during the mutual self-attention process, rather than to guide cross-attention maps. You can either supply externally extracted masks with the MutualSelfAttentionControlMask editor, or have masks automatically extracted from cross-attention maps with the MutualSelfAttentionControlMaskAuto editor.
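The restriction described above can be sketched as follows. This is a minimal illustration of the idea (not the repository's actual implementation): foreground queries are only allowed to attend to foreground keys, and background queries to background keys, by masking out the cross-region attention logits. All names and shapes here are assumptions.

```python
import torch

def masked_mutual_self_attention(q_tgt, k_src, v_src, mask, scale):
    """Restrict which source tokens each target query may attend to.

    q_tgt: (B, L, D) queries from the target (edited) branch
    k_src: (B, L, D) keys from the source branch
    v_src: (B, L, D) values from the source branch
    mask:  (B, L) boolean foreground mask over spatial tokens
    """
    sim = torch.einsum("bid,bjd->bij", q_tgt, k_src) * scale
    # A query/key pair is allowed only if both tokens lie in the same
    # region (both foreground or both background).
    allowed = mask[:, :, None] == mask[:, None, :]      # (B, L, L)
    sim = sim.masked_fill(~allowed, float("-inf"))      # forbid cross-region attention
    attn = sim.softmax(dim=-1)
    return torch.einsum("bij,bjd->bid", attn, v_src)
```

The diagonal is always allowed (a token matches its own region), so every softmax row has at least one finite logit.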

@firekeepers
Author


Thanks for your reply! I have another question: how can I use this cross-attention mask to guide img2img generation? I tried some experiments on img2img generation guided by the mask, but found only a weak correlation between the source image and the generated image, and the results were poor [image], no matter which prompt I used to guide the DDIM inversion process.
Looking forward to your reply, thanks!

@ljzycmd
Collaborator

ljzycmd commented Aug 24, 2023

Hi @firekeepers, the failure is attributed to the fact that the initial noise obtained with DDIM inversion cannot reconstruct the source image faithfully in some cases. You may refer to #30 (comment) for more detailed explanations.
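For context, standard deterministic DDIM inversion can be sketched as below. It re-uses the noise estimate at the current step to take a step toward higher noise; that re-use (and any mismatch with the later sampling trajectory, e.g. classifier-free guidance with scale > 1) is what makes faithful reconstruction fail in some cases. The `eps_model` signature here is a hypothetical placeholder.

```python
import torch

@torch.no_grad()
def ddim_invert(latents, timesteps, alphas_cumprod, eps_model):
    """Deterministic DDIM inversion: run the DDIM update backwards, x_t -> x_{t+1}.

    eps_model(x, t) predicts the noise (hypothetical signature);
    alphas_cumprod[t] is the cumulative alpha-bar schedule.
    """
    x = latents
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        eps = eps_model(x, t_cur)
        a_cur, a_next = alphas_cumprod[t_cur], alphas_cumprod[t_next]
        # Predicted clean sample from the current point.
        x0 = (x - (1 - a_cur).sqrt() * eps) / a_cur.sqrt()
        # Step to the next (noisier) level re-using the same eps estimate;
        # this approximation is why the inverted noise may not reproduce
        # the source image when the sampling trajectory differs.
        x = a_next.sqrt() * x0 + (1 - a_next).sqrt() * eps
    return x
```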

@firekeepers
Author


If I understand correctly, the cross-attention mask is computed from the source prompt and the source generated image, so a target image generated from a slightly modified target prompt keeps some relationship with the source input and can be guided by the cross-attention mask?
If so, is this attention mask better suited to T2I and weaker in I2I, because we can't build a strong relationship between the target prompt (or a description of the real source image) and the source image?
Is there a method to build a stronger attention mask to guide I2I generation?
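The mask construction being discussed (the automatic variant) boils down to thresholding the cross-attention map of a chosen prompt token. A minimal sketch of that idea, with assumed names, shapes, and threshold:

```python
import torch

def mask_from_cross_attention(attn_maps, token_idx, threshold=0.5):
    """Build a binary spatial mask from cross-attention maps.

    attn_maps: (heads, H*W, n_tokens) cross-attention probabilities.
    token_idx: index of the prompt token that localizes the edited object.
    """
    # Average over heads, then take the attention column for the chosen token.
    m = attn_maps.mean(dim=0)[:, token_idx]           # (H*W,)
    m = (m - m.min()) / (m.max() - m.min() + 1e-8)    # normalize to [0, 1]
    return (m > threshold).float()                     # binary spatial mask
```

Since the attention comes from the source prompt/image pair, the mask is only as reliable as the token's alignment with the object, which is exactly the weak link in the I2I setting described above.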

@ljzycmd
Collaborator

ljzycmd commented Aug 25, 2023

Hi @firekeepers, note that the noise obtained by DDIM inversion can sometimes fail to reconstruct the source image regardless of mask guidance. Therefore, the failure should not be attributed to the mask extracted from the corresponding cross-attention maps.
