Question about Mask #31
Comments
Hi @firekeepers, the code for mask-guided mutual self-attention can be found at Line 114 in 2a7861d and Line 196 in 2a7861d.
Note that the mask is used to restrict the query regions during the mutual self-attention process, rather than to guide the cross-attention maps. You can either supply externally extracted masks with the MutualSelfAttentionControlMask editor, or have the masks extracted automatically from the cross-attention maps with the MutualSelfAttentionControlMaskAuto editor.
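To make the idea above concrete, here is a minimal NumPy sketch, not the repository's code. It assumes flattened spatial positions, a single attention head, and two hypothetical helpers: `mask_from_cross_attention` (min-max normalization followed by a threshold, which is only one plausible way to binarize a cross-attention map) and `masked_mutual_self_attention` (target queries attend to source keys/values, with foreground and background handled separately). The threshold value and all names are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; -inf entries get zero weight.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mask_from_cross_attention(attn_map, threshold=0.5):
    # Hypothetical helper: binarize one token's cross-attention map.
    # attn_map: (N,) attention weights over N flattened spatial positions.
    lo, hi = attn_map.min(), attn_map.max()
    normalized = (attn_map - lo) / (hi - lo)
    return (normalized >= threshold).astype(np.float64)

def masked_mutual_self_attention(q_tgt, k_src, v_src, mask_src, mask_tgt):
    # q_tgt: (N, d) queries from the target branch.
    # k_src, v_src: (N, d) keys/values from the source branch.
    # mask_src, mask_tgt: (N,) binary foreground masks.
    # Assumes each mask has at least one foreground and one background position.
    d = q_tgt.shape[-1]
    sim = q_tgt @ k_src.T / np.sqrt(d)                 # (N, N) similarities
    src_fg = mask_src.astype(bool)
    # Foreground pass sees only source foreground keys, background pass the rest.
    sim_fg = np.where(src_fg[None, :], sim, -np.inf)
    sim_bg = np.where(~src_fg[None, :], sim, -np.inf)
    out_fg = softmax(sim_fg) @ v_src
    out_bg = softmax(sim_bg) @ v_src
    # Each target query position picks the pass that matches its own region.
    m = mask_tgt.astype(np.float64)[:, None]
    return m * out_fg + (1.0 - m) * out_bg

# Tiny demo: 4 positions, 2 channels.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 2))
k = rng.normal(size=(4, 2))
v = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
mask_src = np.array([1.0, 1.0, 0.0, 0.0])
mask_tgt = mask_from_cross_attention(np.array([1.0, 0.0, 0.9, 0.2]))
out = masked_mutual_self_attention(q, k, v, mask_src, mask_tgt)
```

In this toy setup, foreground target positions can only collect source foreground values and background positions only source background values, which is the sense in which the mask restricts query regions rather than reshaping the cross-attention maps.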
Thanks for your reply. I have another question: how can I use this cross-attention mask to guide img2img generation? I ran some img2img experiments with mask guidance but found only a weak correlation between the source image and the generated image, and the results are terrible [image], no matter which prompt is used to guide the DDIM inversion process.
Hi @firekeepers, the failure occurs because the initial noise obtained with DDIM inversion cannot faithfully reconstruct the source image in some cases. You may refer to #30 (comment) for a more detailed explanation.
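One commonly cited reason for this is that DDIM inversion reuses the noise predicted at the current step as a stand-in for the noise at the next, noisier step. The toy sketch below illustrates that effect with NumPy. It is not Stable Diffusion or this repository's pipeline: `eps_model`, the `alpha_bar` schedule, and the helper names are all assumptions made for the illustration.

```python
import numpy as np

def ddim_step(x, eps, a_from, a_to):
    # Deterministic DDIM update between two cumulative-alpha levels.
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps

def round_trip_error(eps_model, x0, alpha_bar):
    # Invert x0 to noise, sample back, and report the max absolute error.
    x = x0.copy()
    T = len(alpha_bar) - 1
    # Inversion (t -> t+1): noise is predicted at the current, cleaner state.
    for t in range(T):
        x = ddim_step(x, eps_model(x, t), alpha_bar[t], alpha_bar[t + 1])
    # Sampling (t -> t-1): noise is predicted at the current, noisier state.
    for t in range(T, 0, -1):
        x = ddim_step(x, eps_model(x, t), alpha_bar[t], alpha_bar[t - 1])
    return float(np.max(np.abs(x - x0)))

alpha_bar = np.linspace(1.0, 0.1, 11)   # toy schedule, index 0 = clean image
x0 = np.array([0.5, -1.0, 2.0])

# Noise prediction independent of the state: the round trip is exact.
err_constant = round_trip_error(lambda x, t: np.full_like(x, 0.3), x0, alpha_bar)
# Noise prediction that depends on the state: the round trip drifts.
err_dependent = round_trip_error(lambda x, t: np.sin(x), x0, alpha_bar)
```

With a state-independent noise prediction the inversion and sampling steps cancel exactly, while a state-dependent prediction leaves a residual. A real denoiser is state-dependent, so some reconstruction error is expected whether or not a mask is used.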
If I understand correctly, the cross-attention mask is computed from the source prompt and the source generated image, so that the target image generated from a slightly modified target prompt keeps some relationship with the source input and can be guided by the cross-attention mask?
Hi @firekeepers, note that the noise obtained by DDIM inversion sometimes fails to reconstruct the source image, with or without mask guidance. Therefore, the failure cannot be attributed to the mask extracted from the corresponding cross-attention maps.
Thank you for sharing your work!
I wonder how the code extracts the mask and uses it to guide the cross-attention maps.
I checked the demo code but could not find the related code.