Why use the extracted clean content feature for cross-attention with the U-Net model? #80

Open
HetaoAOzi opened this issue Oct 31, 2024 · 1 comment

Comments

@HetaoAOzi

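Hello authors, your work is very creative and meaningful. I understand why the paper uses DA-CLIP to extract two kinds of features in order to build a more general image restoration model. What I would like to know is why the clean content feature extracted by da_clip is used for cross-attention with the U-Net. From the code, the U-Net is used to predict the noise, and injecting clean content feature information while predicting noise seems odd to me. Intuitively, shouldn't the cross-attention be done with the degra_feature extracted by da_clip instead?

I also found that when I compute the cosine similarity between the clean content features that da_clip extracts from a clean image and from its noisy counterpart, the similarity decreases as the noise strength increases. Does this show that the clean content feature extracted by da_clip still contains some noise, and that this is exactly what produces the improvement in restoration results that the paper reports for cross-attention between the clean content feature and the U-Net?

In addition, I would like to know the reasoning behind combining the degra_feature extracted by da_clip with time_embedding. The paper does not explain in detail why these two features are added in this way. I hope the authors can find time to reply. Thank you very much!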
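The similarity check described above can be sketched as follows. This is only an illustration under stated assumptions: it does not load DA-CLIP, and the helper `similarity_vs_noise` is hypothetical. It perturbs a feature vector directly with Gaussian noise, whereas the experiment in the question adds noise to the image and passes both images through the da_clip image encoder.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_vs_noise(clean_feature, noise_levels, seed=0):
    """Hypothetical stand-in for the experiment in the question.

    Assumption: noise is added to the feature vector itself. In the real
    experiment the noise is added to the image, and the clean and noisy
    images are both encoded by DA-CLIP before comparison.
    """
    rng = random.Random(seed)
    results = []
    for sigma in noise_levels:
        noisy = [x + rng.gauss(0.0, sigma) for x in clean_feature]
        results.append((sigma, cosine_similarity(clean_feature, noisy)))
    return results


if __name__ == "__main__":
    rng = random.Random(42)
    feature = [rng.gauss(0.0, 1.0) for _ in range(512)]  # placeholder feature
    for sigma, sim in similarity_vs_noise(feature, [0.0, 0.1, 0.5, 1.0]):
        print(f"sigma={sigma:.1f}  cosine similarity={sim:.3f}")
```

To reproduce the actual observation, `clean_feature` and `noisy` would be replaced by the content embeddings that da_clip returns for the clean and noisy images.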

HetaoAOzi reopened this Oct 31, 2024
@Algolzw
Owner

Algolzw commented Oct 31, 2024

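Hello, in image restoration the noise being predicted is actually the noise added to the clean image; the LQ image only serves as the condition, which is why the clean content feature is used. In addition, degra_feature can be regarded as a label embedding. For how it is combined with time_embedding, see guided diffusion: https://github.com/openai/improved-diffusion/blob/1bc7bbbdc414d83d4abf2ad8cc1446dc36c4e4d5/improved_diffusion/unet.py#L510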
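The label-embedding idea in this reply can be sketched roughly as below: a conditioning vector is added element-wise to the timestep embedding, and the sum is what the U-Net blocks receive. This is a simplified illustration, not the repository's code. The function `condition_embedding` is a hypothetical name, the learned MLP that the referenced U-Net applies to the timestep embedding is omitted, and it is assumed that degra_feature has already been projected to the same dimension as the timestep embedding.

```python
import math


def timestep_embedding(t, dim, max_period=10000):
    """Sinusoidal timestep embedding: cosine terms followed by sine terms.

    `dim` is assumed to be even.
    """
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return [math.cos(t * f) for f in freqs] + [math.sin(t * f) for f in freqs]


def condition_embedding(t, degra_embedding):
    """Add a label-style embedding to the timestep embedding, element-wise.

    Assumption: `degra_embedding` stands in for the (already projected)
    degradation feature; learned layers are left out of this sketch.
    """
    t_emb = timestep_embedding(t, len(degra_embedding))
    return [a + b for a, b in zip(t_emb, degra_embedding)]


if __name__ == "__main__":
    degra = [0.5, -0.5, 0.25, 0.0]  # placeholder degradation embedding
    print(condition_embedding(0, degra))
```

Because the two vectors are summed, the degradation information reaches every block that already consumes the timestep embedding, in the same way a class label does in class-conditional diffusion models.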
