Image-related papers that do not target a specific task, such as filtering and new objective functions.
- General DL Algorithms
- General Traditional Algorithms
- Image Processing on Device
- Image Quality Evaluators
- Visual Large Model
-
MAXIM: Multi-Axis MLP for Image Processing
Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li
[CVPR 2022 Oral] [Pytorch-Code]
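🔥 Proposes a multi-axis gated MLP block that extracts local and global information in parallel, with computational complexity linear in image size. Uses a UNet with a progressive design. Performs well on a wide range of image processing tasks. -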
Procedural Kernel Networks
Bartlomiej Wronski
[arXiv 2112]
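[PKN] [★★] 1. Proposes a framework in which a lightweight CNN predicts local parameters on a low-resolution image; the predicted parameters then drive classic image processing operators such as upscaling, denoising, and deblurring. Local parameters usually give better results than a single set of global parameters. 2. A major advantage is that the computational cost is decoupled from image resolution, which makes the method mobile-friendly. 3. It can be combined with various classic operators, so the overall algorithm is more interpretable and tunable; this is especially suitable for algorithms already baked into hardware. 4. Drawbacks: the upper bound cannot exceed the performance of oracle optimization, the metrics fall short of SOTA on many tasks, and experiments so far cover only differentiable operators. -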
Revisiting Global Statistics Aggregation for Improving Image Restoration
Xiaojie Chu, Liangyu Chen, Chengpeng Chen, Xin Lu
[arXiv 2112] [Pytorch-Code]
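[TLSC] [★] Training typically uses patches while testing uses full images, which causes a statistics mismatch for global operations such as SE and IN. The paper proposes replacing global aggregation with local aggregation. -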
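The train/test mismatch described in the TLSC note can be illustrated with a toy 1D signal. This is only a sketch of the idea, not the paper's implementation: `global_mean`, `local_means`, and the window size are hypothetical names and values chosen for illustration.

```python
def global_mean(x):
    """Global average pooling over a 1D signal (the kind of statistic SE-style blocks use)."""
    return sum(x) / len(x)


def local_means(x, window):
    """Average over a sliding window centred at each position, clipped at the borders."""
    half = window // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


# A "full image" whose two halves have very different statistics.
signal = [0.0] * 8 + [10.0] * 8
full_image_stat = global_mean(signal)      # one value for the whole signal
patch_sized_stats = local_means(signal, 5)  # one value per position, from a patch-sized region
```

With a window close to the training patch size, each position sees statistics over a region comparable to what the network saw during training, instead of one average over the whole test image.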
IICNet - Invertible Image Conversion Net
Ka Leong Cheng, Yueqi Xie, Qifeng Chen
[ICCV 2021] [Pytorch-Code] -
Focal Frequency Loss for Image Reconstruction and Synthesis
Liming Jiang, Bo Dai, Wayne Wu, Chen Change Loy
[ICCV 2021] [Project] [Pytorch-Code]
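[★] Proposes a loss computed as the Euclidean distance in frequency space, plus a weight map that simply assigns larger weights to larger distances and smaller weights to smaller distances. -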
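A minimal sketch of the frequency-distance-plus-weight-map idea from the note above, using a naive 2D DFT on tiny grayscale images. It is not the official implementation: the function names, the `alpha` exponent, and the normalization of the weights by their maximum are assumptions made for illustration.

```python
import cmath


def dft2(img):
    """Naive 2D DFT of a small grayscale image given as a list of rows."""
    h, w = len(img), len(img[0])
    out = []
    for u in range(h):
        row = []
        for v in range(w):
            s = 0j
            for y in range(h):
                for x in range(w):
                    s += img[y][x] * cmath.exp(-2j * cmath.pi * (u * y / h + v * x / w))
            row.append(s)
        out.append(row)
    return out


def focal_frequency_loss(pred, target, alpha=1.0):
    """Mean of weight * squared spectral distance, where the weight grows with the distance."""
    fp, ft = dft2(pred), dft2(target)
    dist = [abs(a - b) for ra, rb in zip(fp, ft) for a, b in zip(ra, rb)]
    max_d = max(dist)
    if max_d == 0.0:
        return 0.0
    # Assumed weighting: distance ** alpha, scaled into [0, 1].
    weights = [(d / max_d) ** alpha for d in dist]
    return sum(w * d * d for w, d in zip(weights, dist)) / len(dist)
```

Frequencies where prediction and target already agree get a weight near zero, so the loss concentrates on the frequencies that are still badly reconstructed.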
Projected Distribution Loss for Image Enhancement
Mauricio Delbracio, Hossein Talebi, Peyman Milanfar
[arXiv 2012] [Project] [Unofficial-Pytorch-Code]
[PDL Loss] [★] Proposes a 1D Wasserstein distance loss that improves visual quality. -
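For two equally sized sets of scalar samples, the 1D Wasserstein-1 distance reduces to comparing the sorted values. The sketch below shows only that building block; which features the loss is applied to is not covered by the note, and `wasserstein_1d` is a hypothetical name.

```python
def wasserstein_1d(a, b):
    """1D Wasserstein-1 distance between two equally sized sets of scalar samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes both sample sets have the same size")
    # Sorting matches the i-th smallest value of one set with the i-th smallest of the other.
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)
```

Because the samples are sorted first, the distance depends only on the two value distributions and not on where each value sits spatially.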
Positional Encoding as Spatial Inductive Bias in GANs
Rui Xu, Xintao Wang, Kai Chen, Bolei Zhou, Chen Change Loy
[CVPR 2021] [Project]
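[MS-PIE] [★☆] A very interesting paper that I have not fully understood; worth a careful reread after going through the related papers. Zero padding gives image generation an implicit positional encoding, but the strength of this signal differs between the image borders and the center, so it is not an optimal positional encoding scheme. The paper proposes a sinusoidal positional encoding, so that everywhere in the image the relationship between two pixels depends only on their distance and does not change with image scale. Two questions after a quick read: 1. The result of a convolution depends on the relative positions of pixels, so convolution itself already seems to exploit relative position information; how should the benefit of injecting this relative positional encoding be understood? 2. The proposed scheme seems suited to generating objects of a fixed size, such as a 100x100 hot-air balloon; if objects must scale with image size, such as a 1024 face versus a 2048 face, how effective is this relative positional encoding? -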
Mind the Pad -- CNNs Can Develop Blind Spots
Bilal Alsallakh, Narine Kokhlikyan, Vivek Miglani, Jun Yuan, Orion Reblitz-Richardson
[ICLR 2021] [Project]
[★☆] -
How much Position Information Do Convolutional Neural Networks Encode?
Md Amirul Islam, Matthew Kowal, Konstantinos G. Derpanis, Neil Bruce
[ICLR 2020]
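[★☆] Shows that CNNs can use zero padding to encode absolute position information, and that position information helps improve accuracy in segmentation and parsing. -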
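A toy 1D example of how zero padding leaks position: a constant input convolved with a constant kernel should give a constant output, yet the border responses differ from the interior ones. `conv1d_zero_pad` is a hypothetical helper written for this illustration, not code from the paper.

```python
def conv1d_zero_pad(x, kernel):
    """'Same' 1D convolution (cross-correlation) with zero padding at both ends."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel)) for i in range(len(x))]


response = conv1d_zero_pad([1.0] * 5, [1.0, 1.0, 1.0])
print(response)  # [2.0, 3.0, 3.0, 3.0, 2.0]
```

The input carries no spatial information at all, so the lower values at both ends come purely from the padding; stacked layers can propagate this border cue inward.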
On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location
Osman Semih Kayhan, Jan van Gemert
[CVPR 2020] [Pytorch-Code]
[★] -
Invertible Image Rescaling
Mingqing Xiao, Shuxin Zheng, Chang Liu, Yaolong Wang, Di He, Guolin Ke, Jiang Bian, Zhouchen Lin, Tie-Yan Liu
[ECCV 2020] [Pytorch-Code]
[★★] Proposes an image downscaling and restoration method based on wavelets and invertible neural networks (INN). -
xUnit: Learning a Spatial Activation Function for Efficient Image Restoration
Idan Kligvasser, Tamar Rott Shaham, Tomer Michaeli
[CVPR 2018 Spotlight] [Pytorch-Code]
[★] (New module) Proposes an image-adaptive activation operation, somewhat similar to attention. -
CIE XYZ Net: Unprocessing Images for Low-Level Computer Vision Tasks
Mahmoud Afifi, Abdelrahman Abdelhamed, Abdullah Abuolaim, Abhijith Punnappurath, Michael S. Brown
[arXiv 2006] [Matlab & Pytorch-Code]
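[★] Uses a CNN to map sRGB into the device-independent CIE XYZ space, performs denoising, enhancement, and similar processing in this scene-referred space, then uses another CNN to map back to sRGB. -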
Spatially Variant Linear Representation Models for Joint Filtering
Jinshan Pan, Jiangxin Dong, Jimmy S. Ren, Liang Lin, Jinhui Tang, Ming-Hsuan Yang
[CVPR 2019] [Project]
[SVLRM] [★] Skimmed. Uses a CNN to predict the guided filter coefficients A and b. -
Deep Network Interpolation
Xintao Wang, Ke Yu, Chao Dong, Xiaoou Tang, Chen Change Loy
[CVPR 2019] [Project]
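[DNI] [★☆] Skimmed. Interpolates filters to achieve a smooth transition between two or more tasks. To ensure the two sets of filters are strongly correlated, one model serves as the pretrained model for the other. -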
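The interpolation in the DNI note amounts to a per-parameter linear blend of two networks with identical architecture. A minimal sketch, assuming parameters are stored as flat lists keyed by name; `interpolate_params` and the dictionary layout are hypothetical.

```python
def interpolate_params(params_a, params_b, alpha):
    """Blend two parameter sets: alpha * A + (1 - alpha) * B, name by name."""
    if params_a.keys() != params_b.keys():
        raise ValueError("both networks must share the same parameter names")
    return {
        name: [alpha * a + (1.0 - alpha) * b for a, b in zip(params_a[name], params_b[name])]
        for name in params_a
    }


net_a = {"conv.weight": [0.0, 2.0]}   # e.g. a model trained for one effect
net_b = {"conv.weight": [4.0, 2.0]}   # e.g. the same model fine-tuned for another effect
halfway = interpolate_params(net_a, net_b, 0.5)
```

Sweeping `alpha` from 0 to 1 moves the output from one model's behavior to the other's, which only makes sense when the two parameter sets are strongly correlated, as the note points out.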
Self-Guided Network for Fast Image Denoising
Shuhang Gu, Yawei Li, Luc Van Gool, Radu Timofte
[ICCV 2019] [Pytorch-Code]
[SGN] -
Fast Image Restoration with Multi-bin Trainable Linear Units
Shuhang Gu, Wen Li, Luc Van Gool, Radu Timofte
[ICCV 2019] [Pytorch-Code]
[MTLU] [★] -
Fast End-to-End Trainable Guided Filter
Huikai Wu, Shuai Zheng, Junge Zhang, Kaiqi Huang
[CVPR 2018] [Code]
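[★★] A trainable guided filter for joint upsampling. Applicable to a wide range of pixel-level enhancement tasks. Also provides a TensorFlow implementation of the original guided filter. -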
The Perception-Distortion Tradeoff
Yochai Blau, Tomer Michaeli
[CVPR 2018]
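[★★] 1) Skimmed. Shows that in image restoration there is a tradeoff between perception and distortion. The severity of the tradeoff differs across losses; for example, perceptual loss strikes a better balance between perception and distortion than MSE loss. 2) I have not read much of the theory yet; worth a careful read if I work in this direction later. -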
Decouple Learning for Parameterized Image Operators
Qingnan Fan, Dongdong Chen, Lu Yuan, Gang Hua, Nenghai Yu, Baoquan Chen
[ECCV 2018] [PyTorch-Code]
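[★] Quick read. It appears to assign a parameter to each task, use a network that takes the parameter as input to predict per-layer weights, and use those weights as the instance norm weights to normalize each layer. -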
The Contextual Loss for Image Transformation with Non-Aligned Data
Roey Mechrez, Itamar Talmi, Firas Shama, Lihi Zelnik-Manor
[ECCV 2018 Oral] [Project] [Code]
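[★★] Proposes a loss for non-aligned data. It defines the similarity between two pixel-level feature points from the distance between their features (extracted with VGG19), and builds the loss on top of that to handle inputs and ground truth that are not spatially aligned. -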
Loss Functions for Image Restoration with Neural Networks
Hang Zhao, Orazio Gallo, Iuri Frosio, Jan Kautz
[TCI 2017] [Caffe-code]
[PL4NN] [★★] Analyzes the pros and cons of the L1, L2, SSIM, and MS-SSIM losses for image restoration. -
Fast Image Processing with Fully-Convolutional Networks
Qifeng Chen, Jia Xu, Vladlen Koltun
[ICCV 2017] [Project]
[★★] One of the earlier papers to use CNNs for image filtering and enhancement; uses dilated convolutions to capture global information. -
Deep Joint Image Filtering
Yijun Li, Jia-Bin Huang, Narendra Ahuja, Ming-Hsuan Yang
[ECCV 2016] [Project] [Code] -
Learning Recursive Filters for Low-Level Vision via a Hybrid Neural Network
Sifei Liu, Jinshan Pan, Ming-Hsuan Yang
[ECCV 2016 Oral] [Project] [Code]
-
BBAND Index: A No-Reference Banding Artifact Predictor
Zhengzhong Tu, Jessie Lin, Yilin Wang, Balu Adsumilli, Alan C. Bovik
[ICASSP 2020]
[★] A video quality assessment metric that measures the severity of banding. -
Blade: Filter learning for general purpose computational photography
Pascal Getreuer, Ignacio Garcia-Dorado, John Isidoro, Sungjoon Choi, Frank Ong, Peyman Milanfar
[ICCP 2018]
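[★☆] Similar to RAISR: trains multiple filters organized into a hash table. At test time, features are extracted and quantized for each pixel and used as the hash key to look up the best filter. -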
Misalignment-Robust Joint Filter for Cross-Modal Image Pairs
Takashi Shibata, Masayuki Tanaka, Masatoshi Okutomi
[ICCV 2017]
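[★☆] 1) Proposes a joint filtering method for misaligned multi-modal data. It can be combined with strong filters such as the guided filter and achieves good results on misaligned data from different sources. 2) The idea is essentially to compute a cost volume and take a weighted sum over it. The best variant works roughly as follows: shift the guide image up, down, left, and right to form k shifted guide images, then (1) compute the distance (NCC, etc.) between the target and each of the k guides to form a cost volume; (2) derive a weight volume from the cost volume and refine it by minimizing an energy function; (3) filter the target with each of the k shifted guides; (4) average the k filtered outputs, weighted by the weight volume, to produce the final output. 3) Judging from the paper, the method filters misaligned multi-modal data well and can serve as a reference when designing DL solutions. 4) Limitations: (1) the weight volume optimization is too slow; (2) the accuracy of the cost volume still depends on the distance measure, and existing measures such as NCC cannot fully solve similarity measurement for multi-modal data. -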
Bilateral Guided Upsampling
Jiawen Chen, Andrew Adams, Neal Wadhwa, Samuel W. Hasinoff
[SIGGRAPH Asia 2016] [Matlab-Code]
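[BGU] [★★☆] Inspired by the Guided Filter. Assumes that within a small local region the mapping from the input to its processed output can be approximated by a linear map, and that in bilateral space the mapping coefficients of neighboring cells should be smooth. This leads to an objective function with a data term and a smoothness term, solvable by least squares; a fast approximate version is also proposed. I did not understand the optimization part. -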
Reproduction Angular Error: An Improved Performance Metric for Illuminant Estimation
Graham Finlayson, Roshanak Zakizadeh
[BMVC 2014]
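[★] 1) Proposes a metric for evaluating illuminant estimation that is independent of the color temperature of the light source. Skimmed; I did not follow some of the principles. 2) Google later improved on it and used it as a loss for training a low-light AWB model. -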
Local Laplacian Filters: Edge-aware Image Processing with a Laplacian Pyramid
Sylvain Paris, Samuel W. Hasinoff, Jan Kautz
[SIGGRAPH 2011] [Project]
Related [Fast Local Laplacian Filters]
[LLF] [★★★] The well-known LLF. -
Guided Image Filtering
Kaiming He, Jian Sun, Xiaoou Tang
[ECCV 2010] [Project] [TF/Pytorch-Code]
[★★★] The famous guided filter. Usable for denoising, fusion, joint upsampling, matting, image enhancement, and many other tasks. Fast and effective. -
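A 1D sketch of the guided filter's local linear model (output = a * guide + b inside each window), written with plain lists for readability. The window radius `r`, the regularizer `eps`, and the border handling in `box_mean` are assumptions made for this illustration; real implementations work on 2D images with fast box filters.

```python
def box_mean(x, r):
    """Mean over the window [i - r, i + r], clipped at the signal borders."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def guided_filter_1d(guide, src, r=2, eps=1e-3):
    """Filter `src` so that its edges follow the edges of `guide`."""
    mean_i = box_mean(guide, r)
    mean_p = box_mean(src, r)
    corr_ip = box_mean([i * p for i, p in zip(guide, src)], r)
    corr_ii = box_mean([i * i for i in guide], r)
    # Per-window linear coefficients: a = cov(I, p) / (var(I) + eps), b = mean(p) - a * mean(I).
    a = [(cip - mi * mp) / (cii - mi * mi + eps)
         for cip, cii, mi, mp in zip(corr_ip, corr_ii, mean_i, mean_p)]
    b = [mp - ak * mi for ak, mi, mp in zip(a, mean_i, mean_p)]
    # Average the coefficients of all windows covering each sample, then apply them.
    mean_a, mean_b = box_mean(a, r), box_mean(b, r)
    return [ma * i + mb for ma, mb, i in zip(mean_a, mean_b, guide)]
```

Passing the same signal as `guide` and `src` gives edge-preserving smoothing; passing a different guide transfers its structure to the filtered output.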
Joint Bilateral Upsampling
Johannes Kopf, Michael F. Cohen, Dani Lischinski, Matt Uyttendaele
[SIGGRAPH 2007] [Project]
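[JBU] [★★] Applies the ideas of the guided filter and the bilateral filter to upsampling. A desired transform (style transfer, colorization, etc.) is computed on a low-resolution image; when restoring the full-size result, the spatial weights are computed on the low-resolution image and the range weights on the original-resolution image. -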
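The split between spatial weights (low-resolution coordinates) and range weights (full-resolution guide) can be sketched in 1D as follows. The Gaussian kernels, the default sigmas, and the nearest-sample lookup into the guide are assumptions for this illustration rather than details taken from the paper.

```python
import math


def joint_bilateral_upsample_1d(low_res, guide, sigma_s=1.0, sigma_r=0.1, radius=2):
    """Upsample `low_res` to len(guide) samples, steered by the full-resolution guide."""
    scale = len(guide) / len(low_res)
    out = []
    for p in range(len(guide)):
        p_low = p / scale                     # position of p on the low-res grid
        centre = int(round(p_low))
        num = den = 0.0
        for q_low in range(centre - radius, centre + radius + 1):
            if not 0 <= q_low < len(low_res):
                continue
            q = min(len(guide) - 1, int(round(q_low * scale)))  # matching full-res sample
            # Spatial weight: measured in low-resolution coordinates.
            w_s = math.exp(-((p_low - q_low) ** 2) / (2 * sigma_s ** 2))
            # Range weight: measured on the full-resolution guide.
            w_r = math.exp(-((guide[p] - guide[q]) ** 2) / (2 * sigma_r ** 2))
            num += w_s * w_r * low_res[q_low]
            den += w_s * w_r
        out.append(num / den)
    return out


guide = [0.0] * 4 + [1.0] * 4            # full-resolution signal with a sharp edge
upsampled = joint_bilateral_upsample_1d([0.0, 0.0, 1.0, 1.0], guide)
```

Because the range weight is taken from the full-resolution guide, the upsampled result keeps the edge sharp instead of blurring it as plain interpolation would.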
Real-time edge-aware image processing with the bilateral grid
Jiawen Chen, Sylvain Paris, Frédo Durand
[SIGGRAPH 2007] [Project]
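[Bilateral Grid] [★★] Discretizes the idea of lifting the image into 3D space onto a grid, introducing the bilateral grid data structure. With optimized code it runs in real time on the GPU and performs well on a variety of tasks. -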
A Fast Approximation of the Bilateral Filter using a Signal Processing Approach
Sylvain Paris, Frédo Durand
[ECCV 2006] [Code] [A Good Blog]
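[★★] An acceleration method for the bilateral filter. Treats the intensity of the 2D image as an extra dimension, turning the original nonlinear filtering into a linear convolution in 3D space. Moreover, Gaussian convolution is a low-pass operation, so the 3D grid can be downsampled with little loss of accuracy; running the 3D convolution at low resolution gives a large speedup. -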
Fast bilateral filtering for the display of high-dynamic-range images
Frédo Durand, Julie Dorsey
[SIGGRAPH 2002] [Project] [Code]
-
SYENet: A Simple Yet Effective Network for Multiple Low-Level Vision Tasks with Real-time Performance on Mobile Device
Weiran Gou, Ziyao Yi, Yan Xiang, Shaoqing Li, Zibin Liu, Dehui Kong, Ke Xu
[ICCV 2023] [Pytorch-Code]
Qualcomm -
Collapsible Linear Blocks for Super-Efficient Super Resolution
Kartikeya Bhardwaj, Milos Milosavljevic, Alex Chalfin, Naveen Suda, Liam O'Neil, Dibakar Gope, Lingchuan Meng, Ramon Matas, Danny Loh
[arXiv 2103]
[SESR] [★☆] (Overparameterization) Decomposes a conv into a wider 3x3 conv, a 1x1 conv, and a shortcut connection. -
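Since no nonlinearity sits between the wide 3x3 conv and the 1x1 conv, the block can be folded into a single 3x3 kernel for inference. The sketch below checks this numerically at one output position; the tensor layouts, the channel counts, and the helper names `collapse` and `conv_at_centre` are assumptions made for illustration.

```python
import random


def collapse(w3, w1, add_shortcut=True):
    """Fold a 3x3 conv (mid x cin x 3 x 3) and a 1x1 conv (cout x mid) into one 3x3 kernel."""
    cout, mid, cin = len(w1), len(w3), len(w3[0])
    wc = [[[[sum(w1[o][m] * w3[m][i][u][v] for m in range(mid))
             for v in range(3)] for u in range(3)]
           for i in range(cin)] for o in range(cout)]
    if add_shortcut:
        # An identity shortcut equals +1 on the centre tap (requires cin == cout).
        for o in range(cout):
            wc[o][o][1][1] += 1.0
    return wc


def conv_at_centre(w, patch):
    """Output of a 3x3 conv at one position, given the cin x 3 x 3 input patch."""
    return [sum(w[o][i][u][v] * patch[i][u][v]
                for i in range(len(patch)) for u in range(3) for v in range(3))
            for o in range(len(w))]


random.seed(0)
cin, cout, mid = 2, 2, 8


def rnd():
    return random.uniform(-1.0, 1.0)


w3 = [[[[rnd() for _ in range(3)] for _ in range(3)] for _ in range(cin)] for _ in range(mid)]
w1 = [[rnd() for _ in range(mid)] for _ in range(cout)]
patch = [[[rnd() for _ in range(3)] for _ in range(3)] for _ in range(cin)]

hidden = conv_at_centre(w3, patch)                       # wide 3x3 conv
expanded = [sum(w1[o][m] * hidden[m] for m in range(mid)) + patch[o][1][1]
            for o in range(cout)]                        # 1x1 conv plus shortcut
collapsed = conv_at_centre(collapse(w3, w1), patch)      # single folded 3x3 conv
```

The expanded form is used during training, where the extra parameters help optimization; the folded kernel produces the same output at inference with the cost of one narrow 3x3 conv.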
GhostSR: Learning Ghost Features for Efficient Image Super-Resolution
Ying Nie, Kai Han, Zhenhua Liu, An Xiao, Yiping Deng, Chunjing Xu, Yunhe Wang
[arXiv 2101]
[★☆] (Lightweight SR) Uses the idea of pixel shift for super-resolution. -
SplitSR: An End-to-End Approach to Super-Resolution on Mobile Devices
Xin Liu, Yuang Li, Josh Fromm, Yuntao Wang, Ziheng Jiang, Alex Mariakakis, Shwetak Patel
[arXiv 2101] [Unofficial-Pytorch-Code]
[★☆] (Lightweight SR) Proposes a lightweight residual block structure: SplitSRBlock.
-
PSNR
[Wiki]
A common image quality metric. It is determined by the MSE and correlates only weakly with perceived visual quality. -
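PSNR follows directly from the MSE as 10 * log10(MAX^2 / MSE). A minimal sketch on flattened pixel lists, assuming 8-bit images (`max_val=255`); the function name and signature are illustrative.

```python
import math


def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(ref) != len(test):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)
```

Two very different distortions with the same MSE get the same PSNR, which is why the metric tracks perceived quality only loosely.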
SSIM
(TIP 2004) Image quality assessment: from error visibility to structural similarity
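A common image quality metric with three components: luminance, contrast, and structure. It correlates only weakly with perceived visual quality. -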
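The three components combine into one expression. The sketch below evaluates it once over the whole signal, which is a simplification: the standard metric computes it in local (typically Gaussian-weighted) windows and averages the results. The constants `k1=0.01` and `k2=0.03` are the commonly used defaults; the function name is hypothetical.

```python
def global_ssim(x, y, max_val=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized pixel lists."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1, c2 = (k1 * max_val) ** 2, (k2 * max_val) ** 2   # stabilizers for small denominators
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast_structure = (2 * cov + c2) / (var_x + var_y + c2)
    return luminance * contrast_structure
```

Identical inputs score 1, and inputs with matching means but inverted structure score below 0, which PSNR cannot distinguish from other errors of the same magnitude.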
NIQE
(SPL 2012) Making a “Completely Blind” Image Quality Analyzer
A no-reference image quality assessment algorithm, widely adopted in tasks such as super-resolution and denoising. -
LPIPS
(CVPR 2018) The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
[Project]
Measures image similarity with network features; reflects visual quality better. -
Inception Score
(arXiv 1606) Improved Techniques for Training GANs
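Evaluates the sharpness and diversity of generated images. A batch of generated images is fed into the Inception network, and the average KL divergence between P(y|x) and P(y) is computed; a higher score indicates higher-quality generated images. -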
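Given the class probabilities P(y|x) for each generated image, the score is the exponential of the average KL divergence to the marginal P(y). The sketch below takes those probabilities as plain lists and leaves out the Inception network itself; `inception_score` is a hypothetical helper name.

```python
import math


def inception_score(probs):
    """probs: one class-probability vector p(y|x) per generated image."""
    n, k = len(probs), len(probs[0])
    marginal = [sum(p[c] for p in probs) / n for c in range(k)]   # p(y)
    kl_sum = 0.0
    for p in probs:
        # KL(p(y|x) || p(y)); terms with p(y|x) = 0 contribute nothing.
        kl_sum += sum(pc * math.log(pc / marginal[c]) for c, pc in enumerate(p) if pc > 0)
    return math.exp(kl_sum / n)
```

Confident per-image predictions (sharp images) spread evenly across classes (diverse images) maximize the score, up to the number of classes; identical uncertain predictions give the minimum of 1.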
FID (Frechet Inception Distance)
(arXiv 1706) GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium
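Evaluates the difference between generated and real images. Both sets are fed into Inception V3, the activations are treated as multivariate Gaussian, and statistics such as the mean and covariance are computed to measure the similarity of the two distributions. A lower FID means the images are more similar.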
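For reference, the distance between the two Gaussians is usually written as below. The symbols are introduced here for illustration: $\mu_r, \Sigma_r$ are the mean and covariance of the real-image activations and $\mu_g, \Sigma_g$ those of the generated-image activations.

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right)
$$

The first term compares the means and the trace term compares the covariances; both vanish when the two sets of statistics coincide.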
-
CogVLM: Visual Expert for Pretrained Language Models
Weihan Wang, Qingsong Lv, Wenmeng Yu, Wenyi Hong, Ji Qi, Yan Wang, Junhui Ji, Zhuoyi Yang, Lei Zhao, Xixuan Song, Jiazheng Xu, Bin Xu, Juanzi Li, Yuxiao Dong, Ming Ding, Jie Tang
[arXiv 2311] [Pytorch-Code]
[CogVLM] 🔥 -
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling
Keyu Tian, Yi Jiang, Qishuai Diao, Chen Lin, Liwei Wang, Zehuan Yuan
[ICLR 2023 Spotlight] [Pytorch-Code]
[SparK] 🔥 -
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer
[CVPR 2022 Oral] [Project] [Pytorch-Code]
[Stable Diffusion] 🔥 -
More Control for Free! Image Synthesis with Semantic Diffusion Guidance
Xihui Liu, Dong Huk Park, Samaneh Azadi, Gong Zhang, Arman Chopikyan, Yuxiao Hu, Humphrey Shi, Anna Rohrbach, Trevor Darrell
[WACV 2023] [Project] [Pytorch-Code]
[Classifier Free Guidance] -
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal, Alex Nichol
[NeurIPS 2021] [Pytorch-Code]
[Classifier Guidance] -
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever
[ICML 2021] [Pytorch-Code]
[CLIP] 🔥