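- A masked, adaptive attention mechanism builds reliable unsupervised, cross-modal semantic correspondence for exemplar-guided image translation, improving the matching between regions of the content image and regions of the exemplar image;
- Joint quality-style contrastive learning yields high-quality style representations, which are used for global style modulation;
- Generation quality improves markedly on tasks such as oil painting, traditional Chinese painting, and virtual try-on.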
We present a novel framework for exemplar based image translation. Recent advanced methods for this task mainly focus on establishing cross-domain semantic correspondence, which sequentially dominates image generation in the manner of local style control. Unfortunately, cross-domain semantic matching is challenging, and matching errors ultimately degrade the quality of generated images. To overcome this challenge, we improve the accuracy of matching on the one hand, and diminish the role of matching in image generation on the other hand. To achieve the former, we propose a masked and adaptive transformer (MAT) for learning accurate cross-domain correspondence and executing context-aware feature augmentation. To achieve the latter, we use source features of the input and global style codes of the exemplar as supplementary information for decoding an image. Besides, we devise a novel contrastive style learning method for acquiring quality-discriminative style representations, which in turn benefit high-quality image generation.
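The idea of masking unreliable matches can be illustrated with a toy sketch. This is not the authors' MAT implementation: the cosine similarity, the hard threshold, the softmax temperature, and the names `masked_warp` and `threshold` are all assumptions made for illustration. Positions whose best match in the exemplar is weak keep their own source feature instead of a warped exemplar feature.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def softmax(scores, temperature=0.1):
    """Numerically stable softmax; the temperature is an assumed value."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def masked_warp(source, exemplar, threshold=0.5):
    """Warp exemplar features onto source positions, masking weak matches.

    source, exemplar: lists of feature vectors, one per spatial position.
    threshold: hypothetical cut-off on the best cosine similarity.
    Returns (features, mask); mask[i] is True where a match was accepted.
    """
    dim = len(exemplar[0])
    features, mask = [], []
    for query in source:
        sims = [cosine(query, key) for key in exemplar]
        reliable = max(sims) >= threshold
        if reliable:
            # Attention-weighted average of exemplar features.
            weights = softmax(sims)
            warped = [sum(w * key[d] for w, key in zip(weights, exemplar))
                      for d in range(dim)]
        else:
            # Unreliable match: fall back to the source feature itself.
            warped = list(query)
        features.append(warped)
        mask.append(reliable)
    return features, mask
```

Calling `masked_warp(source, exemplar)` returns the blended features together with the boolean mask, so a decoder could treat matched and unmatched positions differently.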
Chang Jiang, Fei Gao*, Biao Ma, Yuhao Lin, Nannan Wang, Gang Xu, "Masked and Adaptive Transformer for Exemplar Based Image Translation," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 22418-22427.
- Some results:
- More Results:
We offer more results here: Google Drive
- Linux or macOS
- Python 3.8
- PyTorch 1.8
- CPU or NVIDIA GPU + CUDA CuDNN
- Clone this repo:
git clone https://github.com/AiArt-HDU/MATEBIT
cd MATEBIT
- VGG model for computing loss: download it from here and move it to models/
- For the preparation of datasets, please refer to CocosNet.
- The pre-trained model needs to be saved at ./checkpoint
- Dataset: download from here.
- Retrieval_pairs: same as Celebahq (edge-to-face).
- Train_Val split: same as Celebahq (edge-to-face).
-
Run the following commands (training first, then testing). Note that root_path, the value passed to --dataroot, is your celebahq root, e.g. /data/Dataset/CelebAMask-HQ.
python train.py --name celebahqedge --dataset_mode celebahqedge --PONO --PONO_C --amp --batchSize 4 --netG dynast --load_size 286 --crop_size 256 --dataroot root_path --contrastive_weight 100.0 --label_nc 15 --niter 30 --niter_decay 30 --gpu_ids 0 --use_atten --vgg_normal_correct --style_weight 0.1 --weight_warp_self 1000.0 --weight_perceptual 0.001 --vgg_path vgg/vgg19_conv.pth --continue_train
python test.py --name celebahqedge --dataset_mode celebahqedge --PONO --PONO_C --amp --batchSize 4 --netG dynast --load_size 256 --crop_size 256 --dataroot root_path --no_flip --which_epoch latest --save_per_img
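The commands above leave root_path as a placeholder for --dataroot. As a minimal sketch, the placeholder could be filled in from Python before launching the script. The helper name `fill_dataroot` is hypothetical and not part of this repository; the flags are copied verbatim from the test command above.

```python
import shlex
import sys

# Test command from this README, with root_path still a placeholder.
TEST_TEMPLATE = (
    "test.py --name celebahqedge --dataset_mode celebahqedge --PONO --PONO_C "
    "--amp --batchSize 4 --netG dynast --load_size 256 --crop_size 256 "
    "--dataroot root_path --no_flip --which_epoch latest --save_per_img"
)


def fill_dataroot(template, dataroot):
    """Return an argv list with the value after --dataroot replaced.

    Hypothetical helper: it only rewrites the placeholder and prepends the
    current Python interpreter; it does not run anything.
    """
    args = shlex.split(template)
    index = args.index("--dataroot")
    args[index + 1] = dataroot
    return [sys.executable] + args


if __name__ == "__main__":
    cmd = fill_dataroot(TEST_TEMPLATE, "/data/Dataset/CelebAMask-HQ")
    print(" ".join(shlex.quote(part) for part in cmd))
    # To actually launch it from the repository root:
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list keeps dataset paths that contain spaces intact.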
- Dataset: download DeepFashion. We use OpenPose to estimate the poses of DeepFashion images. Download and unzip the OpenPose results, then move the folder pose/ to DeepFashion/
- Retrieval_pairs: download deepfashion_ref.txt, deepfashion_ref_test.txt and deepfashion_self_pair.txt from here, and save or replace them in data/
- Train_Val split: download train.txt and val.txt from here, and save them in DeepFashion/
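Before training on DeepFashion, it may help to confirm that the files listed above are in place. The sketch below only checks for existence. The function name `missing_deepfashion_files` is hypothetical, and the locations of DeepFashion/ and data/ on your machine are passed in as arguments because this README does not fix them.

```python
from pathlib import Path

# Entries this README asks you to place under DeepFashion/.
DATASET_ENTRIES = ("pose", "train.txt", "val.txt")
# Retrieval-pair files this README asks you to place under data/.
PAIR_FILES = (
    "deepfashion_ref.txt",
    "deepfashion_ref_test.txt",
    "deepfashion_self_pair.txt",
)


def missing_deepfashion_files(dataset_root, data_dir):
    """Return the expected paths that do not exist yet (as strings)."""
    dataset_root, data_dir = Path(dataset_root), Path(data_dir)
    expected = [dataset_root / name for name in DATASET_ENTRIES]
    expected += [data_dir / name for name in PAIR_FILES]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    # Example locations; adjust both paths to your setup.
    for path in missing_deepfashion_files("DeepFashion", "data"):
        print("missing:", path)
```

An empty result means every listed file and folder was found; it does not validate their contents.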
python train.py --PONO --PONO_C --no_flip --video_like --vgg_normal_correct --video_like --nThreads 40 --amp --display_winsize 256 --load_size 286 --crop_size 256 --label_nc 3 --batchSize 80 --gpu_ids 0,1,2,3,4,5,6,7 --netG dynast --niter 100 --niter_decay 100 --vgg_path vgg/vgg19_conv.pth --n_layers 3 --use_atten --contrastive_weight 100.0 --style_weight 0.2 --weight_perceptual 0.01 --continue_train --display_freq 5000
python test.py --PONO --PONO_C --no_flip --video_like --vgg_normal_correct --video_like --nThreads 16 --amp --display_winsize 256 --load_size 286 --crop_size 256 --label_nc 3 --batchSize 4 --which_epoch latest --save_per_img
- Download: MetFaces, AAHQ, Ukiyo-e faces
- Brush painting (traditional art painting):
We collected a dataset of 915 traditional Chinese brush paintings with a resolution of 512 from the Internet: Google Drive
Retrieve similar reference images and make the labels; you can then train as with the rest of the datasets.
If you use this code for your research, please cite our paper.
@inproceedings{jiang2023masked,
title={Masked and Adaptive Transformer for Exemplar Based Image Translation},
author={Jiang, Chang and Gao, Fei and Ma, Biao and Lin, Yuhao and Wang, Nannan and Xu, Gang},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={22418--22427},
year={2023}
}
This code borrows heavily from DynaST and MMTN. We also thank the implementation of Synchronized Batch Normalization.