Commit

release 0.1.0

gosha20777 committed Jul 5, 2024
1 parent 8550b88 commit cd00af4

Showing 81 changed files with 8,734 additions and 3 deletions.
9 changes: 8 additions & 1 deletion .gitignore
Expand Up @@ -159,4 +159,11 @@ cython_debug/
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
.idea/

data/
outdir/
predict.py
train.py
pretrain.py
test.ipynb
91 changes: 89 additions & 2 deletions README.md
@@ -1,2 +1,89 @@
# rawformer
Unpaired Raw-to-Raw Translation for Learnable Camera ISPs
# Rawformer: Unpaired Raw-to-Raw Translation for Learnable Camera ISPs

🚀 The paper was accepted to the [ECCV 2024](https://eccv.ecva.net/Conferences/2024) conference. The preprint is available on [arXiv](https://arxiv.org/abs/2404.10700). 🚀

### Authors
Georgy Perevozchikov, Nancy Mehta, Mahmoud Afifi, Radu Timofte

![Rawformer](figures/main.png)

### Abstract
*Modern smartphone camera quality heavily relies on the image signal processor (ISP) to enhance captured raw images, utilizing carefully designed modules to produce final output images encoded in a standard color space (e.g., sRGB). Neural-based end-to-end learnable ISPs offer promising advancements, potentially replacing traditional ISPs with their ability to adapt without requiring extensive tuning for each new camera model, as is often the case for nearly every module in traditional ISPs. However, the key challenge with the recent learning-based ISPs is the urge to collect large paired datasets for each distinct camera model due to the influence of intrinsic camera characteristics on the formation of input raw images. This paper tackles this challenge by introducing a novel method for unpaired learning of raw-to-raw translation across diverse cameras. Specifically, we propose Rawformer, an unsupervised Transformer-based encoder-decoder method for raw-to-raw translation. It accurately maps raw images captured by a certain camera to the target camera, facilitating the generalization of learnable ISPs to new unseen cameras. Our method demonstrates superior performance on real camera datasets, achieving higher accuracy compared to previous state-of-the-art techniques, and preserving a more robust correlation between the original and translated raw images.*

## To Do

*This is the first release of the code (0.1.0). We are working on the following tasks:*

- [x] Release Rawformer code
- [ ] Upload prepared datasets
- [ ] Upload the pre-trained models
- [ ] Migrate the code to PyTorch Lightning
- [ ] Add ONNX export scripts


## Datasets

### Data Structure
```
- data
  - <dataset_name>
    - trainA
      - img1.jpg, img2.jpg, ...
    - trainB
      - imgA.jpg, imgB.jpg, ...
    - testA
      - img1.jpg, img2.jpg, ...
    - testB
      - imgA.jpg, imgB.jpg, ...
```
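As a sanity check before training, the layout above can be validated with a short script. This is only a sketch: `check_dataset` is a hypothetical helper, and the `testA`/`testB` split names are assumed to mirror the train splits.

```python
from pathlib import Path

# Hypothetical helper: count the files in each split of one unpaired dataset.
# The testA/testB names are assumed to mirror trainA/trainB.
def check_dataset(root, name):
    base = Path(root) / name
    counts = {}
    for split in ("trainA", "trainB", "testA", "testB"):
        split_dir = base / split
        counts[split] = len(list(split_dir.glob("*"))) if split_dir.is_dir() else 0
    return counts

print(check_dataset("data", "<dataset_name>"))
```

An empty or missing split shows up as a zero count, which is an easy way to catch a mislabeled directory before a long training run.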

### Download Datasets

*Under construction*


## Pretrained Models

*Under construction*


## How To Use

### Setup The Environment
```bash
git clone https://github.com/gosha20777/rawformer.git
cd rawformer
conda env create -n rawformer -f environment.yml
conda activate rawformer
python setup.py install
```

### Train Rawformer

#### Pretrain Generator
```bash
cd experiments/<experiment_name>
python pretrain.py --batch-size 8
```

#### Train Rawformer
```bash
cd experiments/<experiment_name>
python train.py
```

#### Test Rawformer
```bash
cd experiments/<experiment_name>
python predict.py <model_path> --split test
```

## Citation
```BibTeX
@article{perevozchikov2024rawformer,
title={Rawformer: Unpaired Raw-to-Raw Translation for Learnable Camera ISPs},
author={Perevozchikov, Georgy and Mehta, Nancy and Afifi, Mahmoud and Timofte, Radu},
journal={arXiv preprint arXiv:2404.10700},
year={2024}
}
```
151 changes: 151 additions & 0 deletions environment.yml
@@ -0,0 +1,151 @@
name: rawformer
channels:
- pytorch
- nvidia
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- _openmp_mutex=5.1=1_gnu
- blas=1.0=mkl
- bottleneck=1.3.7=py312ha883a20_0
- brotli=1.0.9=h5eee18b_8
- brotli-bin=1.0.9=h5eee18b_8
- brotli-python=1.0.9=py312h6a678d5_8
- bzip2=1.0.8=h5eee18b_6
- ca-certificates=2024.3.11=h06a4308_0
- certifi=2024.6.2=py312h06a4308_0
- charset-normalizer=2.0.4=pyhd3eb1b0_0
- contourpy=1.2.0=py312hdb19cb5_0
- cuda-cudart=12.1.105=0
- cuda-cupti=12.1.105=0
- cuda-libraries=12.1.0=0
- cuda-nvrtc=12.1.105=0
- cuda-nvtx=12.1.105=0
- cuda-opencl=12.5.39=0
- cuda-runtime=12.1.0=0
- cuda-version=12.5=3
- cycler=0.11.0=pyhd3eb1b0_0
- cyrus-sasl=2.1.28=h52b45da_1
- dbus=1.13.18=hb2f20db_0
- expat=2.6.2=h6a678d5_0
- ffmpeg=4.3=hf484d3e_0
- filelock=3.13.1=py312h06a4308_0
- fontconfig=2.14.1=h4c34cd2_2
- fonttools=4.51.0=py312h5eee18b_0
- freetype=2.12.1=h4a9f257_0
- glib=2.78.4=h6a678d5_0
- glib-tools=2.78.4=h6a678d5_0
- gmp=6.2.1=h295c915_3
- gnutls=3.6.15=he1e5248_0
- gst-plugins-base=1.14.1=h6a678d5_1
- gstreamer=1.14.1=h5eee18b_1
- icu=73.1=h6a678d5_0
- idna=3.7=py312h06a4308_0
- intel-openmp=2023.1.0=hdb19cb5_46306
- jinja2=3.1.4=py312h06a4308_0
- jpeg=9e=h5eee18b_1
- kiwisolver=1.4.4=py312h6a678d5_0
- krb5=1.20.1=h143b758_1
- lame=3.100=h7b6447c_0
- lcms2=2.12=h3be6417_0
- ld_impl_linux-64=2.38=h1181459_1
- lerc=3.0=h295c915_0
- libbrotlicommon=1.0.9=h5eee18b_8
- libbrotlidec=1.0.9=h5eee18b_8
- libbrotlienc=1.0.9=h5eee18b_8
- libclang=14.0.6=default_hc6dbbc7_1
- libclang13=14.0.6=default_he11475f_1
- libcublas=12.1.0.26=0
- libcufft=11.0.2.4=0
- libcufile=1.10.0.4=0
- libcups=2.4.2=h2d74bed_1
- libcurand=10.3.6.39=0
- libcusolver=11.4.4.55=0
- libcusparse=12.0.2.55=0
- libdeflate=1.17=h5eee18b_1
- libedit=3.1.20230828=h5eee18b_0
- libffi=3.4.4=h6a678d5_1
- libgcc-ng=11.2.0=h1234567_1
- libglib=2.78.4=hdc74915_0
- libgomp=11.2.0=h1234567_1
- libiconv=1.16=h5eee18b_3
- libidn2=2.3.4=h5eee18b_0
- libjpeg-turbo=2.0.0=h9bf148f_0
- libllvm14=14.0.6=hdb19cb5_3
- libnpp=12.0.2.50=0
- libnvjitlink=12.1.105=0
- libnvjpeg=12.1.1.14=0
- libpng=1.6.39=h5eee18b_0
- libpq=12.17=hdbd6064_0
- libstdcxx-ng=11.2.0=h1234567_1
- libtasn1=4.19.0=h5eee18b_0
- libtiff=4.5.1=h6a678d5_0
- libunistring=0.9.10=h27cfd23_0
- libuuid=1.41.5=h5eee18b_0
- libwebp-base=1.3.2=h5eee18b_0
- libxcb=1.15=h7f8727e_0
- libxkbcommon=1.0.1=h5eee18b_1
- libxml2=2.10.4=hfdd30dd_2
- llvm-openmp=14.0.6=h9e868ea_0
- lz4-c=1.9.4=h6a678d5_1
- markupsafe=2.1.3=py312h5eee18b_0
- matplotlib=3.8.4=py312h06a4308_0
- matplotlib-base=3.8.4=py312h526ad5a_0
- mkl=2023.1.0=h213fc3f_46344
- mkl-service=2.4.0=py312h5eee18b_1
- mkl_fft=1.3.8=py312h5eee18b_0
- mkl_random=1.2.4=py312hdb19cb5_0
- mpmath=1.3.0=py312h06a4308_0
- mysql=5.7.24=h721c034_2
- ncurses=6.4=h6a678d5_0
- nettle=3.7.3=hbbd107a_1
- networkx=3.2.1=py312h06a4308_0
- numexpr=2.8.7=py312hf827012_0
- numpy=1.26.4=py312hc5e2394_0
- numpy-base=1.26.4=py312h0da6c21_0
- openh264=2.1.1=h4ff587b_0
- openjpeg=2.4.0=h3ad879b_0
- openssl=3.0.14=h5eee18b_0
- packaging=23.2=py312h06a4308_0
- pandas=2.2.2=py312h526ad5a_0
- pcre2=10.42=hebb0a14_1
- pillow=10.3.0=py312h5eee18b_0
- pip=24.0=py312h06a4308_0
- ply=3.11=py312h06a4308_1
- pyparsing=3.0.9=py312h06a4308_0
- pyqt=5.15.10=py312h6a678d5_0
- pyqt5-sip=12.13.0=py312h5eee18b_0
- pysocks=1.7.1=py312h06a4308_0
- python=3.12.4=h5148396_1
- python-dateutil=2.9.0post0=py312h06a4308_2
- python-tzdata=2023.3=pyhd3eb1b0_0
- pytorch=2.3.1=py3.12_cuda12.1_cudnn8.9.2_0
- pytorch-cuda=12.1=ha16c6d3_5
- pytorch-mutex=1.0=cuda
- pytz=2024.1=py312h06a4308_0
- pyyaml=6.0.1=py312h5eee18b_0
- qt-main=5.15.2=h53bd1ea_10
- readline=8.2=h5eee18b_0
- requests=2.32.2=py312h06a4308_0
- setuptools=69.5.1=py312h06a4308_0
- sip=6.7.12=py312h6a678d5_0
- six=1.16.0=pyhd3eb1b0_1
- sqlite=3.45.3=h5eee18b_0
- sympy=1.12=py312h06a4308_0
- tbb=2021.8.0=hdb19cb5_0
- tk=8.6.14=h39e8969_0
- torchaudio=2.3.1=py312_cu121
- torchvision=0.18.1=py312_cu121
- tornado=6.4.1=py312h5eee18b_0
- typing_extensions=4.11.0=py312h06a4308_0
- tzdata=2024a=h04d1e81_0
- unicodedata2=15.1.0=py312h5eee18b_0
- urllib3=2.2.2=py312h06a4308_0
- wheel=0.43.0=py312h06a4308_0
- xz=5.4.6=h5eee18b_1
- yaml=0.2.5=h7b6447c_0
- zlib=1.2.13=h5eee18b_1
- zstd=1.5.5=hc292b87_2
- pip:
- tqdm==4.66.4
- einops==0.8.0
Binary file added figures/main.png
3 changes: 3 additions & 0 deletions rawformer/__init__.py
@@ -0,0 +1,3 @@
from .consts import CONFIG_NAME, ROOT_DATA, ROOT_OUTDIR
from .utils.funcs import join_dicts
from .train.train import train
59 changes: 59 additions & 0 deletions rawformer/base/LICENSE
@@ -0,0 +1,59 @@
Copyright (c) 2021-2023, The LS4GAN Project Developers
Copyright (c) 2017, Jun-Yan Zhu and Taesung Park
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


--------------------------- LICENSE FOR pix2pix --------------------------------
BSD License

For pix2pix software
Copyright (c) 2016, Phillip Isola and Jun-Yan Zhu
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

----------------------------- LICENSE FOR DCGAN --------------------------------
BSD License

For dcgan.torch software

Copyright (c) 2015, Facebook, Inc. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

Neither the name Facebook nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Empty file added rawformer/base/__init__.py
Empty file.
61 changes: 61 additions & 0 deletions rawformer/base/image_pool.py
@@ -0,0 +1,61 @@
# LICENSE
# This file was extracted from
# https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix
# Please see `rawformer/base/LICENSE` for copyright attribution and LICENSE

# pylint: disable=line-too-long

import random

import torch


class ImagePool:
    """An image buffer that stores previously generated images.

    This buffer lets us update discriminators using a history of generated
    images rather than only the ones produced by the latest generators.
    """

    def __init__(self, pool_size):
        """Initialize the ImagePool class.

        Parameters:
            pool_size (int) -- the size of the image buffer; if pool_size == 0,
                no buffer is created
        """
        self.pool_size = pool_size
        if self.pool_size > 0:  # create an empty pool
            self.num_imgs = 0
            self.images = []

    def query(self, images):
        """Return an image from the pool.

        Parameters:
            images -- the latest generated images from the generator

        With 50% probability the buffer returns the input images; otherwise it
        returns images previously stored in the buffer and inserts the current
        images into it.
        """
        if self.pool_size == 0:  # if the buffer size is 0, do nothing
            return images
        return_images = []
        for image in images:
            image = torch.unsqueeze(image.data, 0)
            if self.num_imgs < self.pool_size:
                # the buffer is not full; keep inserting current images
                self.num_imgs = self.num_imgs + 1
                self.images.append(image)
                return_images.append(image)
            else:
                p = random.uniform(0, 1)
                if p > 0.5:
                    # with 50% chance, return a previously stored image and
                    # insert the current image into the buffer
                    random_id = random.randint(0, self.pool_size - 1)  # randint is inclusive
                    tmp = self.images[random_id].clone()
                    self.images[random_id] = image
                    return_images.append(tmp)
                else:
                    # otherwise, return the current image
                    return_images.append(image)
        return torch.cat(return_images, 0)  # collect all the images and return
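A self-contained usage sketch of the buffer follows; the class is condensed here from the file above so the snippet runs on its own, and the batch shapes are illustrative.

```python
import random

import torch

# Condensed restatement of ImagePool from rawformer/base/image_pool.py,
# included inline so this demo is self-contained.
class ImagePool:
    def __init__(self, pool_size):
        self.pool_size = pool_size
        self.num_imgs = 0
        self.images = []

    def query(self, images):
        if self.pool_size == 0:
            return images
        out = []
        for image in images:
            image = torch.unsqueeze(image.data, 0)
            if self.num_imgs < self.pool_size:
                # buffer not full: store and return the current image
                self.num_imgs += 1
                self.images.append(image)
                out.append(image)
            elif random.uniform(0, 1) > 0.5:
                # return a stored image, replace it with the current one
                idx = random.randint(0, self.pool_size - 1)
                tmp = self.images[idx].clone()
                self.images[idx] = image
                out.append(tmp)
            else:
                out.append(image)
        return torch.cat(out, 0)

pool = ImagePool(pool_size=4)
fakes = torch.randn(2, 1, 8, 8)   # a batch of two fake images from a generator
returned = pool.query(fakes)
print(returned.shape)  # torch.Size([2, 1, 8, 8])
```

The discriminator is then trained on `returned` rather than on `fakes` directly, which smooths training by mixing in older generator outputs.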