CyFeng16/MVIMP

English | 简体中文 | Español

Welcome to MVIMP 👋

The name MVIMP (Mixed Video and Image Manipulation Program) was inspired by GIMP (GNU Image Manipulation Program). I hope it can help more people.

Training a well-performing AI model is only one side of the story; making it easy for others to use is the other. This repository was therefore built to offer out-of-the-box AI capabilities for manipulating multimedia. Last but not least, have fun!

| Model | Input | Output | Parallel | Colab Link |
| --- | --- | --- | --- | --- |
| AnimeGAN | Images | Images | True | Open In Colab |
| AnimeGANv2 | Images | Images | True | Open In Colab |
| DAIN | Video | Video | False | Open In Colab |
| DeOldify | Images | Images | True | Open In Colab |
| Photo3D | Images | Videos | True (not recommended) | Open In Colab |
| Waifu2x | Images | Images | True | Open In Colab |

You are welcome to discuss future features in this issue.

AnimeGANv2

Original repository: TachibanaYoshino/AnimeGANv2

An improved version of AnimeGAN that converts landscape photos (video support is still to do) to anime. AnimeGANv2 improves on the original in four main ways:

  1. It solves the problem of high-frequency artifacts in the generated image.
  2. It is easy to train and directly achieves the effects described in the paper.
  3. It further reduces the number of parameters of the generator network (generator size: 8.17 Mb). The lite version has a smaller generator model.
  4. It uses new high-quality style data, taken from BD movies as much as possible.

| Dependency | Version |
| --- | --- |
| TensorFlow | 1.15.2 |
| CUDA Toolkit | 10.0 (tested locally) / 10.1 (colab) |
| Python | 3.6.8 (3.6+) |

Usage:

  1. Colab

    You can open our jupyter notebook through colab link.

  2. Local

    # Step 1: Prepare
    git clone https://github.com/CyFeng16/MVIMP.git
    cd MVIMP
    python3 preparation.py
    # Step 2: Put your photos into ./Data/Input/
    # Step 3: Inference
    python3 inference_animeganv2.py -s {The_Style_You_Choose}
  3. Description of Parameters

    | params | abbr. | Default | Description |
    | --- | --- | --- | --- |
    | --style | -s | Hayao | The anime style you want to get. |

    | Style name | Anime style |
    | --- | --- |
    | Hayao | Miyazaki Hayao |
    | Shinkai | Makoto Shinkai |
    | Paprika | Kon Satoshi |
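
To run the same photos through every style, the commands can be generated from the style table. The following is a minimal sketch: `animeganv2_command` is a hypothetical helper, not part of MVIMP, and it assumes you launch it from the `MVIMP` directory with `python3` on your path.

```python
# Style names copied from the table above, mapped to the artist each one imitates.
STYLES = {
    "Hayao": "Miyazaki Hayao",
    "Shinkai": "Makoto Shinkai",
    "Paprika": "Kon Satoshi",
}


def animeganv2_command(style="Hayao"):
    """Return the argument list for one inference run (hypothetical helper)."""
    if style not in STYLES:
        raise ValueError(f"unknown style {style!r}; expected one of {sorted(STYLES)}")
    return ["python3", "inference_animeganv2.py", "-s", style]


# Print one command per style; nothing is executed here.
for name, artist in STYLES.items():
    print(f"{artist}: {' '.join(animeganv2_command(name))}")
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the photos are in `./Data/Input/`.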

AnimeGAN

Original repository: TachibanaYoshino/AnimeGAN

This is the open-source implementation of the paper "AnimeGAN: a novel lightweight GAN for photo animation", which uses the GAN framework to transform real-world photos into anime images.

| Dependency | Version |
| --- | --- |
| TensorFlow | 1.15.2 |
| CUDA Toolkit | 10.0 (tested locally) / 10.1 (colab) |
| Python | 3.6.8 (3.6+) |

Usage:

  1. Colab

    You can open our jupyter notebook through colab link.

  2. Local

    # Step 1: Prepare
    git clone https://github.com/CyFeng16/MVIMP.git
    cd MVIMP
    python3 preparation.py -f animegan 
    # Step 2: Put your photos into ./Data/Input/
    # Step 3: Inference
    python3 inference_animegan.py
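
Step 2 can also be scripted. The sketch below copies images from another folder into `./Data/Input/`; `stage_images` and the list of accepted suffixes are assumptions made for illustration, since the formats each model accepts are not listed here.

```python
import shutil
from pathlib import Path

# Assumed set of image suffixes; adjust to whatever the model actually accepts.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def stage_images(source_dir, input_dir="Data/Input"):
    """Copy image files from source_dir into input_dir and return the new paths."""
    target = Path(input_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(source_dir).iterdir()):
        # Skip sub-directories and anything that does not look like an image.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            copied.append(Path(shutil.copy2(path, target / path.name)))
    return copied
```

Run it from the `MVIMP` directory, for example `stage_images("/path/to/photos")`, before starting inference.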

DAIN

Original repository: baowenbo/DAIN

The Depth-Aware video frame INterpolation (DAIN) model explicitly detects occlusion by exploring the depth cue. The authors develop a depth-aware flow projection layer to synthesize intermediate flows that preferentially sample closer objects over farther ones.

This method achieves SOTA performance on the Middlebury dataset. Videos are provided here.

The current version of DAIN (in this repo) can smoothly run 1080p video frame interpolation even on a GTX-1080 GPU, as long as you turn -hr on (see Description of Parameters below).

| Dependency | Version |
| --- | --- |
| PyTorch | 1.0.0 |
| CUDA Toolkit | 9.0 (colab tested) |
| Python | 3.6.8 (3.6+) |
| GCC | 4.9 (for compiling PyTorch 1.0.0 extension files (.c/.cu)) |

P.S. Make sure your virtual environment has torch-1.0.0 and torchvision-0.2.1 with CUDA-9.0. Dependency issues are discussed in #5 and #16.

Usage:

  1. Colab

    You can open our jupyter notebook through colab link.

  2. Local

    # Step 1: Prepare
    git clone https://github.com/CyFeng16/MVIMP.git
    cd MVIMP
    python3 preparation.py -f dain
    # Step 2: Put a single video file into ./Data/Input/
    # Step 3: Inference
    python3 inference_dain.py -input your_input.mp4 -ts 0.5 -hr
  3. Description of Parameters

    | params | abbr. | Default | Description |
    | --- | --- | --- | --- |
    | --input_video | -input | / | The input video name. |
    | --time_step | -ts | 0.5 | Sets the frame multiplier: 0.5 corresponds to 2X, 0.25 to 4X, 0.125 to 8X. |
    | --high_resolution | -hr | store_true | Off by default (action: store_true). Turn it on when handling FHD videos; a frame-splitting process reduces GPU memory usage. |
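
The three listed values follow the rule multiplier = 1 / time_step. A small sketch of that arithmetic is shown here; both function names are hypothetical, and `output_fps` assumes the clip keeps its original duration, which is not stated above.

```python
def frame_multiplier(time_step):
    """Frame multiplier implied by a time step, e.g. 0.5 -> 2.0."""
    # Assumption: only values strictly between 0 and 1 are meaningful.
    if not 0 < time_step < 1:
        raise ValueError("time_step must lie strictly between 0 and 1")
    return 1 / time_step


def output_fps(input_fps, time_step):
    """Frame rate after interpolation, assuming the duration is unchanged."""
    return input_fps * frame_multiplier(time_step)


for ts in (0.5, 0.25, 0.125):
    print(f"-ts {ts}: {frame_multiplier(ts):g}X, 30 FPS in -> {output_fps(30, ts):g} FPS out")
```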

DeOldify

Original repository: jantic/DeOldify

DeOldify is a Deep Learning based project for colorizing and restoring old images and video!

We are now integrating the inference capabilities of the DeOldify model (both Artistic and Stable, no Video) with our MVIMP repository, and keeping the input and output interfaces consistent.

| Dependency | Version |
| --- | --- |
| PyTorch | 1.5.0 |
| CUDA Toolkit | 10.1 (tested locally/colab) |
| Python | 3.6.8 (3.6+) |

Other Python dependencies are listed in colab_requirements.txt and are installed automatically when you run preparation.py.

Usage:

  1. Colab

    You can open our jupyter notebook through colab link.

  2. Local

    # Step 1: Prepare
    git clone https://github.com/CyFeng16/MVIMP.git
    cd MVIMP
    python3 preparation.py -f deoldify
    # Step 2: Inference
    python3 -W ignore inference_deoldify.py -art
  3. Description of Parameters

    | params | abbr. | Default | Description |
    | --- | --- | --- | --- |
    | --artistic | -art | store_true | The artistic model achieves the highest-quality results in image colorization, in terms of interesting details and vibrance. |
    | --stable | -st | store_true | The stable model achieves the best results with landscapes and portraits. |
    | --render_factor | -factor | 35 | Between 7 and 40; try several values to find the best result. |
    | --watermarked | -mark | store_true | I respect the original author's intent in adding a watermark to distinguish AI works, but setting it to False may be more convenient in a production environment. |
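
Because the best render factor varies from image to image, it can help to generate one command per candidate value. This is a sketch only: `deoldify_command` is a hypothetical helper, and it assumes exactly one model flag is passed per run. Whether later runs overwrite earlier results is not stated above, so move the output between runs if needed.

```python
# Flags and range copied from the parameter table; the helper itself is hypothetical.
MODEL_FLAGS = {"artistic": "-art", "stable": "-st"}
MIN_FACTOR, MAX_FACTOR = 7, 40


def deoldify_command(model="artistic", render_factor=35):
    """Build the argument list for one DeOldify inference run."""
    if model not in MODEL_FLAGS:
        raise ValueError(f"model must be one of {sorted(MODEL_FLAGS)}")
    if not MIN_FACTOR <= render_factor <= MAX_FACTOR:
        raise ValueError(f"render_factor must be between {MIN_FACTOR} and {MAX_FACTOR}")
    return [
        "python3", "-W", "ignore", "inference_deoldify.py",
        MODEL_FLAGS[model], "-factor", str(render_factor),
    ]


# Print commands for a few render factors; nothing is executed here.
for factor in (20, 30, 40):
    print(" ".join(deoldify_command("stable", factor)))
```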

Photo3D

Original repository: vt-vl-lab/3d-photo-inpainting

A method for converting a single RGB-D input image into a 3D photo, i.e., a multi-layer representation for novel view synthesis that contains hallucinated color and depth structures in regions occluded in the original view.

| Dependency | Version |
| --- | --- |
| PyTorch | 1.5.0 |
| CUDA Toolkit | 10.1 (tested locally/colab) |
| Python | 3.6.8 (3.6+) |

Other Python dependencies are listed in requirements.txt and are installed automatically when you run preparation.py.

Usage:

  1. Colab

    You can open our jupyter notebook through colab link.

    P.S. This model uses a large amount of memory during operation (usage grows with -l).

    A high-memory runtime helps if you are a Colab Pro user.

  2. Local

    # Step 1: Prepare
    git clone https://github.com/CyFeng16/MVIMP.git
    cd MVIMP
    python3 preparation.py -f photo3d
    # Step 2: Put your photos into ./Data/Input/
    # Step 3: Inference
    python3 inference_photo3d.py -f 40 -n 240 -l 960
  3. Description of Parameters

    | params | abbr. | Default | Description |
    | --- | --- | --- | --- |
    | --fps | -f | 40 | The FPS of the output video. |
    | --frames | -n | 240 | The number of frames in the output video. |
    | --longer_side_len | -l | 960 | The longer side of the output video (either height or width). |
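
The clip length follows from the first two parameters: duration = frames / fps, so the defaults give 240 / 40 = 6 seconds. A short sketch of that arithmetic, using hypothetical helper names:

```python
def clip_duration(frames=240, fps=40):
    """Length of the output video in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames / fps


def frames_for(seconds, fps=40):
    """Number of frames to request with -n for a clip of the given length."""
    return round(seconds * fps)


print(clip_duration())   # duration with the default -n 240 and -f 40
print(frames_for(10))    # value for -n to get a 10-second clip at 40 FPS
```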

Waifu2x

Original repository: nihui/waifu2x-ncnn-vulkan

waifu2x-ncnn-vulkan is an ncnn implementation of waifu2x that runs fast on Intel/AMD/Nvidia GPUs through the Vulkan API.

We are now integrating the inference capabilities of the waifu2x model ("cunet", "photo" and "animeart") with our MVIMP repository, and keeping the input and output interfaces consistent.

| Dependency | Version |
| --- | --- |
| CUDA Toolkit | 10.1 (tested locally/colab) |
| Python | 3.6.8 (3.6+) |

Usage:

  1. Colab

    You can open our jupyter notebook through colab link.

  2. Local

    # Step 1: Prepare
    git clone https://github.com/CyFeng16/MVIMP.git
    cd MVIMP
    python3 preparation.py -f waifu2x-vulkan
    # Step 2: Inference
    python3 inference_waifu2x-vulkan.py -s 2 -n 0
  3. Description of Parameters

    | params | abbr. | Default | Description |
    | --- | --- | --- | --- |
    | --scale | -s | 2 | Upscale ratio (1/2, default=2). |
    | --noise | -n | 0 | Denoise level (-1/0/1/2/3, default=0). |
    | --tilesize | -t | 400 | Tile size. Between 32 and 19327352831; it has no appreciable effect. |
    | --model | -m | cunet | Model to use. Choose from "cunet", "photo" and "animeart". |
    | --tta | -x | store_true (True if set) | TTA mode can reduce several types of artifacts, but it is 8x slower than non-TTA mode. See for details. |
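
For reference, the options in this table can be expressed as an `argparse` parser. This is an illustrative sketch under the assumption that the script uses `argparse`; the actual parser in `inference_waifu2x-vulkan.py` may be declared differently.

```python
import argparse


def build_parser():
    """Mirror the parameter table above as an argparse parser (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sketch of the waifu2x options")
    parser.add_argument("--scale", "-s", type=int, choices=[1, 2], default=2,
                        help="upscale ratio")
    parser.add_argument("--noise", "-n", type=int, choices=[-1, 0, 1, 2, 3], default=0,
                        help="denoise level")
    parser.add_argument("--tilesize", "-t", type=int, default=400,
                        help="tile size")
    parser.add_argument("--model", "-m", choices=["cunet", "photo", "animeart"],
                        default="cunet", help="model to use")
    parser.add_argument("--tta", "-x", action="store_true",
                        help="enable TTA mode (slower, fewer artifacts)")
    return parser


# Parse the flags from the local usage example above.
args = build_parser().parse_args(["-s", "2", "-n", "0"])
print(args)
```

Declaring `choices` makes an invalid value such as `-s 3` fail immediately with a usage message instead of reaching the model.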

TODO

Acknowledgment

This code is based on TachibanaYoshino/AnimeGAN, TachibanaYoshino/AnimeGANv2, vt-vl-lab/3d-photo-inpainting, baowenbo/DAIN, jantic/DeOldify and nihui/waifu2x-ncnn-vulkan. Thanks to the contributors of those projects.

@EtianAM provides our Spanish guide. @BrokenSilence improves DAIN's performance.

Stargazers over time