First you need to get set up!
REQUIREMENTS:
- Windows PC (this guide is written for Windows users; unfortunately I don't use Linux or macOS, so I can't help you there)
- CUDA-capable GPU (which should be most modern NVIDIA GPUs)
https://developer.nvidia.com/cuda-toolkit
https://docs.anaconda.com/free/miniconda/index.html
After installation, verify that conda is available:
Open up Command Prompt and verify the install by typing in where conda; it should show the path to the conda executable.
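If you'd rather check from Python (purely optional; `find_exe` is an illustrative helper, not part of the repo), the standard library's shutil.which does the same lookup as `where`:

```python
import shutil

def find_exe(name):
    """Return the full path to an executable on PATH, or None if it's absent.

    This mirrors what `where conda` does in Command Prompt.
    """
    return shutil.which(name)

if __name__ == "__main__":
    path = find_exe("conda")
    if path:
        print("conda found at:", path)
    else:
        print("conda not found - check your Miniconda install and PATH")
```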
https://git-scm.com/download/win
You'll need Git in order to clone the repo!
git clone https://github.com/blewClue215/RVM_ON_SEGMENTS.git
Move into the repo root folder
cd RVM_ON_SEGMENTS
4.1 Open Command Prompt
- Make sure (base) is not active; if it is, then run:
conda deactivate
conda create --name rvm python=3.8
conda activate rvm
4.2 Install Pytorch
- nvcc --version
- PyTorch needs to be installed according to the CUDA version reported by the above command
- CUDA 12.1:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
- CUDA 11.8:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
This installs PyTorch, the core dependency that this whole project requires!
Subject to change based on: https://pytorch.org/get-started/locally/
4.3 Once this is done, make sure you're still in the project root folder (/RVM_ON_SEGMENTS):
pip install -r requirements_inference.txt
https://phoenixnap.com/kb/ffmpeg-windows
We need FFmpeg to split the video into segments and to recombine them afterwards.
Before you start inferencing you need to split up long-form videos into segments!
- This makes the overall inference process more reliable, since one small decoding problem won't block the entire run
- It's less of a memory hog; trying to infer on a 15-minute 8K video can eat up a lot of memory or even run out entirely!
- Plus, you can stop the process at any time and restart inference on the remaining segments, because the inferenceCustom.py script is designed for it.
The shorter the segment, the more matting pops you'll see when the video is recombined, but shorter segments also drastically reduce memory usage and improve the reliability of the whole inference process (I like to go with 15 seconds here).
There should be a folder named the same as the video but with a "_segments" suffix, containing the segmented mp4 files plus a file_list.txt.
Inferencing is the act of "inferring" data from the input using the model that has been trained; in this case we want the segments to be used to infer alpha mattes from the model (rvm_mobilenetv3.pth).
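For reference, file_list.txt is just ffmpeg's concat-demuxer format: one `file '...'` line per segment. A quick Python sketch of generating it yourself (the helper name and the sorting choice are mine, not the repo's; the batch script does the same thing with a one-line for loop):

```python
from pathlib import Path

def write_file_list(segments_dir):
    """Write an ffmpeg concat-demuxer file list for every .mp4 in the folder.

    Produces lines like:  file 'output_000.mp4'
    Sorted so the segments are concatenated back in order.
    """
    folder = Path(segments_dir)
    mp4s = sorted(folder.glob("*.mp4"))
    lines = [f"file '{p.name}'" for p in mp4s]
    out = folder / "file_list.txt"
    out.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out
```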
That's handled by inferenceCustom.py!
All you need to do is to:
I added this because sometimes the inference can run for up to 40 hours on 40 minutes of footage, so I prefer to let it shut itself down after it's done to save power.
Ensure the entries are all True (which means all the segments were processed successfully)!
Note: At any point you can stop or restart the inference process!
Restarting is simply dragging-and-dropping the segment folder onto the script again, at which point inference will restart for any file marked "False" in this .json. Before you restart, though, you might want to check why a segment failed by playing it from the "segments" folder.
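The exact name and shape of that status file aren't shown here, but assuming it's a JSON object mapping each segment filename to a boolean (as the True/False wording suggests), here's a sketch for listing what still needs re-running (`failed_segments` is a hypothetical helper, not part of the repo):

```python
import json
from pathlib import Path
from typing import List

def failed_segments(status_json_path: str) -> List[str]:
    """Return segment filenames whose status is False (i.e. need re-running).

    Assumes the status file is a JSON object like:
        {"Output_0000.mp4": true, "Output_0001.mp4": false, ...}
    """
    status = json.loads(Path(status_json_path).read_text(encoding="utf-8"))
    return [name for name, ok in status.items() if not ok]
```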
Now that inference is done and you have matted video segments, you need to combine them together!
This will produce a "COMPOSITE_SEGMENTS_COMBINED.mp4", but it will not have audio.
This will produce the matted video with the audio from the original video!
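Under the hood, this kind of audio mux is a standard ffmpeg stream-copy. A sketch of the equivalent command (the repo's batch file may use different flags; `build_mux_command` is an illustrative helper, not part of the project):

```python
from typing import List

def build_mux_command(combined_video: str, original_video: str, output: str) -> List[str]:
    """Build an ffmpeg command that takes the video stream from the matted
    combine and the audio stream from the original, without re-encoding."""
    return [
        "ffmpeg",
        "-i", combined_video,   # input 0: matted video (no audio)
        "-i", original_video,   # input 1: original video (has audio)
        "-map", "0:v:0",        # video stream from input 0
        "-map", "1:a:0",        # audio stream from input 1
        "-c", "copy",           # stream copy: fast, no quality loss
        "-shortest",            # stop at the shorter input
        output,
    ]
```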
So by the end of it your folder should look something like this:
Now your matted video is ready to enjoy!
Frozen frames can happen for any number of reasons!
So the steps to troubleshoot are:
- Go to the “segments_matted” folder and find the video segment that had frozen frames, e.g. Output_0001.mp4
- Go to the “segments” folder and find Output_0001.mp4, play it in your media player, and check for skips/weird pixel artifacts/a purple screen
If it does, then that source segment failed to encode properly when segmenting:
- Resegment the video
- Copy the resegmented Output_0001.mp4 and pop it into a new folder, maybe name it “FIX_ME”
- Run inference on “FIX_ME” folder
- Once done, copy “FIX_ME/COMPOSITE/Output_0001.mp4” back to the “segments_matted” folder
If not, try the same steps as above but without resegmenting the video.
If the above does not work, the worst-case scenario is to segment and re-encode to H.265:
- Open up "1. DRAG AND DROP VIDEO TO SEGMENT HERE.bat"
- Copy and replace everything in that file with this:
@echo off
if "%~1" == "" (
echo Drag and drop a video file onto this batch file to split it into segments.
pause
exit /b
)
set /p seg_time="Time in seconds per segment: "
set input_file=%~1
set output_folder=%~dpn1_segments
mkdir "%output_folder%"
ffmpeg -i "%input_file%" -c:v libx265 -crf 18 -preset medium -c:a aac -b:a 128k -f segment -segment_time "%seg_time%" -reset_timestamps 1 "%output_folder%\output_%%03d.mp4"
cd "%output_folder%"
(for %%i in (*.mp4) do @echo file '%%i') > file_list.txt
echo Video has been split into segments and re-encoded to H.265 with minimal loss.
pause
- Save
- Drag and drop your video file onto it.
WARNING: THIS WILL TAKE A LOT LONGER TO SEGMENT THE VIDEO BUT SHOULD MAKE IT WORK!