GSoC 2021 project with Red Hen Lab. The goal is to improve the clustering algorithm in the in-production code and build on the previous work. The core problem is finding the correct anchor. Currently, the most time-consuming step in the program is going frame by frame and extracting faces: the face-recognition method used in the production code processes each frame individually. I want to upgrade it to a parallelized algorithm that processes multiple frames in a batch, which should speed up processing substantially and also allow faster testing of hyperparameters. Since batch processing can be much quicker than handling single images one at a time, the plan is to combine batch processing with multi-threading. More information about the project statement can be found here.
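As an illustration of the batching idea, here is a minimal sketch using face_recognition.batch_face_locations, which runs the library's CNN detector over a whole batch of frames at once (and benefits from a GPU). The function name and batch size below are illustrative, not the production code:

```python
import cv2
import face_recognition

def extract_face_locations_batched(video_path, batch_size=32):
    """Locate faces in batches of frames instead of one frame at a time."""
    capture = cv2.VideoCapture(video_path)
    frames, locations = [], []
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        # face_recognition expects RGB; OpenCV decodes frames as BGR.
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if len(frames) == batch_size:
            # The CNN detector processes the whole batch in one forward pass.
            locations.extend(face_recognition.batch_face_locations(frames, batch_size=batch_size))
            frames = []
    if frames:  # leftover frames smaller than one full batch
        locations.extend(face_recognition.batch_face_locations(frames, batch_size=len(frames)))
    capture.release()
    return locations
```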
Mentors: Anna Wilson, Francis Steen, Frankie Robertson
A blog detailing the research and progress can be found at edoates84.github.io
- Clone the repo to your machine
git clone https://github.com/EdOates84/Show-Segmentation-2021.git
- Install the required Python packages using the following command
pip install numpy pandas matplotlib opencv-python scikit-learn face_recognition wikipedia ffmpeg traceback2
- Download the anchors-encodings pickle and place it in this location.
/Show-Segmentation-2021/final_celeb_detection/final_pickles/anchors-with-TV-encodings.pickle
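Once downloaded, the pickle can be loaded as below. This is a minimal sketch; the exact structure of the stored encodings is an assumption, so inspect the object after loading:

```python
import pickle

# Load the precomputed anchor face encodings (structure assumed; inspect once loaded).
with open("final_celeb_detection/final_pickles/anchors-with-TV-encodings.pickle", "rb") as f:
    anchor_encodings = pickle.load(f)
```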
- Download the IMDb datasets and place them in these locations.
Show-Segmentation-2021/IMDB_Datasets/name.basics.tsv
Show-Segmentation-2021/IMDB_Datasets/title.basics.tsv
Show-Segmentation-2021/IMDB_Datasets/title.principals.tsv
Show-Segmentation-2021/IMDB_Datasets/name.akas.tsv
Show-Segmentation-2021/IMDB_Datasets/name.crew.tsv
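The IMDb dumps are tab-separated files that use \N as the missing-value marker. A quick sanity check that a file loads (pandas is already in the requirements):

```python
import pandas as pd

# IMDb TSVs use \N for missing values; low_memory=False avoids mixed-dtype warnings.
names = pd.read_csv("IMDB_Datasets/name.basics.tsv", sep="\t", na_values="\\N", low_memory=False)
print(names.columns.tolist())
```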
- Navigate to Show-Segmentation-2021/final_usable_code/
cd Show-Segmentation-2021/final_usable_code
- segment_tv.py takes three inputs: the path to the input video, the path to the output location, and an optional --verbose flag.
python3 segment_tv.py path/to/input/video.mp4 path/to/store/output --verbose
- Make sure that the input video's name follows Red Hen Lab's TV dataset format. Here's an example:
1980-06-03_0000_US_00020088_V0_U2_M9_EG1_DB.mp4
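A quick way to sanity-check a filename against this pattern is the regex below. It is hypothetical and only covers the YYYY-MM-DD_HHMM_CC prefix, not the full Red Hen naming scheme:

```python
import re

# Matches the date/time/country prefix (YYYY-MM-DD_HHMM_CC_...) of Red Hen style names.
REDHEN_NAME = re.compile(r"^\d{4}-\d{2}-\d{2}_\d{4}_[A-Z]{2}_.+\.mp4$")
print(bool(REDHEN_NAME.match("1980-06-03_0000_US_00020088_V0_U2_M9_EG1_DB.mp4")))  # True
```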
**This is for those using the singularity image (segmentation_production.simg) on the CWRU HPC Cluster.**
- Connect to the CWRU VPN.
- Log in to the cluster using your CWRU ID and your credentials. Example:
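For example, assuming the login host is rider.case.edu (an assumption; check the CWRU HPC docs) and abc123 is your CWRU ID:
ssh abc123@rider.case.edu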
- Navigate to the project's location on the cluster.
cd /mnt/rds/redhen/gallina/Singularity/Show-Segmentation-2021/final_usable_code
- Request a computing node using
srun --mem=16gb --pty /bin/bash
- Load singularity 2.5.1 into your environment using
module load singularity
- I have made segment_tv.py for testing and final_script.py for production. After setup, read the Testing section or the Production section according to your requirement.
- segment_tv.py is made to work on a single video file. It takes 3 inputs (in this order):
path/to/input/video.mp4
path/to/output/directory (where the output will be stored)
--verbose (an optional flag which makes the program print progress statements like 'done extracting faces', 'done clustering faces', etc.)
- The main command is of the form
singularity exec -B /mnt../show-segementation-2021_latest.sif python3 segment_tv.py {INPUT_VIDEO_PATH} {OUTPUT_PATH} {--verbose}
- The Tv dataset is present at
/mnt/rds/redhen/gallina/tv
We can take some video file from this directory as our input.
- An example command for the file
1998-01/1998-01-01/1998-01-01_0000_US_00019495_V3_VHS50_MB20_H17_WR.mp4
is
singularity exec -B /mnt../show-segementation-2021_latest.sif python3 segment_tv.py /mnt/rds/redhen/gallina/tv/1998/1998-01/1998-01-01/1998-01-01_0000_US_00019495_V3_VHS50_MB20_H17_WR.mp4 /mnt/path/to/output/directory --verbose
- final_script.py is made to work recursively on all the video files present in
/mnt/rds/redhen/gallina/tv/
and store the outputs in
/mnt/rds/redhen/gallina/Singularity/Show-Segmentation-2021/TvSplit_2
- The --verbose flag mentioned earlier is set to False by default for production.
- Run the script using
python3 final_script.py
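In outline, the production run can be pictured like the sketch below. This is a simplified, assumed reconstruction (the real final_script.py may batch, skip, or log differently); it just walks the TV tree and calls the segmenter on every .mp4:

```python
import os
import subprocess

# Roots taken from the paths mentioned above.
TV_ROOT = "/mnt/rds/redhen/gallina/tv"
OUT_ROOT = "/mnt/rds/redhen/gallina/Singularity/Show-Segmentation-2021/TvSplit_2"

# Walk the dataset recursively and segment every video found.
for dirpath, _, filenames in os.walk(TV_ROOT):
    for name in sorted(filenames):
        if name.endswith(".mp4"):
            subprocess.run(["python3", "segment_tv.py",
                            os.path.join(dirpath, name), OUT_ROOT])
```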
Please raise an issue if you run into any errors.
- If possible, replace the current celeb detection method with Azure's Computer Vision service (a hedged sketch follows this list).
- Currently, the most time-consuming process in the program is going frame by frame and extracting faces. This can be sped up using batch processing, multi-threading, or other means (see the batching sketch at the top of this README).
- Explore the dataset that contains news shows with anchors and include it in the pipeline.
- Shows with only one or a few episodes can simply be dropped, improving precision at the expense of recall.
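As a starting point for the Azure idea above, here is a minimal, hypothetical sketch using the azure-cognitiveservices-vision-computervision SDK. The endpoint, key, and image URL are placeholders, and celebrity recognition may require Azure access approval, so verify the response shape against Azure's current docs:

```python
# Hypothetical sketch: celebrity detection with Azure Computer Vision.
# AZURE_ENDPOINT and AZURE_KEY are placeholders for your own resource.
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import Details
from msrest.authentication import CognitiveServicesCredentials

client = ComputerVisionClient("AZURE_ENDPOINT", CognitiveServicesCredentials("AZURE_KEY"))

def detect_celebs(image_url):
    # Ask the analyze endpoint for the 'celebrities' domain detail.
    analysis = client.analyze_image(image_url, details=[Details.celebrities])
    celebs = []
    for category in analysis.categories or []:
        if category.detail and category.detail.celebrities:
            celebs.extend((c.name, c.confidence) for c in category.detail.celebrities)
    return celebs
```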