Action Recognition in Video

This repo will serve as a playground where I investigate different approaches to solving the problem of action recognition in video.

I will mainly use the UCF-101 dataset.

Setup

$ cd data/              
$ bash download_ucf101.sh     # Downloads the UCF-101 dataset (~7.2 GB)
$ unrar x UCF101.rar          # Unrars dataset
$ unzip ucfTrainTestlist.zip  # Unzip train / test split
$ python3 extract_frames.py   # Extracts frames from the video (~26.2 GB, go grab a coffee for this)

ConvLSTM

The only approach investigated so far. Enables action recognition in video by a bi-directional LSTM operating on frame embeddings extracted by a pre-trained ResNet-152 (ImageNet).

The model is composed of:

A convolutional feature extractor (ResNet-152) which provides a latent representation of video frames
A bi-directional LSTM classifier which based on the latent representation of the video predicts the activity depicted

I have made a trained model available here.

Train

$ python3 train.py  --dataset_path data/UCF-101-frames/ \
                    --split_path data/ucfTrainTestlist \
                    --num_epochs 200 \
                    --sequence_length 40 \
                    --img_dim 112 \
                    --latent_dim 512

Test on Video

$ python3 test_on_video.py  --video_path data/UCF-101/SoccerPenalty/v_SoccerPenalty_g01_c01.avi \
                            --checkpoint_model model_checkpoints/ConvLSTM_150.pth

Results

The model reaches a classification accuracy of 91.27% accuracy on a randomly sampled test set, composed of 20% of the total amount of video sequences from UCF-101. Will re-train this model on the offical train / test splits and post results as soon as I have time.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Action Recognition in Video

Setup

ConvLSTM

Train

Test on Video

Results

Files

README.md

Latest commit

History

README.md

File metadata and controls

Action Recognition in Video

Setup

ConvLSTM

Train

Test on Video

Results