iamhectorotero / LipReading Public

Video captioning using only video frames or audio.

Files (29 commits)

.gitignore
3d_convolution.py
README.md
audio_file_nn.py
audio_plot_nn.py
extract_audio_files.py
flattened_video_rnn.py
generate_mouth_coordinates.py
mouth_nn.py
plot_bounding_box.py
process_images.py
shape_predictor_68_face_landmarks.dat
video_autoencoder_nn.py

LipReading

A neural network project to caption videos from the OuluVS database.

File Description

3d_convolution.py: Predict video class using 3D Convolution directly on the videos.
flattened_video_rnn.py: Predict video class by using a sequence of flattened images.
audio_file_nn.py: Predict video class using the audio files.
audio_plot_nn.py: Predict video class using the plot of the audio files.
extract_audio_files.py: Take a list of videos and extract mono audio (WAV) files.
plot_bounding_box.py: Read the bounding box values from the CSV and plot the boxes with Matplotlib.
generate_mouth_coordinates.py: Generate the CSV file with the mouth bounding box coordinates.
process_images.py: Helper methods to reduce the resolution or change other properties of images.
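
As an illustration of the audio extraction step described for extract_audio_files.py, the following is a minimal sketch that builds and runs an ffmpeg command per video. The function names and the output naming scheme are assumptions made for this example and are not taken from the script itself. The flags are standard ffmpeg options: `-vn` ignores the video stream and `-ac 1` downmixes the audio to one channel.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, output_dir):
    """Return the ffmpeg argument list that extracts mono WAV audio."""
    video = Path(video_path)
    # Hypothetical naming scheme: same stem as the video, .wav extension.
    wav_path = Path(output_dir) / (video.stem + ".wav")
    return [
        "ffmpeg",
        "-y",        # overwrite the output file if it already exists
        "-i", str(video),
        "-vn",       # ignore the video stream
        "-ac", "1",  # downmix to one (mono) audio channel
        str(wav_path),
    ]


def extract_audio(video_paths, output_dir):
    """Run ffmpeg once per video; requires ffmpeg on the PATH."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for video_path in video_paths:
        subprocess.run(build_ffmpeg_command(video_path, output_dir), check=True)
```

Keeping the command construction separate from the `subprocess.run` call makes it possible to inspect or log the command without ffmpeg being installed.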

Requirements to generate box_coordinates.csv

dlib
ffmpeg
NumPy
Matplotlib
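
To make the mouth bounding box idea concrete, here is a hedged sketch of how a box could be derived from 68 facial landmarks and written to a CSV file. It is not the code in generate_mouth_coordinates.py: the function names, the `margin` parameter, and the CSV column layout are assumptions for illustration. It relies only on the fact that, in the 68-point annotation scheme used by shape_predictor_68_face_landmarks.dat, the mouth is covered by points 48 through 67 (0-indexed).

```python
import csv

# Mouth landmarks in the 68-point annotation scheme (0-indexed).
MOUTH_POINTS = range(48, 68)


def mouth_bounding_box(landmarks, margin=0):
    """Return (left, top, right, bottom) enclosing the mouth landmarks.

    `landmarks` is a sequence of 68 (x, y) pairs, for example collected from
    a dlib shape as [(shape.part(i).x, shape.part(i).y) for i in range(68)].
    """
    if len(landmarks) != 68:
        raise ValueError("expected 68 landmarks")
    xs = [landmarks[i][0] for i in MOUTH_POINTS]
    ys = [landmarks[i][1] for i in MOUTH_POINTS]
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + margin, max(ys) + margin)


def write_boxes(rows, csv_path):
    """Write (frame, left, top, right, bottom) rows; this layout is assumed."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["frame", "left", "top", "right", "bottom"])
        writer.writerows(rows)
```

A small `margin` can help keep the lips inside the crop when the mouth opens wider than in the frame where the landmarks were detected.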
