Before training or prediction, the data must be preprocessed into feature vectors.
For the AVSpeech dataset:

- Put mp4 files in `AVSPEECH_DIR`.
- Run (GPU only):

  ```
  $ docker-compose run preprocess ./run.sh
  ```
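Before launching the container, it can help to sanity-check that the input directory actually contains mp4 files. A minimal sketch (the path below is hypothetical; use whatever you configured as `AVSPEECH_DIR`):

```python
from pathlib import Path

def list_mp4s(directory):
    """Return the mp4 files found directly under `directory`, sorted by name."""
    return sorted(Path(directory).glob("*.mp4"))

# Example: point this at your AVSPEECH_DIR before running the preprocess container.
files = list_mp4s("data/avspeech")  # hypothetical path
print(f"found {len(files)} mp4 file(s)")
```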
For the AudioSet dataset:

- Put mp4 files in `AUDIOSET_DIR`.
- Run:

  ```
  $ docker-compose run preprocess python3 convert_audioset.py
  ```
To preprocess movies for prediction:

- Set `mode = Mode.predict` in `preprocess/src/env.py`.
- Put mp4 files in `MOVIE_DIR`.
- Run (GPU only):

  ```
  $ docker-compose run preprocess ./run.sh
  ```
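The mode switch above might look like the following. Only the `mode = Mode.predict` line is given by the source; the `Mode` enum and its other members are assumptions for illustration, and the real `preprocess/src/env.py` may differ:

```python
from enum import Enum, auto

class Mode(Enum):
    # Member names other than `predict` are assumed for this sketch.
    train = auto()
    predict = auto()

# Set to Mode.predict before preprocessing the mp4 files in MOVIE_DIR;
# leave it on the training setting when preparing dataset features.
mode = Mode.predict
```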