- Original version: `pip install tensorflow==1.9 keras==2.1.5`
- Updated 2020.04.20: `pip install tensorflow==1.15.2 keras`
- or `pipenv install` if you have pipenv.
- Download ffmpeg from [ffmpeg](https://ffmpeg.org/); select the **Static** linking option and download the zip file.
- Extract the zip file into the `ffmpeg` folder, so that `ffmpeg/bin/ffmpeg.exe` exists.
- Download sox from [SoX (Sound eXchange)](http://sox.sourceforge.net/); you should get a zip file.
- Extract the zip file into the `sox` folder, so that `sox/sox.exe` exists.
Convert recorded audio files to *.wav files:

```
$ python ./convert_file.py <Data Folder>
```

The `<Data Folder>` should contain subfolders where your audio files reside. Typically, one of your audio files could be `<Data Folder>/group1/a.mp3`.

The results of the conversion are written to `./data/train/`. You should manually move some of them to `./data/test/` to accomplish the training-validation separation. The fraction of files to move is up to you.
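The manual move can be automated. Below is a minimal sketch (not part of the repository) that relocates a random fraction of the converted files from `./data/train/` to `./data/test/`; it assumes the wav files sit directly in the train folder — adjust the listing if `convert_file.py` preserves subfolders.

```python
import os
import random
import shutil

def split_train_test(train_dir="./data/train", test_dir="./data/test",
                     fraction=0.2, seed=42):
    """Move a random fraction of .wav files from train_dir to test_dir."""
    os.makedirs(test_dir, exist_ok=True)
    wavs = [f for f in os.listdir(train_dir) if f.endswith(".wav")]
    random.Random(seed).shuffle(wavs)  # deterministic shuffle for a fixed seed
    n_test = int(len(wavs) * fraction)
    for name in wavs[:n_test]:
        shutil.move(os.path.join(train_dir, name),
                    os.path.join(test_dir, name))
    return n_test

if __name__ == "__main__":
    moved = split_train_test()
    print("moved %d files to ./data/test" % moved)
```

A fraction of 0.1–0.2 is a common starting point for a validation split.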
The data augmentation server is implemented with gRPC:

```
$ pip install grpcio
```

or, for some versions of Python 3:

```
$ pip3 install grpcio
```
Training involves two parts: `train.py` and `augmentation/`.

```
$ python -m augmentation
```

will start an augmentation server that provides training data and test data. `train.py` will connect to the augmentation server and request data. `augmentation/config.py` is used to configure the batch size, thread count, data source, etc.

Before training, there are several things you should do. You have already done them in Data preparation; now check them again:
- Put train data into `data/train/`.
- Put validation data into `data/test/`.
- NOTE: the wav files must be encoded as 16-bit signed integers, mono-channel, at a sampling rate of 16000 Hz.
- The files should already be correct if you obtained them from `convert_file.py`.
- You should have sox in `sox/`; check it again.
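To double-check the encoding requirement before training, a small sketch using Python's standard `wave` module (illustrative, not part of the repository) can verify that a file is 16-bit signed, mono, and sampled at 16000 Hz:

```python
import wave

def check_wav(path):
    """Return True if path is a 16-bit, mono, 16000 Hz wav file."""
    with wave.open(path, "rb") as w:
        return (w.getsampwidth() == 2       # 2 bytes per sample = 16-bit
                and w.getnchannels() == 1   # mono
                and w.getframerate() == 16000)
```

Run it over everything in `data/train/` and `data/test/`; any file that fails can be re-converted with `convert_file.py`.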
Server side: `$ python -m augmentation`

- This will start an augmentation server utilizing `sox`.

Client side: `$ python train.py`

- This will start training with data requested from the augmentation server.
- NOTE: run it from the `audioNet` folder.
**Resume an interrupted training process**

You can resume from a certain checkpoint: modify the last line of `train.py` and set `-1` (negative 1) as your start point.
Modify `webfront.py`, changing `MODEL_ID` to yours.
Open a web browser and enter the URL: http://127.0.0.1:5000/predict.

* It requires [ffmpeg](https://ffmpeg.org/) for audio file format conversion.
**Select Checkpoint for Evaluation**

Modify `webfront.py`, changing `MODEL_ID` to yours, then see *Run `python webfront.py`* above.
- Choose an `ID` of a checkpoint yourself from `models/save_<ID>.h5`.
- Run `$ python ./create_pb.py <ID>`. This will create the file `models/model.pb`.
- Place your `model.pb` file where you want to deploy it. For a typical case, see the Android mobile example: androidAudioRecg.