An end-to-end Python pipeline for performing sentiment analysis on audio files of call-center conversations.
Directions for installing and running the pipeline.
- pyAudioAnalysis: Refer to this suggested commit for the changes to `audioSegmentation.py` needed so that the library imports properly.
- openSMILE: Follow the directions from the tutorial so that openSMILE can be run from the command line as `SMILExtract` (see the quick check after this list).
- Lasagne: required to run ASAP with an LSTM. Installing Lasagne should also pull in Theano.
- pydub
- hmmlearn
Once the above libraries are installed, just clone this repo to get ASAP.
- NOTE: Unfortunately, pyAudioAnalysis doesn't support Python 3, so the pipeline must be run with Python 2.7.
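
A quick way to confirm the openSMILE setup is to ask `SMILExtract` for its usage text from any directory; if this fails, the command is not on your PATH yet:

```
SMILExtract -h
```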
- To run the whole pipeline end-to-end, run `AudioSentimentPipeline.py`. Running `AudioSentimentPipeline.py -h` will print a description of the input requirements and options.
- The input to ASAP is a CSV where the first column is the name of the audio file and the last column is the label; each row corresponds to a different input file (see the example CSV after this list). Currently the input CSV must be in the same directory as all of the input audio files.
- Each step of the pipeline can also be run separately. Run `process_raw_data.py -h`, `classify.py -h`, and `lstm.py -h` to see how to use each one individually.
- Refer to the openSMILE documentation for how to change the feature extraction. Config files are in the `opensmile_conf` folder. The default extraction uses the IS09 feature set. To switch from full-file features to sliding-window features, edit `FrameModeFunctionals.conf.inc` (a sketch of this change follows the list).
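
For illustration only, an input CSV might look like the sketch below. The file names and labels here are made up; use whatever labels your data actually has, with the audio file name in the first column and the label in the last:

```
call_001.wav,positive
call_002.wav,negative
call_003.wav,neutral
```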
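
As a rough sketch of the sliding-window change, based on openSMILE's standard `FrameModeFunctionals` conventions rather than this repo's exact config, switching from whole-file functionals to fixed-size windows typically means changing `frameMode` from `full` to `fixed` and setting a window size and step (in seconds):

```
// opensmile_conf/FrameModeFunctionals.conf.inc (sketch; the sizes below are illustrative)
frameMode = fixed
frameSize = 3.0
frameStep = 1.0
```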
The example below uses audio files that are not from call centers (so they do not contain two speakers) and fake labels, just to show everything working end to end.
```
python AudioSentimentPipeline.py -i data/input.csv -o outputs/ --hmm
```