lance-ying/NHNN

NHNN Implementation

This is the PyTorch implementation for the paper "Accounting for Variations in Speech Emotion Recognition with NonParametric Hierarchical Neural Network".

The included training files are for IEMOCAP; the PRIORI datasets are not public. The metadata is stored in data.csv, which includes the audio segment id, subject id, gender label, and emotion label (valence rating) for each segment. The eGeMAPS features for the audio segments have been extracted and stored in features.csv.
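As a rough illustration of how the two CSV files relate, the sketch below joins metadata rows to feature rows by segment id. The column names (`segment_id`, `subject_id`, `gender`, `valence`, `f0`, `f1`), the integer valence encoding, and the inline sample data are all assumptions made for the example; the actual headers and values in data.csv and features.csv may differ.

```python
import csv
import io

# Hypothetical stand-ins for data.csv and features.csv.
# The real column names and value encodings are assumptions here.
METADATA_CSV = """segment_id,subject_id,gender,valence
seg_001,subj_01,F,2
seg_002,subj_02,M,0
"""
FEATURES_CSV = """segment_id,f0,f1
seg_001,0.12,0.50
seg_002,0.33,0.10
"""


def load_rows(handle, key="segment_id"):
    """Read a CSV file object into a dict keyed by the segment id column."""
    return {row[key]: row for row in csv.DictReader(handle)}


def join_metadata_and_features(meta_handle, feat_handle, key="segment_id"):
    """Pair each metadata row with its feature vector, skipping unmatched ids."""
    meta = load_rows(meta_handle, key)
    feats = load_rows(feat_handle, key)
    joined = []
    for seg_id, m in meta.items():
        if seg_id not in feats:
            continue
        # Every column except the id is treated as a numeric feature.
        vector = [float(v) for k, v in feats[seg_id].items() if k != key]
        joined.append({
            "segment_id": seg_id,
            "subject_id": m["subject_id"],
            "gender": m["gender"],
            "valence": int(m["valence"]),
            "features": vector,
        })
    return joined


samples = join_metadata_and_features(
    io.StringIO(METADATA_CSV), io.StringIO(FEATURES_CSV)
)
```

With real files, the `io.StringIO` objects would be replaced by `open("data.csv", newline="")` and `open("features.csv", newline="")`.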

The Log-MFB features can be downloaded here (around 7GB). The features have been extracted and concatenated into a single numpy array.
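Because the download is large, it may help to open the array lazily rather than read all of it into memory. The sketch below assumes the features are stored in NumPy's `.npy` format; the file name `logmfb_demo.npy` and the array shape are made up for the demonstration and are not taken from this repository.

```python
import os
import tempfile

import numpy as np


def open_features(path):
    """Open a .npy file as a read-only memory map.

    Data is read from disk only for the slices that are actually accessed.
    """
    return np.load(path, mmap_mode="r")


# Tiny stand-in array so the example runs without the real download.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "logmfb_demo.npy")  # hypothetical file name
    np.save(path, np.arange(24, dtype=np.float32).reshape(6, 4))

    features = open_features(path)
    shape = features.shape
    # Copy only the rows needed for one batch into regular memory.
    batch = np.array(features[2:4])
    # Release the memory map before the temporary directory is removed.
    del features
```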

To train the CNN, run:

python3 train_CNN.py

To train the NHNN, run:

python3 train_NHNN.py *version*

Here, version must be either FC or FC+Conv.

Note that the current NHNN implementation requires a feature encoder. You must therefore train the CNN first; the resulting feature encoder is then used to train the NHNN model.
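The required ordering can be captured in a small driver script. This is only a sketch: the helper `training_commands` is hypothetical and not part of this repository, and it assumes both training scripts are run from the repository root with no arguments beyond those shown above.

```python
import subprocess
import sys

# The two NHNN versions named in this README.
VERSIONS = ("FC", "FC+Conv")


def training_commands(version, python=sys.executable):
    """Return the two training commands in the order they must run.

    The CNN comes first because NHNN training relies on its feature encoder.
    """
    if version not in VERSIONS:
        raise ValueError(f"version must be one of {VERSIONS}, got {version!r}")
    return [
        [python, "train_CNN.py"],
        [python, "train_NHNN.py", version],
    ]


if __name__ == "__main__" and len(sys.argv) == 2:
    for command in training_commands(sys.argv[1]):
        # check=True stops the pipeline if a training step fails.
        subprocess.run(command, check=True)
```

Saved under a name of your choice (for example `run_all.py`, a hypothetical name), it would be invoked as `python3 run_all.py FC+Conv`.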

Please feel free to email [email protected] with any questions.
