Speech Emotion Recognition

Built two CNN models in TensorFlow to classify voice signals into one of six emotions, each using a different feature space: a 2D convolutional model on Mel spectrograms, and a 1D convolutional model on features extracted from the power and zero-crossing rate (ZCR) of the time-domain signal.
Built a joint model combining the two previous models, which outperformed each of them individually.
Achieved 50% accuracy and 0.48 F1-Score.
Learned about hyperparameter tuning and optimization.
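
As an illustration of the time-domain features mentioned above, the following is a minimal sketch of per-frame power and zero-crossing rate. The function name `frame_features`, the frame length, the hop size, and the synthetic test tone are all assumptions made for this example; the project's actual feature extraction is not described here and may differ.

```python
import math


def frame_features(signal, frame_len=400, hop=200):
    """Return a list of (mean power, zero-crossing rate) tuples, one per frame.

    frame_len and hop are illustrative values, not the project's settings.
    """
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Mean power: average of the squared samples in the frame.
        power = sum(x * x for x in frame) / frame_len
        # ZCR: fraction of adjacent sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)
        features.append((power, zcr))
    return features


# A synthetic 100 Hz sine sampled at 8 kHz stands in for a voice recording.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(sample_rate)]
feats = frame_features(tone)
```

Each frame yields one (power, ZCR) pair, so a recording becomes a sequence of feature vectors along the time axis, which is the kind of input a 1D convolution can slide over.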
