x4nth055 / emotion-recognition-using-speech

Building and training a Speech Emotion Recognizer that predicts human emotions using Python, scikit-learn and Keras

Speech Emotion Recognition

Introduction

  • This repository covers building and training a Speech Emotion Recognition system.
  • The basic idea behind this tool is to build, train and test a suitable machine learning (or deep learning) algorithm that can recognize and detect human emotions from speech.
  • This is useful in many industry fields, such as product recommendation and affective computing.
  • Check this tutorial for more information.

Emotions available

There are 3 emotions available: “neutral”, “happy” and “sad”.

Feature Extraction

Feature extraction is the main part of the speech emotion recognition system. It is basically accomplished by converting the speech waveform into a parametric representation at a relatively lower data rate.

In this repository, we use the most common features available in the librosa library, including:

  • MFCC
  • Chromagram
  • MEL Spectrogram Frequency (mel)
  • Contrast
  • Tonnetz (tonal centroid features)
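As a rough illustration of how these five features might be combined into one fixed-length vector per clip, here is a minimal sketch. It assumes librosa and NumPy are installed. The function names `extract_features` and `pool_over_time`, the choice of 40 MFCCs, and the averaging over time frames are illustrative assumptions, not necessarily what this repository does.

```python
import numpy as np


def pool_over_time(feature_matrices):
    """Average each (n_coeffs, n_frames) matrix over frames, then concatenate."""
    return np.concatenate([np.asarray(m).mean(axis=1) for m in feature_matrices])


def extract_features(y, sr, n_mfcc=40):
    """Return one fixed-length feature vector for a mono waveform `y`.

    Hypothetical helper: the feature set follows the list above, but the
    parameters (n_mfcc=40, librosa defaults elsewhere) are assumptions.
    """
    # Imported lazily so pool_over_time stays usable without librosa.
    import librosa

    # Magnitude spectrogram, reused by the chroma and contrast features.
    stft = np.abs(librosa.stft(y))

    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # chroma_stft expects a power spectrogram, hence the square.
    chroma = librosa.feature.chroma_stft(S=stft**2, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr)
    contrast = librosa.feature.spectral_contrast(S=stft, sr=sr)
    # Tonnetz is computed on the harmonic component of the signal.
    tonnetz = librosa.feature.tonnetz(y=librosa.effects.harmonic(y), sr=sr)

    return pool_over_time([mfcc, chroma, mel, contrast, tonnetz])
```

A waveform can be obtained with `librosa.load` on any audio file and passed in as `y` and `sr`. With librosa's default settings and `n_mfcc=40`, the resulting vector should have 40 + 12 + 128 + 7 + 6 = 193 entries.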

Output:

{'happy': 0.8502438, 'sad': 1.15252915e-05, 'neutral': 8.986728e-05}
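The dictionary above maps each emotion to a score. Turning it into a single predicted label could look like the following sketch, where `top_emotion` is a hypothetical helper and not part of this repository.

```python
def top_emotion(scores):
    """Return the (label, score) pair with the highest score."""
    label = max(scores, key=scores.get)
    return label, scores[label]


# Scores copied from the example output above.
scores = {"happy": 0.8502438, "sad": 1.15252915e-05, "neutral": 8.986728e-05}
label, score = top_emotion(scores)
print(f"{label} ({score:.2f})")  # prints "happy (0.85)"
```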

Algorithms Used

This repository can be used to build machine learning classifiers as well as regressors.

Classifiers/regressors:

  • SVC
  • RandomForestClassifier
  • GradientBoostingClassifier
  • KNeighborsClassifier
  • MLPClassifier
  • BaggingClassifier
  • Recurrent Neural Networks (Keras)
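To give a feel for how some of the scikit-learn classifiers listed above could be trained and compared, here is a minimal sketch. It uses synthetic data from `make_classification` as a stand-in for real audio features, with 3 classes to mirror the 3 emotions. The data, the default hyperparameters, and the variable names are all assumptions for illustration and say nothing about the accuracy this repository achieves.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for extracted audio features: 3 classes, 20 features.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A subset of the classifiers listed above, with default settings.
models = {
    "SVC": SVC(),
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
    "KNeighborsClassifier": KNeighborsClassifier(),
}

# Fit each model and record its accuracy on the held-out split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)

for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
```

Because every scikit-learn estimator exposes the same `fit` and `score` methods, the other listed classifiers (for example `GradientBoostingClassifier` or `MLPClassifier`) can be added to the `models` dictionary without changing the loop.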