x4nth055 / emotion-recognition-using-speech

Building and training a Speech Emotion Recognizer that predicts human emotions using Python, scikit-learn and Keras



Speech Emotion Recognition

Introduction

  • This repository handles building and training a Speech Emotion Recognition system.
  • The basic idea behind this tool is to build and train/test a suitable machine learning (as well as deep learning) algorithm that can recognize and detect human emotions from speech.
  • This is useful in many industry fields, such as product recommendations, affective computing, etc.
  • Check this tutorial for more information.

Emotions available

There are 3 emotions available: “neutral”, “happy”, and “sad”.

Feature Extraction

Feature extraction is the main part of the speech emotion recognition system. It is accomplished by converting the speech waveform into a parametric representation at a relatively lower data rate.

In this repository, we use the most commonly used features available in the librosa library, including:

  • MFCC
  • Chromagram
  • MEL Spectrogram Frequency (mel)
  • Contrast
  • Tonnetz (tonal centroid features)
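A minimal sketch of how these features could be extracted with librosa and stacked into a single feature vector per audio file; the `extract_features` helper and its parameters are illustrative assumptions, not the repository's exact code:

```python
import numpy as np
import librosa

def extract_features(file_path):
    """Extract MFCC, chroma, mel, contrast and tonnetz features,
    averaged over time, and concatenate them into one vector."""
    X, sr = librosa.load(file_path, sr=None)
    stft = np.abs(librosa.stft(X))

    mfccs = np.mean(librosa.feature.mfcc(y=X, sr=sr, n_mfcc=40).T, axis=0)
    chroma = np.mean(librosa.feature.chroma_stft(S=stft, sr=sr).T, axis=0)
    mel = np.mean(librosa.feature.melspectrogram(y=X, sr=sr).T, axis=0)
    contrast = np.mean(librosa.feature.spectral_contrast(S=stft, sr=sr).T, axis=0)
    tonnetz = np.mean(librosa.feature.tonnetz(y=librosa.effects.harmonic(X), sr=sr).T, axis=0)

    return np.hstack([mfccs, chroma, mel, contrast, tonnetz])
```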

Example output (predicted probability for each emotion):

{'happy': 0.8502438, 'sad': 1.15252915e-05, 'neutral': 8.986728e-05}
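With a scikit-learn classifier that supports probability estimates, a dictionary of this shape can be produced as sketched below; the `model` variable and the `extract_features` helper from the previous sketch are assumptions, not the repository's exact API:

```python
# `model` is assumed to be a fitted scikit-learn classifier,
# e.g. SVC(probability=True), trained on the extracted features.
features = extract_features("test_audio.wav").reshape(1, -1)
probabilities = model.predict_proba(features)[0]

# Map each class label to its probability, e.g. {'happy': 0.85, ...}
result = dict(zip(model.classes_, probabilities))
print(result)
```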

Algorithms Used

This repository can be used to build machine learning classifiers as well as regressors; a minimal training sketch follows the list below.

Classifiers/regressors:

  • SVC
  • RandomForestClassifier
  • GradientBoostingClassifier
  • KNeighborsClassifier
  • MLPClassifier
  • BaggingClassifier
  • Recurrent Neural Networks (Keras)
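As a rough illustration, here is how one of the scikit-learn classifiers listed above could be trained and evaluated on the extracted features. The data layout (`data/<emotion>/*.wav`) and the reuse of the `extract_features` helper from the earlier sketch are assumptions for illustration, not the repository's actual structure:

```python
import glob
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Hypothetical layout: wav files stored under data/<emotion>/*.wav
X, y = [], []
for emotion in ("neutral", "happy", "sad"):
    for path in glob.glob(f"data/{emotion}/*.wav"):
        X.append(extract_features(path))   # helper sketched above
        y.append(emotion)
X, y = np.array(X), np.array(y)

# Hold out a test split, train an MLP classifier and report accuracy
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
model = MLPClassifier(hidden_layer_sizes=(300,), max_iter=500)
model.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```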