MFCC Collection
Repositories tagged with "mfcc"
numpy-ml
ddbourgin
“Machine learning, in numpy”
aubio
aubio
“a library for audio and music analysis”
audioFlux
libAudioFlux
“A library for audio and music analysis, feature extraction.”
emotion-recognition-using-speech
x4nth055
“Building and training a Speech Emotion Recognizer that predicts human emotions using Python, scikit-learn, and Keras”
NWaves
ar1st0crat
“.NET DSP library with a lot of audio processing functions”
spafe
SuperKogito
“spafe: Simplified Python Audio Features Extraction”
Gist
adamstark
“A C++ Library for Audio Analysis”
Speech_Signal_Processing_and_Classification
gionanide
“Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification, that is, in developing two-class classifiers that can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of the speech production system in humans suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) constitute a first choice for modeling the magnitude of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the system (e.g., vocal tract) contribution and that of the excitation. Taking into account the characteristics of the human ear, the mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, the perceptual linear prediction coefficients (PLPs) could also be derived. The aforementioned, so to speak, traditional features will be tested against agnostic features extracted by convolutional neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian Mixture Model based classifiers, K-nearest neighbor classifiers, Bayes classifiers, as well as Deep Neural Networks. The Massachusetts Eye and Ear Infirmary Dataset (MEEI-Dataset) [5] will be exploited. At the application level, a library for feature extraction and classification in Python will be developed. Credible publicly available resources, such as KALDI, will be used toward achieving our goal. Comparisons will be made against [6-8].”
SPTK
sp-nitech
“A suite of speech signal processing tools”
LibrosaCpp
ewan-xu
“LibrosaCpp is a C++ implementation of librosa to compute short-time Fourier transform coefficients, mel spectrogram, or MFCC”
pyAudioProcessing
jsingh811
“Audio feature extraction and classification”
Voice-based-gender-recognition
SuperKogito
“Voice based gender recognition using Mel-frequency cepstrum coefficients (MFCC) and Gaussian mixture models (GMM)”
kaldifeat
csukuangfj
“Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd. Provides C++ & Python APIs”
diffsptk
sp-nitech
“A differentiable version of SPTK”
MevonAI-Speech-Emotion-Recognition
SuyashMore
“Identify the emotion of multiple speakers in an Audio Segment”
subsync
tympanix
“Synchronize your subtitles using machine learning”
speech-emotion-recognition
amanbasu
“Detecting emotions from MFCC features of human speech using Deep Learning”
AcousticKeyBoard-Web
ZhuoZhuoCrayon
“Acoustic keyboard | A wild idea: build a ‘toy’ that can tell which key was pressed by listening to keyboard keystrokes; learn signal processing / deep learning / Android / Django.”
