Speech Recognition Collection

Repositories tagged with "speech-recognition"

⭐ 161.1k

transformers

huggingface

Python, audio, deep-learning

“🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.”

whisper.cpp

ggml-org

C++, inference, openai

“Port of OpenAI's Whisper model in C/C++”

DeepSpeech

mozilla

C++, deep-learning, deepspeech

“DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.”

faster-whisper

SYSTRAN

Python, deep-learning, inference

“Faster Whisper transcription with CTranslate2”

whisperX

m-bain

Python, asr, speech

“WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)”

leon

leon-ai

TypeScript, ai, ai-agent

“🧠 Leon is your open-source personal assistant.”

FunASR

modelscope

Python, asr, audio

“Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.”

kaldi

kaldi-asr

Shell, c-plus-plus, cuda

“kaldi-asr/kaldi is the official location of the Kaldi project.”

DeepLearningExamples

NVIDIA

Jupyter Notebook, computer-vision, deep-learning

“State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.”

vosk-api

alphacep

Jupyter Notebook, android, asr

“Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node”

deep-learning-drizzle

kmario23

HTML, artificial-intelligence-algorithms, artificial-neural-networks

“Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!”

PaddleSpeech

PaddlePaddle

Python, asr, code-switch

“Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.”

speechbrain

“A PyTorch-based Speech Toolkit”

voice-pro

abus-aikorea

Python, audiobook, faster-whisper

“Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.”