Speech Recognition Collection
Repositories tagged with "speech-recognition"
transformers
huggingface
“🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.”
whisper.cpp
ggml-org
“Port of OpenAI's Whisper model in C/C++”
DeepSpeech
mozilla
“DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.”
faster-whisper
SYSTRAN
“Faster Whisper transcription with CTranslate2”
whisperX
m-bain
“WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)”
leon
leon-ai
“🧠 Leon is your open-source personal assistant.”
FunASR
modelscope
“Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.”
kaldi
kaldi-asr
“kaldi-asr/kaldi is the official location of the Kaldi project.”
DeepLearningExamples
NVIDIA
“State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.”
vosk-api
alphacep
“Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node”
deep-learning-drizzle
kmario23
“Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!”
PaddleSpeech
PaddlePaddle
“Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.”
speechbrain
speechbrain
“A PyTorch-based Speech Toolkit”
voice-pro
abus-aikorea
“Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.”
openvino
openvinotoolkit
“OpenVINO™ is an open source toolkit for optimizing and deploying AI inference”
espnet
espnet
“End-to-End Speech Processing Toolkit”
speech_recognition
Uberi
“Speech recognition module for Python, supporting several engines and APIs, online and offline.”
ASRT_SpeechRecognition
nl8590687
“A Deep-Learning-Based Chinese Speech Recognition System”