Gitpedia

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

From NVIDIA-NeMo · Updated May 30, 2026

Check out our [HuggingFace🤗 collection](https://huggingface.co/collections/nvidia/nemotron-speech) for the latest open-weight checkpoints and demos! The project is written primarily in Python, is distributed under the Apache License 2.0, and was first published in 2019. It has 17,274 stars and 3,419 forks on GitHub. Key topics include asr, deeplearning, generative-ai, machine-translation, and neural-networks.

Latest release: v2.7.3 (NVIDIA Neural Modules 2.7.3), April 23, 2026

Project Status: Active -- The project has reached a stable, usable state and is being actively developed.

NVIDIA NeMo Speech

Updates

The first release of NeMo Speech after the NeMo repository split is scheduled for June 2026, while the repo is being restructured.
For the latest stable release, please use the 26.02 NGC container.

  • 2026-04: Parakeet-unified-en-0.6b has been released. A single model provides high-quality offline and streaming
    inference (minimum latency of 160 ms) for English, with punctuation and capitalization support.
  • 2026-03: Nemotron 3 VoiceChat has been released in Early Access. Built on the Nemotron Nano v2 LLM backbone with
    Nemotron speech and TTS decoder, VoiceChat delivers full-duplex, natural, interruptible conversations with low
    latency. Try out the demo and apply for early access.
  • 2026-03: Nemotron-Speech-Streaming v2603 has been updated. It has been trained on a larger and more diverse corpus,
    resulting in lower WER across all latency modes. Try out the demo and check out the NIM.
  • 2026-03: MagpieTTS v2602 has been released with support for 9 languages (En, Es, De, Fr, Vi, It, Zh, Hi, Ja).
    Try out the demo and check out the NIM.
  • 2026-01: Nemotron-Speech-Streaming was released: one checkpoint that lets users pick their optimal point on the
    latency-accuracy Pareto curve!
  • 2026-01: MagpieTTS was released.
  • 2026: This repo has pivoted to focus on audio, speech, and multimodal LLMs. For the last NeMo release with support
    for more modalities, see v2.7.0.
  • 2025-08: Parakeet V3 and Canary V2 have been released with speech recognition and translation support for
    25 European languages.
  • 2025-06: Canary-Qwen-2.5B has been released with a record-setting 5.63% WER on the English Open ASR Leaderboard.

Introduction

NVIDIA NeMo Speech is built for researchers and PyTorch developers working on speech models, including Automatic Speech
Recognition (ASR), Text-to-Speech (TTS), and Speech LLMs. It is designed to help you efficiently create, customize, and
deploy new AI models by leveraging existing code and pre-trained model checkpoints.

For technical documentation, please see the
NeMo Framework User Guide.

Requirements

  • Python 3.12 or above
  • PyTorch 2.6 or above
  • NVIDIA GPU (if you intend to do model training)
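
The version requirements above can be checked before installing anything heavy. The following is a minimal sketch, not part of NeMo: the helper names `parse_release`, `meets_minimum`, and `check_environment` are hypothetical, and the script assumes PyTorch is installed under the distribution name `torch`. It reads the installed version from package metadata, so it does not need to import PyTorch.

```python
import re
import sys
from importlib import metadata
from typing import List, Tuple

MIN_PYTHON = (3, 12)  # from the requirements list
MIN_TORCH = (2, 6)    # from the requirements list


def parse_release(version: str) -> Tuple[int, ...]:
    """Return the leading numeric components, e.g. '2.6.0+cu124' -> (2, 6, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version: str, minimum: Tuple[int, ...]) -> bool:
    """True if the parsed version is at least the given minimum."""
    return parse_release(version) >= minimum


def check_environment() -> List[str]:
    """Collect human-readable problems; an empty list means both checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.12 or above is required")
    try:
        torch_version = metadata.version("torch")  # assumed distribution name
    except metadata.PackageNotFoundError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch_version, MIN_TORCH):
            problems.append(f"PyTorch 2.6 or above is required, found {torch_version}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

The GPU requirement is left out on purpose, since it applies only to training and checking it would require importing PyTorch.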

As of PyTorch 2.6, torch.load defaults to weights_only=True. Some model checkpoints may require weights_only=False.
In that case, set the environment variable TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD=1 before running code that calls torch.load.
Do this only with trusted files: loading more than weights from an untrusted source risks arbitrary code execution.
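
One way to keep this override narrow is to set the variable only around the specific load that needs it. The sketch below is an illustration, not NeMo or PyTorch API: `allow_full_unpickling` is a hypothetical helper, and it assumes torch.load reads the environment variable at call time, so setting it inside the process just before the call is sufficient.

```python
import os
from contextlib import contextmanager

ENV_NAME = "TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD"


@contextmanager
def allow_full_unpickling():
    """Temporarily set TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD=1, then restore the old value."""
    previous = os.environ.get(ENV_NAME)
    os.environ[ENV_NAME] = "1"
    try:
        yield
    finally:
        # Restore the prior state so later loads use the safe default again.
        if previous is None:
            os.environ.pop(ENV_NAME, None)
        else:
            os.environ[ENV_NAME] = previous


# Usage (only with a checkpoint you trust; the file name is a placeholder):
# import torch
# with allow_full_unpickling():
#     state = torch.load("trusted_checkpoint.ckpt")
```

Scoping the override this way means any other torch.load call in the same process still runs with weights_only=True.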

Developer Documentation

Version   Status                 Description
Latest    Documentation Status   Documentation of the latest (i.e. main) branch.
Stable    Documentation Status   Documentation of the stable (i.e. most recent release) branch. To be added.

Install NeMo Speech

NeMo Speech is installable via pip: pip install 'nemo-toolkit[all]'
To install with extra dependencies for CUDA 12.x or 13.x, use pip install 'nemo-toolkit[all,cu12]'
or pip install 'nemo-toolkit[all,cu13]' respectively.
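
For scripted setups, the three install variants above differ only in the extras list. The sketch below builds the matching pip command from a CUDA major version. The helper names `nemo_requirement` and `pip_install_command` are hypothetical, and the script only prints the command without running it.

```python
import sys
from typing import List, Optional

# Extras named in the install instructions, keyed by CUDA major version.
CUDA_EXTRAS = {12: "cu12", 13: "cu13"}


def nemo_requirement(cuda_major: Optional[int] = None) -> str:
    """Return the requirement string, e.g. 'nemo-toolkit[all,cu12]' for CUDA 12.x."""
    extras = ["all"]
    if cuda_major is not None:
        if cuda_major not in CUDA_EXTRAS:
            raise ValueError(f"no CUDA extra listed for major version {cuda_major}")
        extras.append(CUDA_EXTRAS[cuda_major])
    return f"nemo-toolkit[{','.join(extras)}]"


def pip_install_command(cuda_major: Optional[int] = None) -> List[str]:
    """Argument list for the current interpreter's pip; no shell quoting is needed."""
    return [sys.executable, "-m", "pip", "install", nemo_requirement(cuda_major)]


if __name__ == "__main__":
    print(" ".join(pip_install_command(12)))
```

Passing the argument list to `subprocess.run` would perform the install. Because the list bypasses the shell, the square brackets need no quoting there, unlike in the shell commands above.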

Contribute to NeMo

We welcome community contributions! Please refer to
CONTRIBUTING.md for the process.

Licenses

NeMo is licensed under the Apache License 2.0.

This article is auto-generated from NVIDIA-NeMo/NeMo via the GitHub API. Last fetched: 5/31/2026