OpenSeq2Seq
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
OpenSeq2Seq is written primarily in Python, distributed under the Apache License 2.0, and was first published in 2017. The GitHub repository has 1,561 stars and 369 forks. Key topics include: deep-learning, float16, language-model, mixed-precision, multi-gpu.
OpenSeq2Seq: toolkit for distributed and mixed precision training of sequence-to-sequence models
OpenSeq2Seq's main goal is to let researchers explore various
sequence-to-sequence models as effectively as possible. Efficiency comes from
full support for distributed and mixed-precision training.
OpenSeq2Seq is built on TensorFlow and provides the building blocks needed
to train encoder-decoder models for neural machine translation, automatic speech recognition, speech synthesis, and language modeling.
Documentation and installation instructions
https://nvidia.github.io/OpenSeq2Seq/
Features
- Models for:
  - Neural Machine Translation
  - Automatic Speech Recognition
  - Speech Synthesis
  - Language Modeling
  - NLP tasks (sentiment analysis)
- Data-parallel distributed training
  - Multi-GPU
  - Multi-node
- Mixed precision training for NVIDIA Volta/Turing GPUs
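As background for the mixed precision feature, the sketch below illustrates why float16 training is commonly paired with loss scaling: very small gradient values round to zero in half precision, but survive if they are multiplied by a constant first and divided back out in higher precision. This is a general illustration using only the Python standard library, not OpenSeq2Seq code; the helper `to_float16`, the constant `LOSS_SCALE`, and the example gradient value are assumptions chosen for the demonstration.

```python
import struct


def to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The "e" struct format is the 16-bit binary16 floating-point type.
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical values, chosen only to make the effect visible.
LOSS_SCALE = 1024.0
tiny_gradient = 1e-8  # below the smallest positive float16 value

# Stored directly in float16, the gradient underflows to zero.
unscaled = to_float16(tiny_gradient)

# Scaled before the float16 round trip, it stays representable.
scaled = to_float16(tiny_gradient * LOSS_SCALE)

# Unscaling happens in higher precision (a Python float here).
recovered = scaled / LOSS_SCALE

print("float16 without scaling:", unscaled)
print("float16 with scaling, then unscaled:", recovered)
```

The recovered value is close to the original gradient but not identical, since the scaled value is still rounded to the nearest float16 number.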
Software Requirements
- Python >= 3.5
- TensorFlow >= 1.10
- CUDA >= 9.0, cuDNN >= 7.0
- Horovod >= 0.13 (using Horovod is not required, but is highly recommended for multi-GPU setup)
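The minimum versions above can be checked programmatically before installing. The following is a minimal sketch, not part of OpenSeq2Seq: the names `MINIMUMS`, `MIN_PYTHON`, `parse_version`, and `meets_minimum` are hypothetical helpers, and the comparison uses only the leading numeric part of each version component. Comparing parsed tuples rather than raw strings matters because, as strings, "1.9" sorts after "1.10".

```python
import re
import sys

# Minimum versions taken from the requirements list above.
MINIMUMS = {"tensorflow": "1.10", "horovod": "0.13"}
MIN_PYTHON = (3, 5)


def parse_version(text):
    """Turn '1.15.0rc1' into (1, 15, 0), keeping leading digits per component."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a = parse_version(installed)
    b = parse_version(minimum)
    # Pad with zeros so that '1.10' and '1.10.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# The Python interpreter itself can be checked directly.
print("Python OK:", sys.version_info[:2] >= MIN_PYTHON)

# Example version strings; in practice the installed version could be read
# from the package, for instance tensorflow.__version__.
print("TensorFlow 1.9.0 OK:", meets_minimum("1.9.0", MINIMUMS["tensorflow"]))
print("TensorFlow 1.12.0 OK:", meets_minimum("1.12.0", MINIMUMS["tensorflow"]))
```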
Acknowledgments
The speech-to-text workflow uses parts of the Mozilla DeepSpeech project.
The beam search decoder with language model re-scoring (in decoders) is based on Baidu DeepSpeech.
The text-to-text workflow uses some functions from Tensor2Tensor and the Neural Machine Translation (seq2seq) Tutorial.
Disclaimer
This is a research project, not an official NVIDIA product.
Related resources
- Tensor2Tensor
- Neural Machine Translation (seq2seq) Tutorial
- OpenNMT
- Neural Monkey
- Sockeye
- TF-seq2seq
- Moses
Paper
If you use OpenSeq2Seq, please cite this paper:
@misc{openseq2seq,
  title={Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq},
  author={Oleksii Kuchaiev and Boris Ginsburg and Igor Gitman and Vitaly Lavrukhin and Jason Li and Huyen Nguyen and Carl Case and Paulius Micikevicius},
  year={2018},
  eprint={1805.10387},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}