Eggnog mapper
Fast genome-wide functional annotation through orthology assignment
The project is written primarily in Python, distributed under the GNU Affero General Public License v3.0, and was first published in 2015. Key topics include: annotations, functional-annotation, genomics, orthology-assignments.
eggNOG-mapper v3
> **Warning**: v3 is currently under heavy testing and has not been officially released.
> For production use, install the stable v2 release: `pip install eggnog-mapper==2.1.15`
> (see [v2 branch](https://github.com/eggnogdb/eggnog-mapper/tree/v2)).
eggNOG-mapper is a tool for fast functional annotation of novel sequences using
precomputed orthologous groups and phylogenies from the
eggNOG database.
Functional information is transferred exclusively from fine-grained orthologs,
yielding higher precision than homology-based approaches (e.g. BLAST) by avoiding
annotation transfer from close paralogs.
Common uses include annotation of novel genomes, transcriptomes, and metagenomic
gene catalogs.
eggNOG-mapper is also available as a public web server: http://mapper.eggnogdb.org
What's new in v3
v3 is a major release that targets the eggNOG v7 database and introduces a
completely redesigned annotation engine.
- eggNOG v7 database with integer-encoded orthology, phylogeny-aware speciation
  events, and ~12M proteins across ~10k taxa. eggNOG v5 databases are no longer
  supported.
- Curated-only functional donors: only manually curated functional terms
  (from SwissProt and equivalent curated sources) are used as annotation donors.
  This stops the propagation of misannotations inherited from automated pipelines.
  Despite the stricter source requirements, v3 achieves better annotation coverage
  than v2.
- Per-seed taxonomic ceiling replaces the old `--tax_scope` predefined
  scope lists. Each query seed gets its own `ev_lca`-based ceiling, automatically
  narrowed to the most informative phylogenetic level (`--tax_scope auto`,
  default). Fixed clades (`Metazoa`, `33208`, etc.) are still accepted.
- Cascade annotation engine: for each functional source (GO, KEGG, Pfam,
  EC, ...) donors are walked from closest and best-typed first, with the seed's own
  curated annotation as the strongest tier-0 donor.
- No bundled binaries: DIAMOND, HMMER, MMseqs2, and Prodigal must be
  installed externally (see Requirements below). The wheel shrinks from ~150 MB
  to ~5 MB and cross-platform installs (macOS, Windows) now work.
- Compressed input: gzip and bzip2 FASTA inputs are autodetected by magic
  bytes.
- Parallel annotation: `--cpu N` parallelises both search and annotation.
- Cython-accelerated inner loops: the `_codec` and `_collect_inner` extensions
  give ~2–3× speedup on the annotation phase.
- `--resume`: safely resumes an interrupted run, reusing the existing hits
  file.
- Apptainer/Singularity image: a self-contained HPC image is provided via
  `apptainer/build.sh`.
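To make the "autodetected by magic bytes" point concrete, the sketch below shows one common way to recognise gzip and bzip2 files from their leading bytes rather than from the file extension. It is an illustration only, not eggNOG-mapper's actual code: the function names `sniff_compression` and `open_fasta` are hypothetical.

```python
import bz2
import gzip

# Standard file signatures: gzip streams start with 0x1f 0x8b,
# bzip2 streams start with the ASCII bytes "BZh".
GZIP_MAGIC = b"\x1f\x8b"
BZIP2_MAGIC = b"BZh"


def sniff_compression(path):
    """Return 'gzip', 'bzip2', or 'plain' based on the file's first bytes."""
    with open(path, "rb") as handle:
        head = handle.read(3)
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head.startswith(BZIP2_MAGIC):
        return "bzip2"
    return "plain"


def open_fasta(path):
    """Open a FASTA file in text mode, decompressing transparently if needed."""
    kind = sniff_compression(path)
    if kind == "gzip":
        return gzip.open(path, "rt")
    if kind == "bzip2":
        return bz2.open(path, "rt")
    return open(path, "r")
```

Because the check reads the content rather than the name, a gzipped file called `proteins.fa` is still handled correctly.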
Requirements
- Python ≥ 3.9
- At least one search backend:
| Tool | Install |
|---|---|
| DIAMOND | conda install -c bioconda diamond |
| HMMER | conda install -c bioconda hmmer |
| MMseqs2 | conda install -c bioconda mmseqs2 |
| Prodigal | conda install -c bioconda prodigal (gene prediction only) |
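Since the binaries are no longer bundled, it can be useful to confirm that at least one search backend is on `PATH` before starting a long run. The following is a minimal sketch using the standard library; the executable names (`diamond`, `hmmsearch`, `mmseqs`, `prodigal`) are assumptions based on how these tools are usually installed, and `check_requirements` is a hypothetical helper, not part of eggNOG-mapper.

```python
import shutil

# Assumed executable names for each tool; adjust if your install differs.
SEARCH_BACKENDS = {"DIAMOND": "diamond", "HMMER": "hmmsearch", "MMseqs2": "mmseqs"}
OPTIONAL_TOOLS = {"Prodigal": "prodigal"}  # needed for gene prediction only


def find_tools(tools, which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: which(exe) for name, exe in tools.items()}


def check_requirements(which=shutil.which):
    """Return the sorted names of available search backends.

    Raises RuntimeError when no backend at all can be found.
    """
    found = find_tools(SEARCH_BACKENDS, which)
    available = sorted(name for name, path in found.items() if path)
    if not available:
        raise RuntimeError(
            "No search backend found on PATH; install DIAMOND, HMMER, or MMseqs2."
        )
    return available


if __name__ == "__main__":
    print("Search backends:", check_requirements())
    print("Optional tools:", find_tools(OPTIONAL_TOOLS))
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is what you want.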
Installation
```bash
pip install eggnog-mapper
```
Or from source:
```bash
git clone https://github.com/eggnogdb/eggnog-mapper.git
cd eggnog-mapper
pip install .
```
Download the eggNOG v7 database
```bash
download_eggnog_data.py --data_dir /path/to/eggnog-data
```
Quick start
```bash
# Protein sequences against eggNOG v7 using DIAMOND
emapper.py -m diamond -i proteins.fa --itype proteins \
    --data_dir /path/to/eggnog-data \
    -o my_annotation --output_dir results/ --cpu 20

# Two-step: search first, annotate later
emapper.py -m diamond -i proteins.fa --itype proteins \
    --data_dir /path/to/eggnog-data \
    -o my_annotation --output_dir results/ --no_annot --cpu 20

emapper.py -m no_search --annotate_hits_table results/my_annotation.emapper.seed_orthologs \
    --data_dir /path/to/eggnog-data \
    -o my_annotation --output_dir results/
```
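For scripted pipelines, the two-step workflow from the quick start can be assembled in Python and handed to `subprocess.run`. The sketch below only builds the argument lists, using exactly the flags shown in the quick start; the helper names (`search_cmd`, `annotate_cmd`, `hits_table_path`) are hypothetical, and the hits-file name simply follows the `<prefix>.emapper.seed_orthologs` pattern seen there.

```python
import os
import shlex


def search_cmd(fasta, data_dir, prefix, out_dir, cpu=1, annotate=True):
    """Build the DIAMOND search command; add --no_annot to defer annotation."""
    cmd = [
        "emapper.py", "-m", "diamond", "-i", fasta, "--itype", "proteins",
        "--data_dir", data_dir,
        "-o", prefix, "--output_dir", out_dir, "--cpu", str(cpu),
    ]
    if not annotate:
        cmd.append("--no_annot")
    return cmd


def hits_table_path(prefix, out_dir):
    """Path of the seed orthologs file, following the quick-start naming."""
    return os.path.join(out_dir, prefix + ".emapper.seed_orthologs")


def annotate_cmd(data_dir, prefix, out_dir):
    """Build the annotation-only command that reuses an existing hits table."""
    return [
        "emapper.py", "-m", "no_search",
        "--annotate_hits_table", hits_table_path(prefix, out_dir),
        "--data_dir", data_dir,
        "-o", prefix, "--output_dir", out_dir,
    ]


if __name__ == "__main__":
    args = ("/path/to/eggnog-data", "my_annotation", "results/")
    # Print the commands; pass each list to subprocess.run(cmd, check=True) to execute.
    print(shlex.join(search_cmd("proteins.fa", *args, cpu=20, annotate=False)))
    print(shlex.join(annotate_cmd(*args)))
```

Building the command as a list rather than a shell string avoids quoting problems when paths contain spaces.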
Documentation
https://github.com/eggnogdb/eggnog-mapper/wiki
Citation
If you use eggNOG-mapper, please cite:
[1] eggNOG-mapper v2: functional annotation, orthology assignments, and domain
prediction at the metagenomic scale. Carlos P. Cantalapiedra,
Ana Hernández-Plaza, Ivica Letunic, Peer Bork, Jaime Huerta-Cepas. 2021.
Molecular Biology and Evolution, msab293, https://doi.org/10.1093/molbev/msab293
[2] eggNOG v7: phylogeny-based orthology predictions and functional annotations.
Ana Hernández-Plaza, Ziqi Deng, Fabian Robledo-Yagüe, Damian Szklarczyk,
Christian von Mering, Peer Bork, Jaime Huerta-Cepas. Nucleic Acids Research,
Volume 54, Issue D1, 6 January 2026, Pages D402-D408.
https://doi.org/10.1093/nar/gkaf1249
Please also cite the search tool used:
[DIAMOND] Sensitive protein alignments at tree-of-life scale using DIAMOND.
Buchfink B, Reuter K, Drost HG. 2021.
Nature Methods 18, 366–368. https://doi.org/10.1038/s41592-021-01101-x
[HMMER] Accelerated Profile HMM Searches.
Eddy SR. 2011. PLoS Comput. Biol. 7:e1002195.
[MMSEQS2] MMseqs2 enables sensitive protein sequence searching for the analysis
of massive data sets. Steinegger M & Söding J. 2017.
Nat. Biotech. 35, 1026–1028. https://doi.org/10.1038/nbt.3988
[PRODIGAL] Prodigal: prokaryotic gene recognition and translation initiation
site identification. Hyatt et al. 2010.
BMC Bioinformatics 11, 119. https://doi.org/10.1186/1471-2105-11-119
Legacy v2 (eggNOG v5)
If you are working with eggNOG v5 databases, use the
[v2 branch](https://github.com/eggnogdb/eggnog-mapper/tree/v2)
or install the last v2 release from PyPI:

```bash
pip install eggnog-mapper==2.1.15
```

v2 and v3 databases are not interchangeable. v3 only works with eggNOG v7.
