GitPedia

Bactmap

A mapping-based pipeline for creating a phylogeny from bacterial whole genome sequences

From nf-core · Updated May 22, 2026

**nf-core/bactmap** is a bioinformatics best-practice analysis pipeline for mapping short (Illumina) and long reads (Oxford Nanopore) from bacterial WGS to a reference sequence, creating filtered VCF files and making pseudogenomes based on high quality positions in the VCF files. The project is written primarily in Nextflow, is distributed under the MIT License, and was first published in 2019. Key topics include: bacteria, bacterial, bacterial-genome-analysis, genomics, mapping.

Latest release: nf-core/bactmap v1.0.0 - Aluminium Spider (June 18, 2021)

Introduction

nf-core/bactmap is a bioinformatics best-practice analysis pipeline for mapping short (Illumina) and long reads (Oxford Nanopore) from bacterial WGS to a reference sequence, creating filtered VCF files and making pseudogenomes based on high quality positions in the VCF files.

Pipeline summary

  1. Index reference fasta file (short-read: BWA index or Bowtie2 build; long-read: minimap2 index)
  2. Read QC (FastQC or falco as an alternative option)
  3. Calculate fastq summary statistics (fastq-scan)
  4. Perform read pre-processing (optional)
  5. Downsample fastq files (optional) (Rasusa)
  6. Summarise read statistics pre- and post-processing and subsampling (read_stats)
  7. Variant calling
  8. Create alignment from pseudogenomes by concatenating fasta files, having first checked that the sample sequences are high quality (alignpseudogenomes)
  9. Extract variant sites from alignment (SNP-sites)
  10. Present QC for raw and processed reads, alignment statistics and variant statistics (MultiQC)

Usage

[!NOTE]
If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

First, prepare a samplesheet with your input data that looks as follows:

```csv
sample,run_accession,instrument_platform,fastq_1,fastq_2
2612,run1,ILLUMINA,2612_run1_R1.fq.gz,
2613,run1,ILLUMINA,2612_run3_R1.fq.gz,2612_run3_R2.fq.gz
2614,run3,OXFORD_NANOPORE,2614_file1.fastq.gz,
2614,run3,OXFORD_NANOPORE,2614_file2.fastq.gz,
```

Each row represents a fastq file (single-end) or a pair of fastq files (paired end), either Illumina (short reads) or Oxford Nanopore (long reads).
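Before launching a run, it can help to sanity-check the samplesheet layout. The following is a minimal sketch, not part of nf-core/bactmap: `check_samplesheet` is a hypothetical helper, and it assumes the five-column header and the two platform values (`ILLUMINA`, `OXFORD_NANOPORE`) shown in the example above are the only ones in use. The pipeline's own validation may be stricter or accept more values.

```shell
# Write the example samplesheet to a file (the file name is an assumption).
cat > samplesheet.csv <<'EOF'
sample,run_accession,instrument_platform,fastq_1,fastq_2
2612,run1,ILLUMINA,2612_run1_R1.fq.gz,
2613,run1,ILLUMINA,2612_run3_R1.fq.gz,2612_run3_R2.fq.gz
2614,run3,OXFORD_NANOPORE,2614_file1.fastq.gz,
2614,run3,OXFORD_NANOPORE,2614_file2.fastq.gz,
EOF

# check_samplesheet is a hypothetical helper, not a pipeline command.
# It prints one message per problem and returns non-zero if any were found.
check_samplesheet() {
  awk -F',' '
    NR == 1 {
      # The header must match the column names used in the example.
      if ($0 != "sample,run_accession,instrument_platform,fastq_1,fastq_2") {
        print "unexpected header: " $0; bad = 1
      }
      next
    }
    # Single-end rows keep a trailing comma, so every row has 5 fields.
    NF != 5 { print "line " NR ": expected 5 fields, found " NF; bad = 1 }
    # Assumption: only the two platform values from the example are valid.
    $3 != "ILLUMINA" && $3 != "OXFORD_NANOPORE" {
      print "line " NR ": unknown platform " $3; bad = 1
    }
    # fastq_1 is needed for both single-end and paired-end rows.
    $4 == "" { print "line " NR ": fastq_1 is empty"; bad = 1 }
    END { exit bad + 0 }
  ' "$1"
}

check_samplesheet samplesheet.csv && echo "samplesheet looks well-formed"
```

The check only inspects the text of the CSV; it does not confirm that the fastq files exist on disk.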

Additionally, if you are analysing Oxford Nanopore data, you will need to provide the path to a model to use with Clair3 (specified with --clair3_model). Models for older chemistries and basecallers (e.g. r9.4.1) can be downloaded from here. For newer chemistries and basecallers, ONT provides models through Rerio. To download the Clair3 models from the ONT GitHub repository, you can use the following commands (each model will be downloaded to the folder clair3_models/<clair3_model_name>):

```bash
# Clone the rerio repository
git clone https://github.com/nanoporetech/rerio
# Download all models
python3 download_model.py --clair3
```

Now, you can run the pipeline using:

```bash
nextflow run nf-core/bactmap \
   -profile <docker/singularity/.../institute> \
   --input samplesheet.csv \
   --fasta <REFERENCE_FASTA> \
   --clair3_model <PATH_TO_CLAIR3_MODEL> \
   --outdir <OUTDIR>
```

[!WARNING]
Please provide pipeline parameters via the CLI or Nextflow -params-file option. Custom config files including those provided by the -c Nextflow option can be used to provide any configuration except for parameters; see docs.
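As an illustration of the `-params-file` route, the sketch below writes the same parameters used in the run command above to a YAML file and prints the matching launch command. The file name `params.yaml`, the `docker` profile and all values are placeholders chosen for this example, not defaults of the pipeline.

```shell
# Write pipeline parameters to a YAML file. Keys mirror the CLI flags
# (--input, --fasta, --clair3_model, --outdir); every value is a placeholder.
cat > params.yaml <<'EOF'
input: "samplesheet.csv"
fasta: "reference.fasta"
clair3_model: "clair3_models/<clair3_model_name>"
outdir: "results"
EOF

# Parameters now come from the file, so only Nextflow's own options
# (-profile, -params-file) remain on the command line.
cmd='nextflow run nf-core/bactmap -profile docker -params-file params.yaml'

# Print the command for review instead of launching it; paste it into a
# shell once Nextflow and a container engine are available.
echo "$cmd"
```

Keeping parameters in a file makes a run easier to repeat and to share than a long command line.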

For more details and further functionality, please refer to the usage documentation and the parameter documentation.

Pipeline output

To see the results of an example test run with a full-size dataset, refer to the results tab on the nf-core website pipeline page.
For more details about the output files and reports, please refer to the output documentation.

Credits

nf-core/bactmap was originally written by Anthony Underwood, Andries van Tonder and Thanh Le Viet.

We thank the following people for their extensive assistance in the development
of this pipeline:

Anthony Underwood's time working on the project was funded by the National Institute for Health Research (NIHR) Global Health Research Unit for the Surveillance of Antimicrobial Resistance (Grant Reference Number 16/136/111).

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #bactmap channel (you can join with this invite).

Citations


An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

This article is auto-generated from nf-core/bactmap via the GitHub API. Last fetched: 6/17/2026