kblin/ncbi-genome-download
Scripts to download genomes from the NCBI FTP servers
This is release 0.3.3 of ncbi-genome-download. This release has no new features on top of 0.3.2 but adds some information on how to cite the software.

Detailed changes:

```
Kai Blin (5):
      README: Add citation information
      CITATION: Add a citation metadata file
      CITATION: Second attempt at generating a valid CITATION.rff file
      CITATION: use the correct file name for the citation metadata file
      Bump version to 0.3.3
```
📋 Changes
- Add support for the new format assembly info headers, fixing downloads.
- Support for the translated-cds format (thanks @SwiftSeal)
- Allow fuzzy searches for accessions
- Cache the MD5SUMS files for a day as well, to make re-starting the
📋 Changes
- support for progress bars (thanks to @444thLiao)
- various bug fixes (thanks @peterjc and @jrjhealey)
📋 Changes
- `gimme_taxa.py` is now installable, thanks Istvan (@ialbert)
- We no longer break on FTP entries without an FTP path, thanks Paul (@openpaul)
- We now raise an error if you try to download metagenomes from RefSeq.
- Updated Chinese README file, thanks James (@jamesyangget)
- We no longer leak pool workers when running parallel downloads, thanks
📋 Changes
- Parallel downloads of checksum files (Thanks to Adelme Bazin (@axbazin))
- New --flat-output option to dump all downloaded files into a single directory
- We now have a Chinese translation of the README (Thanks James Yang (@jamesyangget))
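
Checksum files like the ones mentioned above let a downloader confirm that each file arrived intact. The following is a minimal sketch of that check, not the project's actual implementation. It assumes each checksum line has the form `<md5>  ./<filename>`; the helper names `parse_checksums` and `md5_of_file` are hypothetical.

```python
import hashlib


def parse_checksums(text):
    """Parse 'checksum  filename' lines into a {filename: checksum} dict."""
    checksums = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        checksum, filename = parts
        # Assumption: names may carry a leading './', which is dropped here.
        if filename.startswith("./"):
            filename = filename[2:]
        checksums[filename] = checksum
    return checksums


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A downloaded file would then be accepted only if `md5_of_file(path)` equals the entry that `parse_checksums` returned for its name.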
This is release 0.2.11 of ncbi-genome-download which fixes two logging issues. Thanks to David Morgan (@Cptmorgan27) for providing a patch.

Detailed changes:

```
David Morgan (1):
      core: remove print statement for type material

Kai Blin (4):
      chore: Use a named logger instead of the root logger
      README: Make it clearer that more than just bacteria and viral groups are available
      chore: Remove landscape.io link, as that service seems dead
      Bump version number to 0.2.11
```
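
The switch from the root logger to a named logger follows a common convention for Python libraries: applications that embed the library can then adjust its output without affecting their own logging. Below is a minimal sketch of the pattern, not the project's code. The logger name `ncbi_genome_download` and the helper `report_download` are assumptions made for illustration.

```python
import logging

# Assumption: the library logs under its package name. The real logger
# name used by ncbi-genome-download may differ.
logger = logging.getLogger("ncbi_genome_download")


def report_download(accession):
    """Hypothetical helper that logs through the named logger."""
    logger.info("Downloading %s", accession)
    return accession


if __name__ == "__main__":
    # The application, not the library, decides how records are handled.
    logging.basicConfig(level=logging.WARNING)
    # Raise verbosity for this one library without touching other loggers.
    logger.setLevel(logging.INFO)
    report_download("GCF_000234725.1")
```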
📋 Changes
- Use relative instead of absolute symlinks for human-readable output (thanks
- No longer crash on abnormal organism names (thanks to @andrewsanchez for the
- Allow for fuzzy matching of both organism name and accessions
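
Relative symlinks keep working when the whole output directory is moved or mounted elsewhere, which absolute links do not. The sketch below shows one way to create such a link with the standard library. It is an illustration under assumed names, not the project's implementation: `make_relative_symlink` and the directory layout are hypothetical.

```python
import os
import tempfile


def make_relative_symlink(target, link_path):
    """Create link_path pointing at target via a relative path."""
    link_dir = os.path.dirname(os.path.abspath(link_path))
    os.makedirs(link_dir, exist_ok=True)
    # Express the target relative to the directory that holds the link.
    relative_target = os.path.relpath(os.path.abspath(target), start=link_dir)
    os.symlink(relative_target, link_path)
    return relative_target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout, for illustration only.
        data = os.path.join(root, "refseq", "bacteria", "GCF_000000000.1")
        os.makedirs(data)
        link = os.path.join(root, "human_readable", "Example_genus", "GCF_000000000.1")
        print(make_relative_symlink(data, link))
```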
This release adds the "relation to type material" filter contributed by Jason Davis-Cooke. Thanks for that.

Detailed changes:

```
Jason Davis-Cooke (1):
      feat(core): add 'relation to type material' as as filtering option (#82)

Kai Blin (2):
      README: Document the type material filter option
      Bump version number to 0.2.9
```
This is mainly a bugfix release fixing a UnicodeEncodeError when writing to a --metadata-table file with non-ASCII entries like in record GCF_000234725.1. Thanks to @danudwary and @jananiravi for the error reports. Also thanks to Tessa Pierce and Joe Healey for their contributions.

Detailed changes:

```
Joe Healey (1):
      update readme with conda install

Kai Blin (3):
      config: Change a tab indent to spaces
      core: Open metatable file with utf-8 encoding
      Bump version number to 0.2.8

Tessa Pierce (1):
      add support for rm (repeat masked) eukaryotic genomes
```
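
The fix described above amounts to opening the table with an explicit UTF-8 encoding instead of relying on the platform default. Here is a minimal Python 3 sketch of that idea, not the project's code. The function name `write_metadata_table` and the column names and values are hypothetical.

```python
import csv
import os
import tempfile


def write_metadata_table(path, rows):
    """Write rows to a tab-separated file, always encoded as UTF-8."""
    # An explicit encoding avoids depending on the locale default, which
    # can raise UnicodeEncodeError for non-ASCII text under an ASCII locale.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerows(rows)


if __name__ == "__main__":
    # Hypothetical columns and values; the real table has different fields.
    rows = [
        ["assembly_accession", "submitter"],
        ["GCF_000000000.1", "Universität Beispiel"],
    ]
    path = os.path.join(tempfile.mkdtemp(), "metadata.tsv")
    write_metadata_table(path, rows)
    with open(path, encoding="utf-8") as handle:
        print(handle.read())
```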
📋 Changes
- Input options that supported a comma-separated list can now also read from
- Support for downloading files in RNA FASTA format (thanks, @bluegenes).
- Contributed script to get taxids for all children of a parent taxon (thanks
📋 Changes
- Multiple formats, taxids, species, etc. can now be downloaded at once, see
- You can now save information on downloaded files in a tab-separated table
📋 Changes
- Enable specifically downloading reference and representative genomes.
- New 'ngd' command alias saves you from having to type 'ncbi-genome-download' all the time.
📋 Changes
- Enable using ncbi-genome-download as API from your own scripts, thanks to Marc Bourqui (@mbourqui).
- Also allow downloading the _assembly_report.txt and _assembly_stats.txt files, thanks to Peter Cock (@peterjc).
- Better handle interrupting with Ctrl-C while downloading in multiple threads.
📋 Changes
- Properly deal with existing human-readable symlinks when re-running a download, thanks to An Phung (@anphung).
- Support creating human-readable symlinks without reloading the genome data
- Work around formatting issues in NCBI's viral assembly_summary.txt files
