blobtools_ref
Containers and GitHub repositories are not great places to store blast databases. As such, the optional Blobtools subworkflow requires some additional command-line or downloading expertise.
Downloading a blast database is not particularly complicated, but it does require some patience and a good internet connection. Blast databases can be downloaded via a web browser (such as Chrome) from NCBI's blast database website: ftp.ncbi.nlm.nih.gov/blast/db/. The most common databases are nt (nucleotide) and those curated by RefSeq (their names generally start with ref).
UPHL downloads the ref_prok_rep_genomes database with the commands found in Grandeur/bin/download_blast.sh:
mkdir blast_db
cd blast_db
for i in 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
do
wget --continue --show-progress "https://ftp.ncbi.nlm.nih.gov/blast/db/v5/ref_prok_rep_genomes.$i.tar.gz"
tar -xvf ref_prok_rep_genomes.$i.tar.gz
rm ref_prok_rep_genomes.$i.tar.gz
done
Then set the params.blast_db parameter to your new directory on the command line or in a config file.
params.blast_db = '/path/to/blast_db'
And be sure to set the corresponding database type on the command line or in a config file.
params.blast_db_type = "ref_prok_rep_genomes"
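The two settings above can also be written to a small config file by a script. The following is a minimal sketch, not part of Grandeur: the function name `write_blast_config` and the default file name `blast.config` are hypothetical, and only the two parameter names come from this page.

```shell
# Hypothetical helper: write the two blast-related params to a Nextflow config file.
# Usage: write_blast_config <db_dir> <db_type> [output_file]
write_blast_config() {
  db_dir="$1"
  db_type="$2"
  out="${3:-blast.config}"   # default file name is an assumption

  # Refuse to write a config with a missing value.
  if [ -z "$db_dir" ] || [ -z "$db_type" ]; then
    echo "usage: write_blast_config <db_dir> <db_type> [output_file]" >&2
    return 1
  fi

  # Emit the same two lines shown above, with the given values filled in.
  {
    printf "params.blast_db = '%s'\n" "$db_dir"
    printf "params.blast_db_type = '%s'\n" "$db_type"
  } > "$out"
}
```

For example, `write_blast_config /path/to/blast_db ref_prok_rep_genomes` would create `blast.config`, which could then be passed to Nextflow with its `-c` option.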
RefSeq releases occur in the first two weeks of odd numbered months, namely: January, March, May, July, September, November.
Another popular database is nt. It used to include everything, but in 2023 NCBI separated out a prokaryotic version, nt_prok. Downloading nt_prok follows the same pattern:
mkdir blast_db
cd blast_db
for i in 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
do
wget --continue --show-progress "https://ftp.ncbi.nlm.nih.gov/blast/db/v5/nt_prok.$i.tar.gz"
tar -xvf nt_prok.$i.tar.gz
rm nt_prok.$i.tar.gz
done
And be sure to set the corresponding database type on the command line or in a config file.
params.blast_db_type = "nt_prok"
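The download loops above list every volume number by hand, which makes it easy to skip one. As a rough sketch, the volume URLs could instead be generated from a database name and the number of the last volume. The function name `blast_volume_urls` is hypothetical, and the last volume number is an assumption that has to be checked against the NCBI listing, since the number of volumes changes between releases.

```shell
# Hypothetical helper: print one download URL per volume of a prebuilt database.
# Usage: blast_volume_urls <db_name> <last_volume>
blast_volume_urls() {
  db_name="$1"
  last="$2"   # number of the last volume, e.g. 25 (assumption: verify on the NCBI site)
  base_url="https://ftp.ncbi.nlm.nih.gov/blast/db/v5"

  i=0
  while [ "$i" -le "$last" ]; do
    # Volumes are numbered with two digits: 00, 01, 02, ...
    printf '%s/%s.%02d.tar.gz\n' "$base_url" "$db_name" "$i"
    i=$((i + 1))
  done
}
```

The output can be fed to the same wget command used above, for example `blast_volume_urls nt_prok 25 | while read -r url; do wget --continue --show-progress "$url"; done`.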
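The update_blastdb.pl script distributed with BLAST+ can also list and download prebuilt blast databases. To list the available databases: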
update_blastdb.pl --showall
To download one of those (nt_prok in this example):
update_blastdb.pl nt_prok
More information can be found here: https://www.ncbi.nlm.nih.gov/books/NBK569850/
You can also create your own custom blast database by following these instructions: https://www.ncbi.nlm.nih.gov/books/NBK569841/
A full list of pre-built blast databases can be found at https://ftp.ncbi.nlm.nih.gov/blast/db/v5/.
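Before launching the workflow, it can help to confirm that the directory given to params.blast_db actually holds files for the chosen params.blast_db_type. The sketch below assumes that database files are named with the database type as a prefix (as in the extracted archives above). The function name `blast_db_present` is hypothetical.

```shell
# Hypothetical helper: succeed if <db_dir> holds at least one file named <db_type>.*
# Usage: blast_db_present <db_dir> <db_type>
blast_db_present() {
  db_dir="$1"
  db_type="$2"

  # If the glob matches nothing, the pattern stays literal and the -e test fails.
  for f in "$db_dir/$db_type".*; do
    if [ -e "$f" ]; then
      return 0
    fi
  done

  echo "no files named $db_type.* found in $db_dir" >&2
  return 1
}
```

For example, `blast_db_present /path/to/blast_db nt_prok` returns non-zero when the download is missing. This only checks file names. If BLAST+ is installed, `blastdbcmd -db /path/to/blast_db/nt_prok -info` gives a more thorough check.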