datasets
Young edited this page Feb 23, 2024 · 2 revisions
Datasets
About:
NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases.
Documentation : https://www.ncbi.nlm.nih.gov/datasets/docs/v2/getting_started/
Citation :
Directory tree (individual species files may change):
datasets
├── datasets_summary.csv
├── Pseudomonas_genomes.csv
├── species_list.txt
├── Stenotrophomonas_maltophilia_genomes.csv
└── Stenotrophomonas_sp_genomes.csv
Example file for an individual species (species_genomes.csv)
accession,assminfo-refseq-category,assminfo-level,organism-name,assmstats-total-ungapped-len
GCF_002906475.1,representative genome,Complete Genome,Vibrio campbellii,5425575
GCF_002906475.1,representative genome,Complete Genome,Vibrio campbellii,5425575
GCF_000772655.1,,Contig,Vibrio campbellii,5652224
GCF_000818235.1,,Contig,Vibrio campbellii,5777711
GCF_000818295.1,,Scaffold,Vibrio campbellii,5736147
GCF_000818315.1,,Scaffold,Vibrio campbellii,5858047
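A per-species table like the one above can be read with Python's standard `csv` module. The following is a minimal sketch, not part of the workflow itself: the `load_genomes` helper is hypothetical, the sample rows are copied from the example, and dropping repeated accessions is an assumption about how a downstream script might want to treat duplicate rows.

```python
import csv
import io

# Sample rows copied from the example above; in practice, open the real
# *_genomes.csv file instead of this in-memory string.
SAMPLE = """accession,assminfo-refseq-category,assminfo-level,organism-name,assmstats-total-ungapped-len
GCF_002906475.1,representative genome,Complete Genome,Vibrio campbellii,5425575
GCF_002906475.1,representative genome,Complete Genome,Vibrio campbellii,5425575
GCF_000772655.1,,Contig,Vibrio campbellii,5652224
GCF_000818235.1,,Contig,Vibrio campbellii,5777711
"""


def load_genomes(handle):
    """Read rows into dicts, keeping the first occurrence of each accession.

    Hypothetical helper for illustration only.
    """
    seen = {}
    for row in csv.DictReader(handle):
        # Convert the ungapped length to an integer for later arithmetic.
        row["assmstats-total-ungapped-len"] = int(row["assmstats-total-ungapped-len"])
        seen.setdefault(row["accession"], row)
    return list(seen.values())


genomes = load_genomes(io.StringIO(SAMPLE))

# Rows with a non-empty RefSeq category are reference or representative genomes.
flagged = [g["accession"] for g in genomes if g["assminfo-refseq-category"]]
print(len(genomes), flagged)
```

An empty `assminfo-refseq-category` field simply means the assembly is neither a reference nor a representative genome, so a truthiness check is enough to separate the two groups.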
Example file for a run (species_list.txt). These are the organisms identified in the samples in the run.
Acinetobacter_nosocomialis
Acinetobacter_seifertii
Burkholderia_ambifaria
Burkholderia_cenocepacia
Burkholderia_cepacia
Burkholderia_contaminans
Burkholderia_lata
Burkholderia_metallica
Burkholderia_orbicola
Burkholderia_pyrrocinia
Burkholderia_seminalis
Burkholderia_stabilis
Burkholderia_territorii
Cronobacter_dublinensis
Cronobacter_muytjensii
Cronobacter_sakazakii
Cronobacter_universalis
Escherichia_albertii
Escherichia_fergusonii
Escherichia_marmotae
Escherichia_ruysiae
Klebsiella_grimontii
Klebsiella_michiganensis
Klebsiella_oxytoca
Klebsiella_pasteurii
Klebsiella_quasipneumoniae
Klebsiella_quasivariicola
Klebsiella_variicola
Ralstonia_wenshanensis
Shigella_dysenteriae
Shigella_sonnei
Shinella_zoogloeoides
Streptococcus_bouchesdurhonensis
Streptococcus_chosunense
Streptococcus_constellatus
Streptococcus_dysgalactiae
Streptococcus_gwangjuense
Streptococcus_hominis
Streptococcus_humanilactis
Streptococcus_intermedius
Streptococcus_mitis
Streptococcus_periodonticum
Streptococcus_pneumoniae
Streptococcus_pseudopneumoniae
Streptococcus_symci
Streptococcus_thalassemiae
Streptococcus_toyakuensis
Streptococcus_vaginalis
Vibrio_campbellii
Vibrio_cholerae
Vibrio_floridensis
Vibrio_fluvialis
Vibrio_harveyi
Vibrio_mimicus
Vibrio_owensii
Vibrio_paracholerae
Vibrio_parahaemolyticus
Vibrio_rotiferianus
Vibrio_vulnificus
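The entries in species_list.txt use underscores, while the organism-name column in the CSV files uses spaces. The sketch below shows one way to convert between the two and to group the list by genus. Both helper names (`parse_species_list`, `group_by_genus`) are hypothetical, and the sample text is a short excerpt of the list above.

```python
from collections import defaultdict


def parse_species_list(text):
    """Return organism names with underscores turned into spaces."""
    return [line.strip().replace("_", " ") for line in text.splitlines() if line.strip()]


def group_by_genus(names):
    """Map each genus to the species epithets listed under it."""
    groups = defaultdict(list)
    for name in names:
        # Split on the first space only: "Vibrio cholerae" -> ("Vibrio", "cholerae").
        genus, _, species = name.partition(" ")
        groups[genus].append(species)
    return dict(groups)


# Short excerpt of the list above; a real script would read species_list.txt.
sample = "Vibrio_campbellii\nVibrio_cholerae\nShigella_sonnei\n"
names = parse_species_list(sample)
by_genus = group_by_genus(names)
print(by_genus)
```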
Example file for a run (datasets_summary.csv). These are the genomes downloaded from NCBI to be used as fastani references.
accession,assminfo-refseq-category,assminfo-level,organism-name,assmstats-total-ungapped-len
GCA_004124255.1,,Complete Genome,Candidatus Pseudomonas adelgestsugas,1835598
GCF_000006765.1,reference genome,Complete Genome,Pseudomonas aeruginosa PAO1,6264404
GCF_000233495.1,,Scaffold,Pseudomonas aeruginosa,6341124
GCF_000632755.1,,Scaffold,Pseudomonas aeruginosa,6296878
GCF_000710625.1,,Scaffold,Pseudomonas aeruginosa,6385324
GCF_002307495.1,representative genome,Contig,Pseudomonas abyssi,4322744
GCF_004124255.1,representative genome,Complete Genome,Candidatus Pseudomonas adelgestsugas,1835598
GCF_019168305.1,representative genome,Contig,Pseudomonas aegrilactucae,5703090
GCF_900100795.1,representative genome,Contig,Pseudomonas abietaniphila,7222451
GCF_000972335.1,,Scaffold,Stenotrophomonas maltophilia,4382093
GCF_001068765.1,,Scaffold,Stenotrophomonas maltophilia,4392414
GCF_001068915.1,,Scaffold,Stenotrophomonas maltophilia,4833127
GCF_001068965.1,,Scaffold,Stenotrophomonas maltophilia,4435914
GCF_900475405.1,representative genome,Complete Genome,Stenotrophomonas maltophilia,4481118
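The summary file can be inspected the same way before it is used for reference selection. The sketch below counts assemblies per assembly level and flags accessions that appear under both the GenBank (`GCA_`) and RefSeq (`GCF_`) prefixes, as `GCA_004124255.1` and `GCF_004124255.1` do in the example. The `assembly_key` helper is hypothetical, and treating a shared numeric part as "the same assembly" is an assumption that should be checked against NCBI for real data.

```python
import csv
import io
from collections import Counter

# Subset of the rows shown above; a real script would open datasets_summary.csv.
SAMPLE = """accession,assminfo-refseq-category,assminfo-level,organism-name,assmstats-total-ungapped-len
GCA_004124255.1,,Complete Genome,Candidatus Pseudomonas adelgestsugas,1835598
GCF_000006765.1,reference genome,Complete Genome,Pseudomonas aeruginosa PAO1,6264404
GCF_000233495.1,,Scaffold,Pseudomonas aeruginosa,6341124
GCF_004124255.1,representative genome,Complete Genome,Candidatus Pseudomonas adelgestsugas,1835598
GCF_900475405.1,representative genome,Complete Genome,Stenotrophomonas maltophilia,4481118
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Number of assemblies at each assembly level (Complete Genome, Scaffold, ...).
levels = Counter(row["assminfo-level"] for row in rows)


def assembly_key(accession):
    """Strip the GCA_/GCF_ prefix: "GCA_004124255.1" -> "004124255.1"."""
    return accession.split("_", 1)[1]


# Collect accessions that share the same numeric part and version.
by_key = {}
for row in rows:
    by_key.setdefault(assembly_key(row["accession"]), []).append(row["accession"])
paired = {key: accs for key, accs in by_key.items() if len(accs) > 1}

print(dict(levels))
print(paired)
```

Flagging such pairs is useful if each genome should appear only once in the reference set.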