Release v1.0.0: E.coli serotyping with QC module and adaptive thresholding · phac-nml/ecoli_serotyping

Major improvements:

Incorporation of a Quality Control (QC) module that makes results easier to interpret and flags when corrective measures (re-sequencing, wet-lab serotyping) are needed. Unique thresholds at the allele level determine whether a given allele and the query quality parameters (%identity and %coverage) are sufficient to resolve an antigen call unambiguously.
Cluster-friendly behaviour: multiple instances are supported via a .lock file that prevents race conditions and simultaneous database updates by several instances
An updated database of alleles with the removal of duplicated or truncated alleles (e.g. O157 antigen)
Improved species identification resolution for highly similar non-E.coli species such as Shigella and E.albertii. Species identification is now done only via the MASH NCBI RefSeq sketch (https://gembox.cbcb.umd.edu/mash/refseq.genomes.k21s1000.msh)
Users can add new alleles to an existing allele database and make serotype predictions with a custom allele database via the --dbpath parameter
Improved O and H antigen call rates and accuracy thanks to the decoupling of %identity and %coverage thresholds for each antigen. Global thresholds can now be specified separately. This is especially important if one of the antigen genes (e.g. wzx/wzy or fliC) is truncated or has low coverage
Improved adaptive O antigen calling when only a single O antigen candidate is available in the preliminary BLAST results, allowing an accurate O antigen call even in poorly sequenced samples with minimal coverage.
Addition of mixed O antigen calls for highly similar O antigens (e.g. O17/O77)
Allele names/keys used to make antigen calls are also reported, making it easier to troubleshoot dubious alleles and clean the allele database
More detailed error messages and support for 16 high-similarity O antigens (%identity > 99%) based on the reference publication PMID: 25428893
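
The .lock file behaviour mentioned in the list above can be pictured with a minimal Python sketch. This is not the tool's actual implementation: the function name `database_lock`, the timeout and the polling interval are assumptions chosen for illustration. The idea is only that creating the lock file atomically lets one instance update the shared database while the others wait.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def database_lock(lock_path, timeout=60.0, poll_interval=0.5):
    """Hold an exclusive lock file while a shared database is updated.

    Hypothetical helper: name and defaults are illustrative only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL makes creation atomic: if the file already
            # exists, another instance holds the lock and this call fails.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll_interval)
    try:
        # Record the owner's PID to help debugging stale locks.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A caller would wrap the database update in `with database_lock("/path/to/db/.lock"):` (path is a placeholder). A lock left behind by a crashed process is not handled here and would need manual removal.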
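
To illustrate what decoupled %identity and %coverage thresholds with allele-level overrides might look like, here is a small Python sketch. All names (`Hit`, `DEFAULT_THRESHOLDS`, `passes_thresholds`) and every numeric value are hypothetical and are not taken from the tool or its database. The sketch only shows the general pattern: each antigen class has its own pair of global thresholds, and an allele-specific pair, when present, takes precedence.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    allele: str      # allele name/key in the database
    antigen: str     # antigen label, e.g. "O157" or "H7"
    identity: float  # %identity of the BLAST hit
    coverage: float  # %coverage of the BLAST hit


# Hypothetical global (min %identity, min %coverage) per antigen class.
DEFAULT_THRESHOLDS = {"O": (90.0, 90.0), "H": (95.0, 50.0)}


def passes_thresholds(hit, allele_thresholds=None, defaults=DEFAULT_THRESHOLDS):
    """Return True if the hit meets both thresholds that apply to it."""
    antigen_class = hit.antigen[0]  # "O" or "H"
    min_identity, min_coverage = defaults[antigen_class]
    # An allele-level entry, if present, overrides the global pair.
    if allele_thresholds and hit.allele in allele_thresholds:
        min_identity, min_coverage = allele_thresholds[hit.allele]
    # Identity and coverage are checked independently of each other.
    return hit.identity >= min_identity and hit.coverage >= min_coverage
```

With these illustrative values, a truncated H antigen gene at 97% identity and 60% coverage would still pass, while an O antigen hit with the same numbers would not, which is the kind of case separate thresholds are meant to handle.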
