Benchmark: chloroplast assembly from genomic data

Status

Ten checkpoints from Weber et al., "Essential guidelines for computational method benchmarking" (2018), arXiv.

  1. Define the purpose and scope of the benchmark.

Benchmark automatic assembly tools that extract whole chloroplast genomes from mixed (plastid + nuclear genome) sequencing data.

  2. Include all relevant methods.

GetOrganelle, fast-plast, org-asm, NOVOPlasty, chloroExtractor, IOGA. TODO: check whether any other relevant tool is missing.

  3. Select (or design) representative datasets.

We plan to use simulated data (at different chloroplast:genome read ratios) and real datasets with existing reference chloroplast genomes; see the sketch below. TODO: select the exact list of chloroplast genomes. TODO: produce the simulated datasets.
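
A minimal sketch of how mixed read sets at a chosen chloroplast:genome ratio could be produced, assuming two pre-simulated FASTQ files already exist (e.g. from a read simulator). The file names and the target ratio are placeholders, not files in this repository.

```python
#!/usr/bin/env python3
"""Sketch: build a mixed read set at a target chloroplast:nuclear read ratio."""
import random


def fastq_records(path):
    """Yield one 4-line FASTQ record at a time."""
    with open(path) as fh:
        while True:
            rec = [fh.readline() for _ in range(4)]
            if not rec[0]:
                return
            yield "".join(rec)


def mix(chloro_fq, nuclear_fq, out_fq, target_ratio, seed=1):
    """Keep all nuclear reads; subsample chloroplast reads so that roughly
    `target_ratio` chloroplast reads per nuclear read end up in the output."""
    n_chloro = sum(1 for _ in fastq_records(chloro_fq))
    n_nuclear = sum(1 for _ in fastq_records(nuclear_fq))
    keep_p = min(1.0, target_ratio * n_nuclear / max(n_chloro, 1))
    rng = random.Random(seed)
    with open(out_fq, "w") as out:
        for rec in fastq_records(nuclear_fq):
            out.write(rec)
        for rec in fastq_records(chloro_fq):
            if rng.random() < keep_p:
                out.write(rec)


if __name__ == "__main__":
    # hypothetical inputs; 10 chloroplast reads per nuclear read
    mix("chloro_sim.fastq", "nuclear_sim.fastq", "mixed_10to1.fastq", 10.0)
```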

  4. Choose appropriate parameter values and software versions.

Latest version of each tool (wrapped in a Docker container), with default parameters wherever possible; see the sketch below. TODO: update all Docker containers. TODO: select default parameters for each tool.
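
One way to pin versions and defaults is to keep a single mapping from tool name to Docker image tag and default command line. The image names and command templates below are hypothetical placeholders, not the project's actual containers on DockerHub.

```python
#!/usr/bin/env python3
"""Sketch: run each tool from a pinned Docker image with its default parameters."""
import shlex
import subprocess

# tool name -> (Docker image with pinned tag, default command template);
# both are placeholders to be replaced by the real containers and defaults
TOOLS = {
    "toolA": ("example/toola:1.0", "toola_assemble --reads /data/{reads} --out /data/{outdir}"),
    "toolB": ("example/toolb:2.3", "toolb.py -i /data/{reads} -o /data/{outdir}"),
}


def run_tool(name, reads, outdir, workdir="."):
    """Run one tool inside its container, mounting `workdir` at /data."""
    image, template = TOOLS[name]
    cmd = shlex.split(template.format(reads=reads, outdir=outdir))
    docker_cmd = ["docker", "run", "--rm", "-v", f"{workdir}:/data", image] + cmd
    subprocess.run(docker_cmd, check=True)


if __name__ == "__main__":
    run_tool("toolA", "mixed_10to1.fastq", "toolA_result")
```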

  5. Evaluate and rank methods according to key quantitative performance metrics.

We currently have only qualitative metrics (success, failure, incomplete, ...). TODO: design quantitative, reference-guided metrics (completeness, continuity, correctness); one possible interpretation is sketched below. TODO: write a script to gather these metrics from the tool outputs.
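
A sketch of reference-guided completeness, continuity, and correctness computed from a PAF alignment (e.g. the assembled chloroplast aligned to the reference with minimap2). The metric definitions here are one possible interpretation, not the project's final choice, and the input file name is a placeholder.

```python
#!/usr/bin/env python3
"""Sketch: reference-guided assembly metrics from a PAF alignment file."""


def load_paf(path):
    """Yield the fields needed for the metrics from each PAF line."""
    with open(path) as fh:
        for line in fh:
            f = line.rstrip("\n").split("\t")
            yield {
                "query": f[0],
                "ref_len": int(f[6]),
                "ref_start": int(f[7]),
                "ref_end": int(f[8]),
                "matches": int(f[9]),
                "aln_len": int(f[10]),
            }


def metrics(paf_path):
    alns = list(load_paf(paf_path))
    if not alns:
        return {"completeness": 0.0, "continuity": 0.0, "correctness": 0.0}
    ref_len = alns[0]["ref_len"]
    # completeness: fraction of the reference covered by at least one alignment
    covered = set()
    for a in alns:
        covered.update(range(a["ref_start"], a["ref_end"]))
    completeness = len(covered) / ref_len
    # continuity: covered fraction per aligned contig (fewer, longer pieces score higher)
    n_contigs = len({a["query"] for a in alns})
    continuity = completeness / n_contigs
    # correctness: matched bases per aligned base (alignment identity)
    correctness = sum(a["matches"] for a in alns) / sum(a["aln_len"] for a in alns)
    return {"completeness": completeness, "continuity": continuity, "correctness": correctness}


if __name__ == "__main__":
    print(metrics("assembly_vs_reference.paf"))
```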

  6. Evaluate secondary measures including runtimes and computational requirements, user-friendliness, code quality, and documentation quality.

We have a script to track all performance metrics with Docker; a minimal sketch of this approach follows below. TODO: run the performance benchmarks as separate Docker runs. TODO: find measures (as objective as possible) for computational requirements, user-friendliness, code quality, and documentation. TODO: assign these metrics to all tools.
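
A minimal sketch of tracking wall-clock runtime and peak memory for one containerised run by polling `docker stats`. The container name, image, and command are placeholders, and the repository's own tracking script may work differently.

```python
#!/usr/bin/env python3
"""Sketch: runtime and peak memory of a detached Docker container."""
import subprocess
import time


def mem_to_mib(mem_usage):
    """Convert the leading '<value><unit>' of a docker MemUsage string to MiB."""
    value = mem_usage.split("/")[0].strip()
    for unit, factor in (("KiB", 1 / 1024), ("MiB", 1), ("GiB", 1024)):
        if value.endswith(unit):
            return float(value[: -len(unit)]) * factor
    return 0.0


def run_and_profile(image, cmd, name="bench_run", poll_s=5):
    """Run `cmd` in `image` detached; return (runtime_seconds, peak_mem_mib)."""
    subprocess.run(["docker", "run", "-d", "--name", name, image] + cmd, check=True)
    start, peak = time.monotonic(), 0.0
    while subprocess.run(["docker", "inspect", "-f", "{{.State.Running}}", name],
                         capture_output=True, text=True).stdout.strip() == "true":
        mem = subprocess.run(["docker", "stats", "--no-stream", "--format",
                              "{{.MemUsage}}", name],
                             capture_output=True, text=True).stdout.strip()
        peak = max(peak, mem_to_mib(mem))
        time.sleep(poll_s)
    runtime = time.monotonic() - start
    subprocess.run(["docker", "rm", name], check=True)
    return runtime, peak


if __name__ == "__main__":
    # hypothetical example: a 30-second dummy job in a stock image
    print(run_and_profile("ubuntu:22.04", ["sleep", "30"]))
```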

  7. Interpret results and provide guidelines or recommendations from both user and method developer perspectives.

In addition to the pure metrics, keep an eye on complementarity between tools; we may recommend ensemble approaches. TODO.

  8. Publish and distribute results in an accessible format.

GitHub, Zenodo, DockerHub, bioRxiv, BMC. TODO.

  9. Design the benchmark to enable future extensions.

This is covered by the Docker setup and by scripting every step. TODO: add documentation on GitHub on how to reproduce (and extend) the benchmarking.

  10. Follow reproducible research best practices, in particular by making all code and data publicly available.

Already covered by the previous points.