10 checkpoints from Weber et al., "Essential guidelines for computational method benchmarking" (arXiv, 2018)
- Define the purpose and scope of the benchmark.
Automatic assembly tools that extract complete chloroplast genomes from mixed (plastid + nuclear genome) sequencing data
- Include all relevant methods.
GetOrganelle, fast-plast, org-asm, NOVOPlasty, chloroExtractor, IOGA TODO: is there another one we are missing?
- Select (or design) representative datasets.
We plan to use simulated data (mixed at different chloroplast:nuclear genome read ratios; see the sketch below) and real datasets with existing reference chloroplast genomes TODO: select the exact list of chloroplast references TODO: produce simulated datasets
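A minimal sketch of producing one such mixed dataset, assuming simulated chloroplast and nuclear read sets already exist as gzipped FASTQ files (file names, ratios and read counts below are placeholders, not the project's actual simulation script):

```python
import gzip
import random

def read_fastq(path):
    """Yield FASTQ records (4 lines each) from a gzipped file."""
    with gzip.open(path, "rt") as handle:
        while True:
            record = [handle.readline() for _ in range(4)]
            if not record[0]:
                break
            yield record

def mix_reads(chloro_fq, nuclear_fq, out_fq, ratio, n_total, seed=42):
    """Write n_total reads at the given chloroplast fraction (e.g. 0.1 = 1:9).

    Holds the subsampled reads in memory and assumes both inputs contain
    at least the requested number of reads.
    """
    random.seed(seed)
    n_chloro = int(n_total * ratio)
    n_nuclear = n_total - n_chloro
    chloro = [r for _, r in zip(range(n_chloro), read_fastq(chloro_fq))]
    nuclear = [r for _, r in zip(range(n_nuclear), read_fastq(nuclear_fq))]
    reads = chloro + nuclear
    random.shuffle(reads)
    with gzip.open(out_fq, "wt") as out:
        for record in reads:
            out.writelines(record)

# One simulated dataset per chloroplast:nuclear ratio (placeholder values)
for ratio in (0.01, 0.05, 0.1, 0.25):
    mix_reads("chloro_reads.fq.gz", "nuclear_reads.fq.gz",
              f"mixed_ratio_{ratio}.fq.gz", ratio, n_total=200_000)
```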
- Choose appropriate parameter values and software versions.
Latest version of each tool (wrapped in a Docker container), default parameters where possible; a pinned version/parameter registry could look like the sketch below TODO: update all Docker containers TODO: select default parameters for each tool
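One possible shape for a machine-readable registry of pinned images and default parameters (the image names and TODO tags are placeholders, not the project's actual containers):

```python
import json

# Hypothetical registry: one pinned Docker image tag and one default
# parameter string per tool. Tags are placeholders until containers are updated.
TOOLS = {
    "GetOrganelle":    {"image": "benchmark/getorganelle:TODO",    "params": ""},
    "fast-plast":      {"image": "benchmark/fast-plast:TODO",      "params": ""},
    "org-asm":         {"image": "benchmark/org-asm:TODO",         "params": ""},
    "NOVOPlasty":      {"image": "benchmark/novoplasty:TODO",      "params": ""},
    "chloroExtractor": {"image": "benchmark/chloroextractor:TODO", "params": ""},
    "IOGA":            {"image": "benchmark/ioga:TODO",            "params": ""},
}

if __name__ == "__main__":
    # The benchmark runner could read this registry instead of hard-coding
    # image tags and parameters in each wrapper script.
    print(json.dumps(TOOLS, indent=2))
```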
- Evaluate and rank methods according to key quantitative performance metrics.
We currently only have qualitative metrics (success, failure, incomplete, ...); see the sketch below for one way to derive quantitative ones TODO: design quantitative metrics (reference-guided: completeness, continuity, correctness) TODO: write a script to gather these metrics from the tool output
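A rough sketch of how the three reference-guided metrics could be computed from precomputed alignment blocks of an assembly against the reference chloroplast (the `Alignment` table and the exact metric definitions are simplified assumptions, not the final design):

```python
from dataclasses import dataclass

@dataclass
class Alignment:
    """One aligned block of an assembled contig against the reference."""
    contig: str
    ref_start: int   # 0-based, inclusive
    ref_end: int     # 0-based, exclusive
    identity: float  # percent identity of this block

def reference_guided_metrics(alignments, ref_length):
    """Simplified completeness/continuity/correctness from alignment blocks.

    completeness: fraction of reference bases covered by at least one block
    continuity:   longest single aligned block as a fraction of the reference
    correctness:  length-weighted mean percent identity of all blocks
    """
    # Merge overlapping reference intervals so covered bases are counted once.
    intervals = sorted((a.ref_start, a.ref_end) for a in alignments)
    covered, cur_start, cur_end = 0, None, None
    for start, end in intervals:
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start

    total = sum(a.ref_end - a.ref_start for a in alignments)
    longest = max((a.ref_end - a.ref_start for a in alignments), default=0)
    weighted_id = (sum((a.ref_end - a.ref_start) * a.identity
                       for a in alignments) / total) if total else 0.0
    return {
        "completeness": covered / ref_length,
        "continuity": longest / ref_length,
        "correctness": weighted_id,
    }
```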
- Evaluate secondary measures including runtimes and computational requirements, user-friendliness, code quality, and documentation quality.
We have a script to track all performance metrics with Docker (see the sketch below) TODO: separate performance benchmarking runs with Docker TODO: find measures for computational requirements, user-friendliness, code quality and documentation that are as objective as possible TODO: assign these metrics to all tools
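For illustration, a sketch of how runtime and memory of a containerised run could be tracked by polling `docker stats` (this is not the existing tracking script; image and command names are placeholders, and the coarse polling can miss short memory peaks):

```python
import subprocess
import threading
import time

def run_with_stats(image, command, name, poll_interval=2.0):
    """Run a container and record wall-clock time plus sampled memory usage."""
    samples = []
    poller_active = threading.Event()
    poller_active.set()

    def poll():
        # Sample memory usage until the main run finishes.
        while poller_active.is_set():
            out = subprocess.run(
                ["docker", "stats", "--no-stream",
                 "--format", "{{.MemUsage}}", name],
                capture_output=True, text=True).stdout.strip()
            if out:
                samples.append(out.split("/")[0].strip())  # e.g. "512MiB"
            time.sleep(poll_interval)

    thread = threading.Thread(target=poll, daemon=True)
    thread.start()

    start = time.perf_counter()
    try:
        subprocess.run(["docker", "run", "--rm", "--name", name, image, *command],
                       check=True)
    finally:
        runtime = time.perf_counter() - start
        poller_active.clear()
        thread.join()
    return {"runtime_s": runtime, "memory_samples": samples}

# Placeholder invocation, not an actual benchmark command:
# run_with_stats("benchmark/getorganelle:TODO", ["--help"], name="bench_run1")
```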
- Interpret results and provide guidelines or recommendations from both user and method developer perspectives.
In addition to the raw metrics, keep an eye on complementarity between tools (see the sketch below); maybe recommend ensemble approaches TODO
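A toy sketch of a complementarity check on the qualitative results, ranking tool pairs by how many datasets their union covers (the `success` table is invented placeholder data, not real benchmark output):

```python
from itertools import combinations

# Placeholder qualitative results: which datasets each tool assembled successfully.
success = {
    "GetOrganelle":    {"ds1", "ds2", "ds4"},
    "NOVOPlasty":      {"ds1", "ds3"},
    "chloroExtractor": {"ds2", "ds3", "ds4"},
}

def complementarity(results):
    """Rank tool pairs by datasets covered jointly and gain over the best single tool."""
    pairs = []
    for a, b in combinations(results, 2):
        union = results[a] | results[b]
        gain = len(union) - max(len(results[a]), len(results[b]))
        pairs.append((a, b, len(union), gain))
    return sorted(pairs, key=lambda p: (-p[2], -p[3]))

for a, b, covered, gain in complementarity(success):
    print(f"{a} + {b}: {covered} datasets covered (+{gain} over best single tool)")
```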
- Publish and distribute results in an accessible format.
GitHub, Zenodo, Docker Hub, bioRxiv, BMC TODO
- Design the benchmark to enable future extensions.
We already support this through the Docker setup and by having scripted everything TODO: add documentation on GitHub on how to reproduce the benchmark (incl. how to extend it)
- Follow reproducible research best practices, in particular by making all code and data publicly available.
Already covered by all of the previous points.