
Evaluation, Reproducibility, Benchmarks Working Group Meeting Notes


Every year, hundreds of new algorithms are published in the field of biomedical image analysis. While validation of new methods has long relied on private data, publicly available data sets and international competitions ('challenges') now allow algorithms to be benchmarked in a transparent and comparative manner. Recent research, however, has revealed several flaws in common validation practice. A core goal of the program is therefore to provide the infrastructure and tools for quality-controlled validation and benchmarking of medical image analysis methods. In collaboration with the international biomedical image analysis challenges (BIAS) initiative, open technical challenges and research questions will be addressed across a range of topics: best practices (e.g. How to make a trained model public? How to enable reproducibility of a training process? Which metric to use for which application?), political aspects (e.g. How to create incentives for sharing code and data?), performance aspects (e.g. How to report memory/compute requirements? How to support identification of performance bottlenecks in pipelines?), and implementation efficiency (e.g. How to provide baseline methods for comparative assessment?).

2024

2023

2022

2021

2020


Chairs

  • Annika Reinke
  • Carole Sudre

Secretary

  • Nicholas Heller

Group members

  • Lena Maier-Hein
  • Nicola Rieke
  • Michela Antonelli
  • M. Jorge Cardoso
  • Keyvan Farahani
  • Olivier Colliot
  • Anne Mickan

Previous roles

  • Chair: Lena Maier-Hein
  • Secretary: Annika Reinke

Task forces

Metrics task force

Lead: Carole Sudre

Quick data access task force

Lead: Michela Antonelli

Benchmarking task force

Lead: Annika Reinke
