Why rankings of biomedical image analysis competitions should
be interpreted with care

MAIER-HEIN, Lena, Matthias EISENMANN, Annika REINKE, Sinan ONOGUR, Marko STANKOVIC, Patrick SCHOLZ, Tal ARBEL, Hrvoje BOGUNOVIC, Andrew BRADLEY, Aaron CARASS, Carolin FELDMANN, Alejandro FRANGI, Peter FULL, Bram VAN GINNEKEN, Allan HANBURY, Katrin HONAUER, Michal KOZUBEK, Bennett LANDMAN, Keno MÄRZ, Oskar MAIER, Klaus MAIER-HEIN, Bjoern MENZE, Henning MÜLLER, Peter NEHER, Wiro NIESSEN, Nasir RAJPOOT, Gregory SHARP, Korsuk SIRINUKUNWATTANA, Stefanie SPEIDEL, Christian STOCK, Danail STOYANOV, Abdel Aziz TAHA, Fons VAN DER SOMMEN, Ching-Wei WANG, Marc-André WEBER, Guoyan ZHENG, Pierre JANNIN and Annette KOPP-SCHNEIDER. Why rankings of biomedical image analysis competitions should be interpreted with care. Nature Communications. Nature Publishing Group, 2018, vol. 9, No 5217, p. 1-13. ISSN 2041-1723. Available from: https://dx.doi.org/10.1038/s41467-018-07619-7.

Other formats: BibTeX LaTeX RIS

TY  - JOUR
ID  - 1466456
AU  - Maier-Hein, Lena
AU  - Eisenmann, Matthias
AU  - Reinke, Annika
AU  - Onogur, Sinan
AU  - Stankovic, Marko
AU  - Scholz, Patrick
AU  - Arbel, Tal
AU  - Bogunovic, Hrvoje
AU  - Bradley, Andrew
AU  - Carass, Aaron
AU  - Feldmann, Carolin
AU  - Frangi, Alejandro
AU  - Full, Peter
AU  - van Ginneken, Bram
AU  - Hanbury, Allan
AU  - Honauer, Katrin
AU  - Kozubek, Michal
AU  - Landman, Bennett
AU  - März, Keno
AU  - Maier, Oskar
AU  - Maier-Hein, Klaus
AU  - Menze, Bjoern
AU  - Müller, Henning
AU  - Neher, Peter
AU  - Niessen, Wiro
AU  - Rajpoot, Nasir
AU  - Sharp, Gregory
AU  - Sirinukunwattana, Korsuk
AU  - Speidel, Stefanie
AU  - Stock, Christian
AU  - Stoyanov, Danail
AU  - Taha, Abdel Aziz
AU  - van der Sommen, Fons
AU  - Wang, Ching-Wei
AU  - Weber, Marc-André
AU  - Zheng, Guoyan
AU  - Jannin, Pierre
AU  - Kopp-Schneider, Annette
PY  - 2018
TI  - Why rankings of biomedical image analysis competitions should be interpreted with care
JF  - Nature Communications
VL  - 9
IS  - 5217
SP  - 1
EP  - 13
PB  - Nature Publishing Group
SN  - 2041-1723
KW  - biomedical image analysis
KW  - benchmarking
KW  - challenge
UR  - http://doi.org/10.1038/s41467-018-07619-7
L2  - http://doi.org/10.1038/s41467-018-07619-7
N2  - International challenges have become the standard for validation of biomedical image analysis methods. Given their scientific impact, it is surprising that a critical analysis of common practices related to the organization of challenges has not yet been performed. In this paper, we present a comprehensive analysis of biomedical image analysis challenges conducted up to now. We demonstrate the importance of challenges and show that the lack of quality control has critical consequences. First, reproducibility and interpretation of the results is often hampered as only a fraction of relevant information is typically provided. Second, the rank of an algorithm is generally not robust to a number of variables such as the test data used for validation, the ranking scheme applied and the observers that make the reference annotations. To overcome these problems, we recommend best practice guidelines and define open research questions to be addressed in the future.
ER  -

Basic information
Original name	Why rankings of biomedical image analysis competitions should be interpreted with care
Authors	MAIER-HEIN, Lena, Matthias EISENMANN, Annika REINKE, Sinan ONOGUR, Marko STANKOVIC, Patrick SCHOLZ, Tal ARBEL, Hrvoje BOGUNOVIC, Andrew BRADLEY, Aaron CARASS, Carolin FELDMANN, Alejandro FRANGI, Peter FULL, Bram VAN GINNEKEN, Allan HANBURY, Katrin HONAUER, Michal KOZUBEK (203 Czech Republic, guarantor, belonging to the institution), Bennett LANDMAN, Keno MÄRZ, Oskar MAIER, Klaus MAIER-HEIN, Bjoern MENZE, Henning MÜLLER, Peter NEHER, Wiro NIESSEN, Nasir RAJPOOT, Gregory SHARP, Korsuk SIRINUKUNWATTANA, Stefanie SPEIDEL, Christian STOCK, Danail STOYANOV, Abdel Aziz TAHA, Fons VAN DER SOMMEN, Ching-Wei WANG, Marc-André WEBER, Guoyan ZHENG, Pierre JANNIN and Annette KOPP-SCHNEIDER.
Edition	Nature Communications, Nature Publishing Group, 2018, 2041-1723.

Other information
Original language	English
Type of outcome	Article in a journal
Field of Study	10201 Computer sciences, information science, bioinformatics
Country of publisher	Switzerland
Confidentiality degree	is not subject to a state or trade secret
Impact factor	11.878
RIV identification code	RIV/00216224:14330/18:00101338
Organization unit	Faculty of Informatics
Doi	http://dx.doi.org/10.1038/s41467-018-07619-7
UT WoS	000452282700012
Keywords in English	biomedical image analysis; benchmarking; challenge
Tags	cbia-web
Tags	International impact, Reviewed
Changed by	RNDr. Pavel Šmerk, Ph.D., university ID (učo) 3880. Changed: 31/12/2018 08:54.

Abstract

International challenges have become the standard for validation of biomedical image analysis methods. Given their scientific impact, it is surprising that a critical analysis of common practices related to the organization of challenges has not yet been performed. In this paper, we present a comprehensive analysis of biomedical image analysis challenges conducted up to now. We demonstrate the importance of challenges and show that the lack of quality control has critical consequences. First, reproducibility and interpretation of the results is often hampered as only a fraction of relevant information is typically provided. Second, the rank of an algorithm is generally not robust to a number of variables such as the test data used for validation, the ranking scheme applied and the observers that make the reference annotations. To overcome these problems, we recommend best practice guidelines and define open research questions to be addressed in the future.
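
The abstract's second point, that an algorithm's rank depends on the ranking scheme applied, can be illustrated with a small sketch. The example below is not taken from the paper: the algorithm names, the per-case scores, and the helper functions are all hypothetical, chosen only to show that aggregating scores before ranking and ranking each case before aggregating can name different winners on the same data.

```python
from statistics import mean, median

# Hypothetical per-case scores (higher is better) for three made-up algorithms.
# These numbers are illustrative assumptions, not results from any challenge.
scores = {
    "A": [0.90, 0.90, 0.10],
    "B": [0.70, 0.70, 0.70],
    "C": [0.80, 0.60, 0.60],
}


def rank_desc(values):
    """Map name -> rank (1 = highest value); tied values share the best rank."""
    ordered = sorted(values.values(), reverse=True)
    return {name: ordered.index(v) + 1 for name, v in values.items()}


def aggregate_then_rank(scores, agg=mean):
    """Aggregate each algorithm's scores over all cases, then rank the aggregates."""
    return rank_desc({name: agg(vals) for name, vals in scores.items()})


def rank_then_aggregate(scores):
    """Rank the algorithms on every case, then rank by mean per-case rank."""
    n_cases = len(next(iter(scores.values())))
    per_case = [
        rank_desc({name: vals[i] for name, vals in scores.items()})
        for i in range(n_cases)
    ]
    mean_rank = {name: mean(r[name] for r in per_case) for name in scores}
    # A lower mean rank is better, so negate before ranking in descending order.
    return rank_desc({name: -mr for name, mr in mean_rank.items()})


print("mean, then rank:  ", aggregate_then_rank(scores))
print("median, then rank:", aggregate_then_rank(scores, median))
print("rank, then mean:  ", rank_then_aggregate(scores))
```

With these made-up scores, ranking by mean score puts B first, while ranking by median score or by mean per-case rank puts A first, because A's single poor case affects the mean more strongly than the other two aggregates.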

Links
GBP302/12/G157, research and development project	Name: Dynamics and organization of chromosomes during the cell cycle and during differentiation under normal and pathological conditions
GBP302/12/G157, research and development project	Investor: Czech Science Foundation
LTC17016, research and development project	Name: Benchmarking of algorithms for cell segmentation and tracking
LTC17016, research and development project	Investor: Ministry of Education, Youth and Sports of the CR, Benchmarking of algorithms for cell segmentation and tracking, INTER-COST
