Evaluation, Reproducibility, Benchmarks Meeting 1 - Project-MONAI/MONAI GitHub Wiki
Minutes of meeting 1
Date: 25th May 2020
Membership: Kevin Zhou (Lead), Lena Maier-Hein (Lead), Nicola Rieke, Stephen Aylward, Jens Petersen, Paul Jäger, Annika Reinke, Carole Sudre, David Zimmerer, Dan Tudosiu
TOP 1: Introduction of attendees
Each member briefly introduced themselves and described their personal interest in the benchmarking working group.
TOP 2: Introduction of MONAI
Stephen, Lena and Kevin introduced the concept of MONAI.
TOP 3: Existing initiatives
- Biomedical Image Analysis ChallengeS (BIAS) Initiative
- Papers with code: Example
- PyTorch Hub
- NeurIPS reproducibility challenge
- MLPerf 3D medical segmentation benchmark
TOP 4: What members would be willing to contribute
- Tool for visualization of benchmarking results (DKFZ-CAMI)
- Implementation of metrics (image-wide, component-wise, multi-label) for classification, segmentation, and regression (Carole)
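As a rough illustration of what an image-wide and a multi-label metric could look like, here is a minimal sketch of the Dice coefficient in plain Python. It was not discussed in the meeting and is not a MONAI API; the function names `dice_score` and `per_label_dice`, the flat-list input format, and the convention that two empty masks score 1.0 are all assumptions made for this example.

```python
def dice_score(pred, target):
    """Dice coefficient between two equally sized binary masks (flat sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count positions where both masks are foreground.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


def per_label_dice(pred_labels, target_labels, labels):
    """Multi-label variant: one Dice value per label, computed on one-vs-rest masks."""
    return {
        label: dice_score(
            [int(p == label) for p in pred_labels],
            [int(t == label) for t in target_labels],
        )
        for label in labels
    }
```

A component-wise variant would additionally need connected-component labelling, which is left out here.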
TOP 5: What do we need?
- Metrics depending on tasks with implementation
- Comparability/Reproducibility:
  - Scripts for validating on specific data sets
  - Semantic description of how the training was performed (including data sets used)
  - Addressing randomness in training/testing (e.g. random seeds)
- Models as state-of-the-art/baseline methods (model zoo)
- Baselines for a challenge for novice DL researchers ("easy to use")
- Platform for benchmarking and publishing new methods
  - Similar to papers with code (+ automated)
  - Similar to open (post challenge) leaderboards for commonly used tasks/datasets
- Quality control for "MONAI certified" data sets
- Best practices for making trained models (+ "inference script") public
- Incentives for data sharing
- Infrastructure for participating in a challenge (including download scripts)
- Supporting challenge organization
- Best practices on reporting speed/memory requirements
- Network benchmarks (DL speed/memory benchmarks to evaluate/score hardware)
- Supporting fast inference (e.g. for docker-based evaluation)
- Identifying performance bottlenecks in pipelines
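One of the needs listed above, addressing randomness in training/testing, can be illustrated with a small sketch. It uses only Python's standard `random` module as a stand-in for a real training run; `simulated_run` and `report` are hypothetical names invented for this example and were not proposed in the meeting. In an actual pipeline, the generators of the libraries in use would need to be seeded as well (for example via `numpy.random.seed` and `torch.manual_seed`).

```python
import random


def simulated_run(seed):
    """Stand-in for a training run: shuffle a data order and draw noisy 'scores'."""
    rng = random.Random(seed)  # local generator, independent of global state
    order = list(range(10))
    rng.shuffle(order)  # e.g. the order in which samples are visited
    scores = [round(rng.gauss(0.8, 0.05), 4) for _ in range(3)]
    return order, scores


def report(seeds):
    """Run once per seed so a result can be reported with its spread across seeds."""
    finals = [simulated_run(seed)[1][-1] for seed in seeds]
    return {
        "seeds": list(seeds),
        "mean": sum(finals) / len(finals),
        "min": min(finals),
        "max": max(finals),
    }
```

Calling `simulated_run` twice with the same seed returns identical results, while `report([0, 1, 2])` records the seeds next to the aggregated scores so that others can repeat the same runs.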
TOP 6: Group organization
- Slack channel (Paul, DKFZ): #project-monai-benchmark-wg
- GitHub (Jens, DKFZ)
- Google Drive: Use the one from MONAI (Lena)
- Find meeting slot (Lena):
  - Monthly, Wednesdays 2-3 pm CEST
  - Next meeting: 1st July