Minutes Data Working Group 18 Jun 2020 - Project-MONAI/MONAI GitHub Wiki

Agenda

(All) Review discussions from other workgroups on requirements and additional tasks
(Brad) Finish reviewing requirement document
(Stephen) Check in on working group chat

Meeting Minutes

Group reviewed the requirements for passing data into MONAI
- We need to strike the right balance: flexible enough to support data types of all kinds (DICOM, HL7, other data inputs) and also data structures of all kinds (BIDS, MSD, Clara, NiftyNet)
  - It isn’t necessarily MONAI’s role to “pick a specification”
  - Researchers shouldn’t have to create/translate into a data format that forces them to do a lot more work with no value add
  - “As simple as possible, but no simpler”
- The challenge with dataset formats today (BIDS, MSD) is that:
  - They specify the dataset as part of the experiment run
  - Their structure is simplified because it is geared to a specific use case, not the generic case
- Instead, let’s:
  - Separate “data set definition” and “experiment definition”, so that they can be supplied separately
  - Create a “transformer development framework” that defines how to convert a dataset into a form that can be consumed internally
  - Create a set of “canned transformers”: one for BIDS, one for MSD (supported by MONAI 0.2), and ones for Clara Train and NiftyNet, so that researchers can bring their own data
    - Kaggle challenges, etc. can also provide datasets in one of these supported formats, giving participants a place to start
  - The “experiment definition” will reference the data set definition, and include parameters to support specific runs, e.g.:
    - “These studies are for training, and these are for validation”
    - “Take a random 80% for training and 20% for validation, for each label”
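
To make the proposal above concrete, here is a minimal sketch of how the separation could look. It is illustrative only and is not an existing MONAI API: `Study`, `DatasetDefinition`, `ExperimentDefinition`, the `TRANSFORMERS` registry, the `"toy"` layout, and the `/data/toy` path are all hypothetical names made up for this example. The dataset definition only says where the data is and which layout it uses, a registered "canned transformer" converts it to an internal list of studies, and the experiment definition references the dataset and carries the run-specific split parameters.

```python
import random
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Study:
    """One item in the (hypothetical) internal representation."""
    study_id: str
    image_path: str
    label: str


@dataclass(frozen=True)
class DatasetDefinition:
    """Where the data lives and which layout it uses; no run parameters."""
    root: str
    layout: str  # key used to look up a canned transformer


@dataclass(frozen=True)
class ExperimentDefinition:
    """References a dataset definition and adds run-specific parameters."""
    dataset: DatasetDefinition
    train_fraction: float = 0.8
    seed: int = 0


# Registry of canned transformers: layout name -> function that converts a
# dataset definition into the internal list of studies.
TRANSFORMERS: Dict[str, Callable[[DatasetDefinition], List[Study]]] = {}


def register(layout: str):
    """Decorator that registers a transformer under a layout name."""
    def decorator(fn):
        TRANSFORMERS[layout] = fn
        return fn
    return decorator


@register("toy")
def toy_transformer(definition: DatasetDefinition) -> List[Study]:
    # Stand-in for a real reader (BIDS, MSD, ...): builds 10 studies per
    # label in memory instead of scanning a directory tree.
    return [
        Study(f"{label}-{i}", f"{definition.root}/{label}/{i}.nii.gz", label)
        for label in ("tumor", "healthy")
        for i in range(10)
    ]


def split_per_label(
    experiment: ExperimentDefinition,
) -> Tuple[List[Study], List[Study]]:
    """Randomly split each label's studies into training and validation."""
    studies = TRANSFORMERS[experiment.dataset.layout](experiment.dataset)
    by_label = defaultdict(list)
    for study in studies:
        by_label[study.label].append(study)

    rng = random.Random(experiment.seed)  # seeded for reproducible runs
    train, val = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        cut = round(len(group) * experiment.train_fraction)
        train.extend(group[:cut])
        val.extend(group[cut:])
    return train, val


dataset = DatasetDefinition(root="/data/toy", layout="toy")
experiment = ExperimentDefinition(dataset=dataset, train_fraction=0.8, seed=42)
train, val = split_per_label(experiment)
print(len(train), len(val))  # 16 4
```

Because the split parameters live only in the experiment definition, the same dataset definition could be reused across runs with a different fraction or seed, and supporting a new layout would only mean registering another transformer.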

Next Steps

(All) Continue to iterate on this design
(Brad) Check with steering committee on outreach to BIDS
(Brad) Doodle poll for next meeting post-SIIM