The challenge - grasseau/HAhRD GitHub Wiki

The Data set

The data set we will use is a list of 3D-interpolated "images" depicting the energy deposits in the HGCAL detector. In the following, we call these data 3D-HGCAL images or 3D-images. Each pixel (or voxel) of a 3D-image has only one channel, so a 3D-image can be compared to a grey-scale image, but with real values (real16 or real32). It is very important to keep this floating-point resolution in the implementation. No particular work on the data set is required in GSOC'19; we will use the tools that are already running. Nevertheless, depending on the work done or the schedule proposed by the student, we could change the way the data set is generated.
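To illustrate why the floating-point resolution matters, the minimal sketch below stores one voxel energy as an 8-bit grey level, as a 16-bit float, and as a 32-bit float, then reads it back. The energy value, the assumed energy range, and the helper names are hypothetical and serve only the illustration; they are not taken from the HGCAL tools.

```python
import struct

def roundtrip(value, fmt):
    """Store `value` in the given struct float format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

def to_uint8_and_back(value, max_value):
    """Quantize `value` in [0, max_value] to an 8-bit grey level and back."""
    level = round(value / max_value * 255)
    return level / 255 * max_value

energy = 0.0123      # hypothetical voxel energy (arbitrary units)
max_energy = 100.0   # hypothetical upper bound of the energy range

as_uint8 = to_uint8_and_back(energy, max_energy)  # grey-image style storage
as_float16 = roundtrip(energy, "<e")              # IEEE 754 half precision
as_float32 = roundtrip(energy, "<f")              # IEEE 754 single precision

print(f"uint8  : {as_uint8:.6f}")
print(f"float16: {as_float16:.6f}")
print(f"float32: {as_float32:.6f}")
```

With these assumed values, the 8-bit grey level rounds the small deposit to zero, while both floating-point formats keep it with a small relative error.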

The core of the proposal

Based on the Mask R-CNN implementation, we suggest extending it to a 3D version with a floating-point value for each pixel (not RGB integer values). In the following, we call this the 3D-floating operation. The original Mask R-CNN implementation runs on our GPU (V100) platform with samples and with our reduced problem HGCAL2D (projections on the x-z and y-z planes of our 3D-image). This core task can be divided into sub-tasks, each of which must be checked and validated by running the whole process (training and evaluation):
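The reduced HGCAL2D problem can be pictured with the following minimal sketch, which projects a small 3D-image onto the x-z and y-z planes. It assumes that a projection sums the energy along the collapsed axis and that the image is indexed as `image[x][y][z]`; both are assumptions, since the text does not specify the projection operator or the axis order. The function names and the toy values are hypothetical.

```python
def project_xz(image):
    """Collapse the y axis by summation: result[x][z]."""
    nx, ny, nz = len(image), len(image[0]), len(image[0][0])
    return [[sum(image[x][y][z] for y in range(ny)) for z in range(nz)]
            for x in range(nx)]

def project_yz(image):
    """Collapse the x axis by summation: result[y][z]."""
    nx, ny, nz = len(image), len(image[0]), len(image[0][0])
    return [[sum(image[x][y][z] for x in range(nx)) for z in range(nz)]
            for y in range(ny)]

# Toy 2 x 2 x 3 single-channel image with floating-point energies.
image = [
    [[0.5, 0.0, 1.25], [0.0, 2.0, 0.0]],
    [[0.0, 0.0, 0.75], [1.5, 0.0, 0.0]],
]

xz = project_xz(image)  # shape (nx, nz)
yz = project_yz(image)  # shape (ny, nz)
print(xz)
print(yz)
```

Each projection is a 2D single-channel floating-point image, which is the kind of input the original 2D implementation can process once it accepts real values.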

  1. Extend to 3D-floating the ResNet50/ResNet101 part.
  2. Extend to 3D-floating the Feature Pyramid Network (FPN) module.
  3. Extend to 3D-floating the Region Proposal Network (RPN) with anchor extensions.
  4. Extend to 3D-floating the remaining useful modules (I/O, configurations, etc.).
  5. Build synthetic events or tools to define bounding boxes and masks on real data to feed the training process.
  6. Train and evaluate on simple cases (3D HGCAL images).
  7. Extend the other tools: visualization, evaluation.
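As an illustration of the anchor extension named in sub-task 3, the sketch below generates 3D anchor boxes on a feature-map grid and computes the 3D intersection over union (IoU) used to match anchors with ground-truth boxes. It is a simplified sketch under stated assumptions, not the Mask R-CNN code: the box layout `(x1, y1, z1, x2, y2, z2)`, the function names, the stride, and the anchor sizes are all hypothetical.

```python
from itertools import product

def make_anchors_3d(grid_shape, stride, sizes):
    """Return boxes (x1, y1, z1, x2, y2, z2) centred on each feature-map cell."""
    anchors = []
    for ix, iy, iz in product(*(range(n) for n in grid_shape)):
        # Centre of the cell, expressed in input-voxel coordinates.
        cx, cy, cz = ((i + 0.5) * stride for i in (ix, iy, iz))
        for sx, sy, sz in sizes:
            anchors.append((cx - sx / 2, cy - sy / 2, cz - sz / 2,
                            cx + sx / 2, cy + sy / 2, cz + sz / 2))
    return anchors

def volume(box):
    """Volume of an axis-aligned 3D box."""
    return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

def iou_3d(a, b):
    """Intersection over union of two axis-aligned 3D boxes."""
    # Overlap length along each axis, clipped at zero when boxes are disjoint.
    overlaps = [max(0.0, min(a[i + 3], b[i + 3]) - max(a[i], b[i]))
                for i in range(3)]
    inter = overlaps[0] * overlaps[1] * overlaps[2]
    union = volume(a) + volume(b) - inter
    return inter / union if union > 0 else 0.0

# Hypothetical setup: a 2 x 2 x 2 feature map with stride 4 and two anchor
# shapes per cell, the second one elongated along z.
anchors = make_anchors_3d(grid_shape=(2, 2, 2), stride=4,
                          sizes=[(4, 4, 4), (4, 4, 8)])
ground_truth = (0.0, 0.0, 0.0, 4.0, 4.0, 4.0)
best = max(anchors, key=lambda box: iou_3d(box, ground_truth))
print(len(anchors), best, iou_3d(best, ground_truth))
```

Compared with the 2D case, the only structural changes are the extra coordinate pair per box and the third factor in the overlap and volume products. The choice of anchor shapes along z is the place where knowledge of the detector could be introduced.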

Optional

Depending on the student's skills, one or more tasks could be added to the student's schedule. In priority order:

  • Train on more complex cases.
  • Change the FPN and RPN algorithms using physics knowledge (the detector).
  • Add a regression module (to evaluate the energy, for instance).
  • Optimize the efficiency of the whole process (training and evaluation).