MONAI_Layer_Factory_Design_Discussion - Project-MONAI/MONAI GitHub Wiki
# Layer factories

## Discussion

### BenM: A case for layer factories
PyTorch's mix of dimension-specific and dimension-agnostic layers creates a design inconsistency that we should address. In particular, PyTorch does not provide tools for defining a network once and instantiating it for multiple spatial dimensionalities: convolutions, pooling, and most normalisation layers come as separate 1D/2D/3D classes. Implementors of networks that should work across dimensionalities (1D / 2D / 3D) must therefore invent their own mechanism for doing so.

As we are implementing networks as well as layers, we should settle on a best-practice approach that allows networks to be defined for multiple dimensionalities.
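A minimal sketch of the inconsistency described above: an activation such as `ReLU` accepts tensors of any rank, while each convolution class accepts exactly one spatial dimensionality.

```python
import torch
import torch.nn as nn

# ReLU is dimension-agnostic: the same instance handles 1D and 3D data.
relu = nn.ReLU()
relu(torch.zeros(1, 1, 8))        # 1D data: fine
relu(torch.zeros(1, 1, 8, 8, 8))  # 3D data: equally fine

# Conv2d is dimension-specific: it only accepts (N, C, H, W) inputs.
conv2d = nn.Conv2d(1, 1, kernel_size=3)
conv2d(torch.zeros(1, 1, 8, 8))   # fine
try:
    conv2d(torch.zeros(1, 1, 8, 8, 8))  # 3D volume: rejected outright
except RuntimeError as exc:
    print("Conv2d rejects volumetric input:", type(exc).__name__)
```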
## Choices

### Selection of dimension on a per-call basis

Select the dimension from the input tensor's dimensionality, optionally with a further hint.

Advantages:

- Always works

Disadvantages:

- Per-call evaluation is unnecessary work
- Rebuilding the network on a per-call basis is almost certainly harmful to performance
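To make the performance concern concrete, here is a hypothetical sketch (the class name and design are illustrative, not MONAI code) of per-call selection, where the convolution class is chosen from the input tensor's rank on every forward pass:

```python
import torch
import torch.nn as nn

class PerCallConv(nn.Module):
    """Hypothetical sketch: pick the convolution type from the input's
    dimensionality on every forward call."""

    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        self.args = (in_channels, out_channels, kernel_size)

    def forward(self, x):
        # x.dim() == 4 -> (N, C, H, W) -> 2D; x.dim() == 5 -> (N, C, D, H, W) -> 3D
        conv_type = {3: nn.Conv1d, 4: nn.Conv2d, 5: nn.Conv3d}[x.dim()]
        # Rebuilding the layer here pays the construction cost on every call
        # and discards any trained weights -- the drawback noted above.
        return conv_type(*self.args)(x)
```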
### Selection of dimension on an instantiation basis

Select the dimension at instantiation time; input tensors are then validated for the appropriate dimensionality at call time.

Advantages:

- No overhead at run time

Disadvantages:

- An edge case might conceivably exist where a network must run on different dimensionalities of data on a per-call basis
  - What else might need changing for such a network to be feasible?
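The instantiation-time option can be sketched as follows (hypothetical class, not the MONAI API): the dimensionality is fixed once in `__init__`, and each call merely validates the input's rank.

```python
import torch
import torch.nn as nn

class FixedDimConv(nn.Module):
    """Hypothetical sketch: dimensionality fixed at instantiation time,
    with inputs validated on each call."""

    def __init__(self, dims, in_channels, out_channels, kernel_size=3):
        super().__init__()
        conv_types = {1: nn.Conv1d, 2: nn.Conv2d, 3: nn.Conv3d}
        self.dims = dims
        self.conv = conv_types[dims](in_channels, out_channels, kernel_size)

    def forward(self, x):
        expected = self.dims + 2  # batch and channel dimensions plus spatial
        if x.dim() != expected:
            raise ValueError(f"expected a {expected}D tensor, got {x.dim()}D")
        return self.conv(x)
```

The validation is a cheap rank check, so the "no overhead at run time" claim holds in practice.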
Personally, I feel strongly that instantiation time is the correct point at which to fix the dimensionality of a network, and so the rest of my comments proceed assuming that design decision.
## Mechanism

### Named `__init__` parameters

Each layer type is held in a named parameter that can be changed by assigning a different layer type to it. This is called by the network when it instantiates the appropriate layer.
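A sketch of the named-parameter mechanism, assuming hypothetical parameter names (`conv_type`, `norm_type`, `act_type`): the block calls whatever layer types it was given, so one definition serves any dimensionality.

```python
import torch.nn as nn

class ConvBlock(nn.Module):
    """Hypothetical sketch: layer types are passed as named __init__
    parameters, so the caller fixes dimensionality at instantiation."""

    def __init__(self, in_channels, out_channels,
                 conv_type=nn.Conv2d, norm_type=nn.BatchNorm2d, act_type=nn.ReLU):
        super().__init__()
        self.conv = conv_type(in_channels, out_channels, kernel_size=3, padding=1)
        self.norm = norm_type(out_channels)
        self.act = act_type()

    def forward(self, x):
        return self.act(self.norm(self.conv(x)))

# The same definition instantiated for volumetric data:
block3d = ConvBlock(1, 8, conv_type=nn.Conv3d, norm_type=nn.BatchNorm3d)
```

The cost of this approach is that every dimension-specific layer type must be threaded through as a separate parameter.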
### Layer factory

Given that most of the variation in which layers are required is driven by dimensionality, layer factories for given dimensions could be provided that return the appropriate layers for a given network.

Is this true? What about:

- activation functions
- normalisation functions
- strides for convolutions
- replacement block architectures for higher-order entities such as convolutional blocks
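One possible shape for such a factory, sketched here with hypothetical names (`get_layer_type`, the role strings, and the registry are all illustrative): keying on a layer role as well as the dimensionality lets normalisation and pooling fit the same scheme as convolutions.

```python
import torch.nn as nn

# Hypothetical registry mapping (role, spatial dimensions) to a layer class.
_LAYERS = {
    ("conv", 1): nn.Conv1d, ("conv", 2): nn.Conv2d, ("conv", 3): nn.Conv3d,
    ("norm", 1): nn.BatchNorm1d, ("norm", 2): nn.BatchNorm2d, ("norm", 3): nn.BatchNorm3d,
    ("pool", 1): nn.MaxPool1d, ("pool", 2): nn.MaxPool2d, ("pool", 3): nn.MaxPool3d,
}

def get_layer_type(role, dims):
    """Return the layer class registered for this role and dimensionality."""
    try:
        return _LAYERS[(role, dims)]
    except KeyError:
        raise ValueError(f"no {role!r} layer registered for {dims}D") from None
```

For example, `get_layer_type("conv", 3)(1, 8, kernel_size=3)` yields an `nn.Conv3d`. Note that this addresses the dimension-keyed cases in the list above, but activation functions, convolution strides, and replacement block architectures are not themselves dimension-keyed and would still need to be configured separately.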