Tool Internals - DigitalMediaProfessionals/dv-sdk GitHub Wiki

Tool script files organization

tool
│  convertor.py            : The main converter entry script.
│
└─cnn_convertor
    │  __init__.py         : To mark this folder as a Python module
    │  caffe_pb2.py        : Automatically generated from caffe.proto using Protobuf compiler tool. DON'T EDIT IT!!
    │  cnn_exception.py    : Define exceptions used in the script.
    │  cnn_layer.py        : Define the internal Layer and Network data structures and their operations.
    │  cnn_parser.py       : Parser frontend; calls the Caffe parser or the Keras parser depending on the input model.
    │  parser_caffe.py     : Parser to parse Caffe model to tool internal representation.
    │  parser_keras.py     : Parser to parse Keras model to tool internal representation.
    │  fpga_layer.py       : Output the FPGA network configuration as C++ source code and do weight packing.
    └─ cnn_docgen.py       : Generate the doxygen configuration used as input to the doxygen tool.

Tool processing flow

  1. Parse the input .ini configs. Implemented in convertor.py

  2. Parse the input network. Calls cnn_parser.parse_network in cnn_parser.py

    1. For a Caffe network, call parser_caffe.parse_caffe_def in parser_caffe.py.

      For a Keras network, call parser_keras.parse_keras_network in parser_keras.py.

    2. Call network.build_traverse_list in cnn_layer.py to build the traversal list of nodes for the converted network.

    3. Call network.calc_inout_sizes in cnn_layer.py to calculate input/output buffer dimensions of each node.

    4. If the input is a Caffe network, call parser_caffe.parse_caffe_data to attach weights to the nodes. Keras networks don't need this step because the network definition and the weights are in the same file.
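
The traversal-list and size-calculation steps above can be illustrated with a minimal sketch. It is not the code in cnn_layer.py: the dictionary-based network representation and both function bodies are assumptions made for illustration, reusing only the name build_traverse_list from the text. It orders nodes so that every node comes after all of its inputs, and shows the usual convolution output-size formula along one axis.

```python
from collections import deque


def build_traverse_list(inputs_of):
    """Return node names ordered so that every node follows all of its inputs.

    `inputs_of` maps a node name to the list of node names feeding it.
    This representation is hypothetical, not the one used by the tool.
    """
    # Count unresolved inputs per node and record the consumers of each node.
    pending = {name: len(srcs) for name, srcs in inputs_of.items()}
    consumers = {name: [] for name in inputs_of}
    for name, srcs in inputs_of.items():
        for src in srcs:
            consumers[src].append(name)

    # Start from nodes without inputs and release consumers as inputs resolve.
    ready = deque(name for name, count in pending.items() if count == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in consumers[node]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(inputs_of):
        raise ValueError("network contains a cycle")
    return order


def conv_output_size(in_size, kernel, stride=1, pad=0):
    """Output size of a convolution along one axis (floor rounding)."""
    return (in_size + 2 * pad - kernel) // stride + 1


# A small linear network used only as an example.
network = {
    "data": [],
    "conv1": ["data"],
    "pool1": ["conv1"],
    "fc1": ["pool1"],
}
order = build_traverse_list(network)
```

Walking `order` front to back guarantees that the output size of every input node is already known when a node's own size is computed.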

  3. Build the FPGA representation of the network with fpga_layer.FPGANetwork

    It will call FPGANetwork.convert_network to convert the network to FPGA format.

    1. convert_network will try to merge layers. Currently it tries the following rules:

      • Try to merge a Max pooling layer with the previous Convolution layer. They are merged only when both layers use exactly 1 tile. (Tiles is a setting for the FPGA HW. If the input buffer is too large to fit in the internal memory, it has to be divided into more than 1 tile to be processed.)
      • Try to merge simple branches. Only branches with depth <= 2 are considered. A simple branch looks like this (assuming the network flows from left to right):
               ------ B ------
              /               \
            A                  E
              \               /
               --- C --- D ---
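
The simple-branch rule above can be sketched as a small graph check. This is an illustrative assumption about how such a pattern could be detected, not the logic in fpga_layer.py; the successor-dictionary graph format and the names branch_chain and find_simple_branch are hypothetical.

```python
def branch_chain(succ, pred, start, max_depth=2):
    """Follow a chain of single-input, single-output nodes from `start`.

    Returns (chain, join), where `join` is the first node with more than one
    input, or None if the chain forks, ends, or is longer than max_depth.
    """
    chain = []
    node = start
    while len(pred[node]) == 1:
        chain.append(node)
        if len(chain) > max_depth or len(succ[node]) != 1:
            return None
        node = succ[node][0]
    return chain, node


def find_simple_branch(succ, fork, max_depth=2):
    """Return (arms, join) if `fork` starts a simple branch, else None."""
    # Build the predecessor map from the successor map.
    pred = {n: [] for n in succ}
    for n, outs in succ.items():
        for o in outs:
            pred[o].append(n)
    if len(succ[fork]) < 2:
        return None
    arms = []
    joins = set()
    for first in succ[fork]:
        result = branch_chain(succ, pred, first, max_depth)
        if result is None:
            return None
        chain, join = result
        arms.append(chain)
        joins.add(join)
    # All arms must meet again at the same node.
    if len(joins) != 1:
        return None
    return arms, joins.pop()


# The diagram above, written as a successor map.
graph = {"A": ["B", "C"], "B": ["E"], "C": ["D"], "D": ["E"], "E": []}
result = find_simple_branch(graph, "A")
```

For the diagram above, the check reports the two arms `["B"]` and `["C", "D"]` joining at `E`; an arm with three or more nodes is rejected because it exceeds the depth limit.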
      
    2. Then it will call connect_layers, which analyzes the live range of each input/output buffer. After that, it sets the buffer offset of each layer and tries to re-use memory space that is no longer in use.
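
The idea behind live-range analysis and buffer re-use can be shown with a minimal sketch. It is not the implementation of connect_layers: the tuple-based layer list, the function name assign_offsets, and the first-fit placement strategy are all assumptions chosen for illustration. A buffer is treated as live from the layer that writes it until the last layer that reads it, and a new output buffer is placed in the lowest gap that does not overlap any live buffer.

```python
def assign_offsets(layers):
    """Assign a memory offset to the output buffer of every layer.

    `layers` is a list of (name, output_size, input_names) in traversal order
    (a hypothetical format). Returns (offsets, total_size).
    """
    # Live range: from the producing layer to the last layer that reads it.
    last_use = {name: i for i, (name, _, _) in enumerate(layers)}
    for i, (_, _, inputs) in enumerate(layers):
        for src in inputs:
            last_use[src] = max(last_use[src], i)

    offsets = {}
    live = []  # (offset, size, name) of buffers that are still needed
    total = 0
    for i, (name, size, _) in enumerate(layers):
        # Drop buffers whose last reader ran before this layer.
        live = [b for b in live if last_use[b[2]] >= i]
        # First fit: take the lowest gap between live buffers that is big enough.
        offset = 0
        for b_off, b_size, _ in sorted(live):
            if offset + size <= b_off:
                break
            offset = max(offset, b_off + b_size)
        offsets[name] = offset
        live.append((offset, size, name))
        total = max(total, offset + size)
    return offsets, total


# Example layer list with made-up buffer sizes.
layers = [
    ("data", 100, []),
    ("conv1", 200, ["data"]),
    ("pool1", 50, ["conv1"]),
    ("fc1", 10, ["pool1"]),
]
offsets, total = assign_offsets(layers)
```

In this example the `pool1` output lands at offset 0, re-using the space of `data`, which is no longer read after `conv1`. The whole network then fits in 300 units instead of the 360 needed if every buffer had its own space.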

  4. fpga_net.output_network is called to output the FPGA network configurations as C++ source files, and also output packed weights.