Converted network configuration - DigitalMediaProfessionals/dv-sdk GitHub Wiki

The convertor generates two files, name_gen.cpp and name_gen.h, where name is the value defined in the INPUT section of the .ini file, typically the name of the network. Generating the documentation for the generated source code is strongly advised, as it contains more detailed information.

Source code documentation

The documentation is generated with Doxygen. Running the convertor creates a doc folder together with the doxygen configuration files:

Doxyfile: the doxygen configuration file.
pages.dox: Front page of the doxygen documentation, containing a summary of the network conversion.
name.dot: Dot file defining the graph of the converted network; will be included directly in the documentation (front page).

To generate the documentation from a Linux console, run the following commands, assuming doxygen is installed; doxygen must be run from the doc folder:

$ cd CaffeMobileNet/doc
$ doxygen

The generated documentation is available at CaffeMobileNet/doc/html/html/index.html and can be opened in any browser.

Source code of network configuration

The convertor generates the source code of the network configuration, which currently contains the register settings of the convolution and fully connected blocks. A short overview of the generated source code is given below, using CaffeMobileNet as an example.

If one checks the network class definition, for instance in CaffeMobileNet_gen.h, it is defined as follows:

#pragma once

#include "dmp_network.h"

class CCaffeMobileNet : public CDMP_Network
{...}

The class CDMP_Network, defined in application/common/include/dmp_network.h, acts as a network layer utility: it hides the memory allocations and register writes, as well as executing the network, setting the input, and reading the output. See Network class interface for the detailed documentation.

The CCaffeMobileNet class defines only the methods that are relevant to this particular network. The complete class definition is:

class CCaffeMobileNet : public CDMP_Network {
 private:
  void Layer_0();
  void Layer_1();
  //...
  void Layer_28();
  void Layer_29();

 public:
  virtual bool Initialize();
  CCaffeMobileNet();
  virtual ~CCaffeMobileNet();
};

The method Initialize() initializes the network and should be called after the network object is created.

The Layer_*() methods define the layer configurations. For example, the first layer is defined as follows in CaffeMobileNet_gen.cpp:

//Layer_0: Convolution Layer
//  ->: conv1
//  ->: conv1/bn
//  ->: conv1/scale
//  ->: relu1
void CCaffeMobileNet::Layer_0() {
  dmp_dv_cmdraw_conv_v0& conf = get_layer(0).conv_conf;
  conf.header.size = sizeof(conf);
  conf.header.device_type = DMP_DV_DEV_CONV;
  conf.header.version = 0;
  // Topo: 00000000000000000000000000000001
  conf.topo = 0x1;  // [31:0] Output Destination of each run, 0 = UBUF, 1 = EXTMEM

  // Input Configuration:
  conf.w = 224;  // Input Width
  conf.h = 224;  // Input Height
  conf.z = 1;  // Input Depth
  conf.c = 3;  // Input Channels
  conf.input_buf.mem = io_mem_;
  conf.input_buf.offs = 0;

  // Output Configuration:
  conf.output_buf.mem = io_mem_;
  conf.output_buf.offs = 802816;

  conf.eltwise_buf.mem = NULL;
  conf.eltwise_buf.offs = 0;  // Input byte address for elementwise add (0 = UBUF Input Buffer)
  conf.output_mode = 0;  // 0 = concat, 1 = eltwise add

  // Runs Configuration:
  // ->1 run(s)
  //--------------------------------------------------
  //RUN : 0
  //--------------------------------------------------
  //->: conv1
  //->: conv1/bn
  //->: conv1/scale
  //->: relu1
  conf.run[0].m = 32;  // Output Channels
  conf.run[0].conv_enable = 1;  // 1 = Enabled, 0 = Disabled
  conf.run[0].p = 0x3;  // Filter Width and Height
  conf.run[0].pz = 1;  // Filter Depth
  conf.run[0].weight_buf.mem = weights_mem_;
  conf.run[0].weight_buf.offs = 0;
  conf.run[0].weight_fmt = 3;  // Weight format (0 = random access blocks, 1 = compact stream, 3 = 8-bit quantized stream)
  conf.run[0].conv_pad = 0x1010101;  // bits [7:0] = left padding, bits [15:8] = right padding, bits [23:16] = top padding, bits [31:24]$
  conf.run[0].conv_stride = 0x202;  // bits [7:0] = X stride, bits [15:8] = Y stride
  conf.run[0].conv_dilation = 0x0;  // bits [7:0] = X dilation, bits [15:8] = Y dilation
  conf.run[0].pool_enable = 0;  // 0 = disabled, 1 = max pooling, 2 = average pooling
  conf.run[0].pool_size = 0x0;  // bits [7:0] = width, bits [15:8] = height
  conf.run[0].pool_stride = 0x101;  // bits [7:0] = X stride, bits [15:8] = Y stride
  conf.run[0].pool_pad = 0x0;  // bits [7:0] = left padding, bits [15:8] = right padding, bits [23:16] = top padding, bits [31:24] = bot$
  conf.run[0].pool_avg_param = 0x0;  // Usually set to 1/pool_size^2 in FP16 format when using average pooling (average pooling assumes $
  conf.run[0].actfunc = 2;  // Activation Function: 0 = None, 1 = Tanh, 2 = Leaky ReLU, 3 = Sigmoid, 4 = PReLU, 5 = ELU, 6 = ReLU6
  conf.run[0].actfunc_param = 0x0;  // Leaky ReLU parameter (NOTE: 0x2E66 is 0.1 in FP16)
  conf.run[0].rectifi_en = 0;  // Rectification, i.e. max(0, x) (NOTE: Can be applied after non-ReLU activation function)

  fpga_layer& layer = get_layer(0);
  layer.name = "conv1";
  layer.type = LT_CONV;
  layer.input_offs = 0;
  layer.output_offs = 802816;
  layer.output_size = 802816;
  layer.input_dim[0] = 224;
  layer.input_dim[1] = 224;
  layer.input_dim[2] = 3;
  layer.input_dim_size = 3;
  layer.output_dim[0] = 112;
  layer.output_dim[1] = 112;
  layer.output_dim[2] = 32;
  layer.output_dim_size = 3;
  layer.is_output = false;
  layer.is_f32_output = false;
  layer.is_input_hw_layout = false;
}//end of  Layer_0

The configuration is done by setting the fields of simple data structures. The structure used here, struct dmp_dv_cmdraw_conv_v0, is defined in dv-user-driver/include/dmp_dv_cmdraw_v0.h.

The fpga_layer struct holds information about each layer of the converted network. See Layer data structure for detailed documentation.

To use the generated network in an application, see Converted Network Usage.