Converted network configuration - DigitalMediaProfessionals/dv-sdk GitHub Wiki
The convertor generates two files, name_gen.cpp and name_gen.h, where name is the value defined in the INPUT section of the .ini file, typically the name of the network. Generating the documentation for the generated source code is strongly advised, since it contains more detailed information.
Source code documentation
The documentation is generated with Doxygen. When the convertor is run, it creates a doc folder containing the Doxygen configuration files:
- Doxyfile: the doxygen configuration file.
- pages.dox: Front page of the doxygen documentation, containing a summary of the network conversion.
- name.dot: Dot file defining the graph of the converted network; will be included directly in the documentation (front page).
To generate the documentation from a Windows console, run doxygen from the doc folder (this assumes Doxygen is installed):
$ cd CaffeMobileNet/doc
$ doxygen
The generated documentation is available at CaffeMobileNet/doc/html/html/index.html, which can be opened in any browser.
Source code of network configuration
The convertor generates the source code of the network configuration, which currently contains the register settings of the convolution and fully connected blocks. A short overview of the generated source code is given below, using CaffeMobileNet as an example.
The network class, for instance in CaffeMobileNet_gen.h, is declared as follows:
#pragma once
#include "dmp_network.h"
class CCaffeMobileNet : public CDMP_Network
{...}
The class CDMP_Network, defined in application/common/include/dmp_network.h, acts as a network layer utility. It hides the memory allocations and register writes, and also handles executing the network, setting its input, and reading its output. See Network class interface for the detailed documentation.
The CCaffeMobileNet class only defines the methods that are relevant for this particular network. The complete class definition is:
class CCaffeMobileNet : public CDMP_Network {
private:
void Layer_0();
void Layer_1();
//...
void Layer_28();
void Layer_29();
public:
virtual bool Initialize();
CCaffeMobileNet();
virtual ~CCaffeMobileNet();
};
The method Initialize() initializes the network and should be called after the network object is created.
The Layer_*() methods define the layer configurations, one method per layer. For example, the first layer is defined as follows in CaffeMobileNet_gen.cpp:
//Layer_0: Convolution Layer
// ->: conv1
// ->: conv1/bn
// ->: conv1/scale
// ->: relu1
void CCaffeMobileNet::Layer_0() {
dmp_dv_cmdraw_conv_v0& conf = get_layer(0).conv_conf;
conf.header.size = sizeof(conf);
conf.header.device_type = DMP_DV_DEV_CONV;
conf.header.version = 0;
// Topo: 00000000000000000000000000000001
conf.topo = 0x1; // [31:0] Output Destination of each run, 0 = UBUF, 1 = EXTMEM
// Input Configuration:
conf.w = 224; // Input Width
conf.h = 224; // Input Height
conf.z = 1; // Input Depth
conf.c = 3; // Input Channels
conf.input_buf.mem = io_mem_;
conf.input_buf.offs = 0;
// Output Configuration:
conf.output_buf.mem = io_mem_;
conf.output_buf.offs = 802816;
conf.eltwise_buf.mem = NULL;
conf.eltwise_buf.offs = 0; // Input byte address for elementwise add (0 = UBUF Input Buffer)
conf.output_mode = 0; // 0 = concat, 1 = eltwise add
// Runs Configuration:
// ->1 run(s)
//--------------------------------------------------
//RUN : 0
//--------------------------------------------------
//->: conv1
//->: conv1/bn
//->: conv1/scale
//->: relu1
conf.run[0].m = 32; // Output Channels
conf.run[0].conv_enable = 1; // 1 = Enabled, 0 = Disabled
conf.run[0].p = 0x3; // Filter Width and Height
conf.run[0].pz = 1; // Filter Depth
conf.run[0].weight_buf.mem = weights_mem_;
conf.run[0].weight_buf.offs = 0;
conf.run[0].weight_fmt = 3; // Weight format (0 = random access blocks, 1 = compact stream, 3 = 8-bit quantized stream)
conf.run[0].conv_pad = 0x1010101; // bits [7:0] = left padding, bits [15:8] = right padding, bits [23:16] = top padding, bits [31:24]$
conf.run[0].conv_stride = 0x202; // bits [7:0] = X stride, bits [15:8] = Y stride
conf.run[0].conv_dilation = 0x0; // bits [7:0] = X dilation, bits [15:8] = Y dilation
conf.run[0].pool_enable = 0; // 0 = disabled, 1 = max pooling, 2 = average pooling
conf.run[0].pool_size = 0x0; // bits [7:0] = width, bits [15:8] = height
conf.run[0].pool_stride = 0x101; // bits [7:0] = X stride, bits [15:8] = Y stride
conf.run[0].pool_pad = 0x0; // bits [7:0] = left padding, bits [15:8] = right padding, bits [23:16] = top padding, bits [31:24] = bot$
conf.run[0].pool_avg_param = 0x0; // Usually set to 1/pool_size^2 in FP16 format when using average pooling (average pooling assumes $
conf.run[0].actfunc = 2; // Activation Function: 0 = None, 1 = Tanh, 2 = Leaky ReLU, 3 = Sigmoid, 4 = PReLU, 5 = ELU, 6 = ReLU6
conf.run[0].actfunc_param = 0x0; // Leaky ReLU parameter (NOTE: 0x2E66 is 0.1 in FP16)
conf.run[0].rectifi_en = 0; // Rectification, i.e. max(0, x) (NOTE: Can be applied after non-ReLU activation function)
fpga_layer& layer = get_layer(0);
layer.name = "conv1";
layer.type = LT_CONV;
layer.input_offs = 0;
layer.output_offs = 802816;
layer.output_size = 802816;
layer.input_dim[0] = 224;
layer.input_dim[1] = 224;
layer.input_dim[2] = 3;
layer.input_dim_size = 3;
layer.output_dim[0] = 112;
layer.output_dim[1] = 112;
layer.output_dim[2] = 32;
layer.output_dim_size = 3;
layer.is_output = false;
layer.is_f32_output = false;
layer.is_input_hw_layout = false;
}//end of Layer_0
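Several fields in Layer_0 pack multiple 8-bit values into one word, for example conv_pad = 0x1010101 and conv_stride = 0x202. The following is a minimal sketch of how those values can be unpacked and checked against the layer dimensions. The helpers field_byte and conv_out_dim are hypothetical and not part of the SDK. The sketch also assumes that p = 0x3 denotes a 3x3 filter and that the output size follows the usual convolution formula without dilation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the SDK): extracts one 8-bit field from a
// packed register value. index 0 = bits [7:0], index 1 = bits [15:8], etc.
constexpr int field_byte(std::uint32_t packed, unsigned index) {
    return static_cast<int>((packed >> (8u * index)) & 0xFFu);
}

// Usual convolution output size along one axis (assumption: no dilation).
constexpr int conv_out_dim(int in, int pad_lo, int pad_hi, int kernel, int stride) {
    return (in + pad_lo + pad_hi - kernel) / stride + 1;
}

// Values copied from Layer_0.
constexpr std::uint32_t kConvPad = 0x1010101;
constexpr std::uint32_t kConvStride = 0x202;
constexpr int kInputWidth = 224;
constexpr int kFilterSize = 3;  // assumption: p = 0x3 means a 3x3 filter

constexpr int kPadLeft = field_byte(kConvPad, 0);      // bits [7:0]
constexpr int kPadRight = field_byte(kConvPad, 1);     // bits [15:8]
constexpr int kStrideX = field_byte(kConvStride, 0);   // bits [7:0]

// The result agrees with layer.output_dim[0] = 112 in Layer_0.
static_assert(conv_out_dim(kInputWidth, kPadLeft, kPadRight, kFilterSize, kStrideX) == 112,
              "unpacked padding and stride should reproduce the output width");
```

The same unpacking applies to pool_size, pool_stride and pool_pad, whose comments describe the same byte layout.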
The configuration is done by filling in the fields of a plain data structure. For convolution layers this is the struct dmp_dv_cmdraw_conv_v0, defined in dv-user-driver/include/dmp_dv_cmdraw_v0.h.
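Some fields of this struct, such as actfunc_param and pool_avg_param, hold raw FP16 bit patterns; the generated comments note, for instance, that 0x2E66 is 0.1 in FP16. As an illustration, a bit pattern can be decoded as shown below. The function half_bits_to_float is a hypothetical helper, not an SDK function, and it assumes the standard IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of the SDK): decodes an IEEE 754 binary16
// bit pattern into a float.
inline float half_bits_to_float(std::uint16_t bits) {
    const int sign = (bits >> 15) & 0x1;
    const int exponent = (bits >> 10) & 0x1F;
    const int mantissa = bits & 0x3FF;

    float magnitude;
    if (exponent == 0) {
        // Zero or subnormal: mantissa * 2^-24
        magnitude = std::ldexp(static_cast<float>(mantissa), -24);
    } else if (exponent == 31) {
        // Infinity or NaN
        magnitude = mantissa ? std::numeric_limits<float>::quiet_NaN()
                             : std::numeric_limits<float>::infinity();
    } else {
        // Normal: (1 + mantissa / 1024) * 2^(exponent - 15)
        magnitude = std::ldexp(static_cast<float>(1024 + mantissa), exponent - 25);
    }
    return sign ? -magnitude : magnitude;
}
```

With this helper, half_bits_to_float(0x2E66) evaluates to 0.0999755859375, the closest FP16 value to 0.1.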
The fpga_layer struct holds information about each layer of the converted network, such as its name, type, buffer offsets, and input and output dimensions. See Layer data structure for detailed documentation.
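The fpga_layer values in the Layer_0 example are consistent with each other: assuming 2 bytes per element (FP16, which matches is_f32_output = false), the output dimensions 112 x 112 x 32 give exactly the output_size of 802816 bytes. The sketch below shows that check. The function tensor_bytes and the constant kBytesPerElement are hypothetical and used only for illustration; the element size is an inference from the numbers above, not something the SDK documentation states here.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: activations are stored as FP16, i.e. 2 bytes per element.
constexpr std::size_t kBytesPerElement = 2;

// Hypothetical helper (not part of the SDK): size in bytes of a tensor with
// the given dimensions, e.g. the output_dim / output_dim_size pair of a layer.
inline std::size_t tensor_bytes(const int* dim, int dim_size) {
    std::size_t bytes = kBytesPerElement;
    for (int i = 0; i < dim_size; ++i) {
        bytes *= static_cast<std::size_t>(dim[i]);
    }
    return bytes;
}

// Mirrors the Layer_0 values: returns true when the dimensions and the
// recorded output_size agree under the FP16 assumption.
inline bool layer0_output_size_matches() {
    const int output_dim[3] = {112, 112, 32};
    const std::size_t output_size = 802816;
    return tensor_bytes(output_dim, 3) == output_size;
}
```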
To use the actual generated network, see Converted Network Usage.