# CFG Parameters in the different layers (Sudhakar17/darknet GitHub Wiki)
## Image processing `[N x C x H x W]`
`[convolutional]` - convolutional layer
- `batch_normalize=1` - if 1, batch normalization will be used; if 0, it will not (0 by default)
- `filters=64` - number of kernel filters (1 by default)
- `size=3` - kernel size of the filter (1 by default)
- `groups=32` - number of groups for grouped (depth-wise) convolution (1 by default)
- `stride=1` - stride (offset step) of the kernel filter (1 by default)
- `padding=1` - size of padding (0 by default)
- `pad=1` - if 1, `padding = size/2` will be used; if 0, the `padding=` parameter will be used (0 by default)
- `dilation=1` - size of dilation (1 by default)
- `activation=leaky` - activation function after the convolution: `logistic` (by default), `loggy`, `relu`, `elu`, `selu`, `relie`, `plse`, `hardtan`, `lhtan`, `linear`, `ramp`, `leaky`, `tanh`, `stair`, `relu6`, `swish`, `mish`
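As a sketch, a typical `[convolutional]` block in a `.cfg` file might look like this (the values are illustrative, not taken from any particular model):

```ini
# 3x3 convolution with 64 filters, stride 1, batch normalization,
# and "same" padding (pad=1 -> padding = size/2 = 1)
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
```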
`[activation]` - separate activation layer
- `activation=leaky` - activation function: `linear` (by default), `loggy`, `relu`, `elu`, `selu`, `relie`, `plse`, `hardtan`, `lhtan`, `ramp`, `leaky`, `tanh`, `stair`

`[batchnorm]` - separate batch-normalization layer
`[maxpool]` - max-pooling layer (takes the maximum value)
- `size=2` - size of the max-pooling kernel
- `stride=2` - stride (offset step) of the max-pooling kernel
`[avgpool]` - average-pooling layer: input `W x H x C` -> output `1 x 1 x C`
`[shortcut]` - residual connection (ResNet)
- `from=-3,-5` - relative layer numbers; performs element-wise addition of several layers: the previous layer and the layers specified in the `from=` parameter
- `weights_type=per_feature` - weights will be used for the shortcut: `y[i] = w1*layer1[i] + w2*layer2[i] ...`
  - `per_feature` - one weight per layer/feature
  - `per_channel` - one weight per channel
  - `none` - weights will not be used (by default)
- `weights_normalization=softmax` - normalization of the shortcut weights:
  - `softmax` - softmax normalization
  - `relu` - relu normalization
  - `none` - no weights normalization - unbound weights (by default)
- `activation=linear` - activation function after the shortcut/residual connection (linear by default)
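For illustration, a minimal residual block built from two `[convolutional]` layers and a `[shortcut]` (a sketch, assuming the block's input has 64 channels so the shapes match for element-wise addition):

```ini
[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=linear

# add the block input (3 layers back) to the previous layer's output
[shortcut]
from=-3
activation=leaky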
`[upsample]` - upsample layer (increases the `W x H` resolution of the input by duplicating elements)
- `stride=2` - factor for increasing both width and height (`new_w = w*stride`, `new_h = h*stride`)
`[scale_channels]` - scales channels (SE: squeeze-and-excitation blocks, or ASFF: adaptively spatial feature fusion) - multiplies elements of one layer by elements of another layer
- `from=-3` - relative layer number; multiplies all elements of channel `N` from layer `-3` by the single element of channel `N` from the previous layer `-1` (i.e. `for (int i = 0; i < b*c*h*w; ++i) output[i] = from_layer[i] * previous_layer[i/(w*h)];`)
- `scale_wh=0` - SE-layer (previous layer is `1x1xC`); `scale_wh=1` - ASFF-layer (previous layer is `WxHx1`)
- `activation=linear` - activation function after the scale_channels layer (linear by default)
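A sketch of how these pieces could form an SE-style block: `[avgpool]` squeezes the feature map to `1 x 1 x C`, two 1x1 convolutions form the excitation (the filter counts below are illustrative, assuming the incoming feature map has 256 channels), and `[scale_channels]` multiplies the original feature map by the resulting per-channel weights:

```ini
[avgpool]            # squeeze: W x H x C -> 1 x 1 x C

[convolutional]      # reduce channels
filters=16
size=1
activation=leaky

[convolutional]      # restore channels, gate to (0, 1)
filters=256
size=1
activation=logistic

[scale_channels]     # multiply the feature map 4 layers back channel-wise
from=-4
```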
`[sam]` - Spatial Attention Module (SAM) - multiplies elements of one layer by elements of another layer
- `from=-3` - relative layer number (this layer and the previous layer must have the same size `W x H x C`)
`[reorg3d]` - reorg layer (resizes `W x H x C`)
- `stride=2` - if `reverse=0`, the input will be resized to `W/2 x H/2 x C*4`; if `reverse=1`, then to `W*2 x H*2 x C/4` (1 by default)
- `reverse=1` - if 0 (by default), decreases `W x H`; if 1, increases `W x H`
`[reorg]` - OLD reorg layer from Yolo v2 - has incorrect logic (resizes `W x H x C`) - deprecated
- `stride=2` - if `reverse=0`, the input will be resized to `W/2 x H/2 x C*4`; if `reverse=1`, then to `W*2 x H*2 x C/4` (1 by default)
- `reverse=1` - if 0 (by default), decreases `W x H`; if 1, increases `W x H`
`[route]` - concatenation layer: `Concat` for several input layers, or `Identity` for a single input layer
- `layers = -1, 61` - layers that will be concatenated; output size is `W x H x (C_layer_1 + C_layer_2)`
  - if an index is < 0, it is a relative layer number (`-1` means the previous layer)
  - if an index is >= 0, it is an absolute layer number
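For example, the upsample-and-concatenate pattern used in the Yolo v3 head (a sketch; the absolute index `61` assumes the layer numbering of a yolov3-style cfg):

```ini
# jump back to an earlier feature map in the head
[route]
layers = -4

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

# concatenate the upsampled map with absolute layer 61:
# output channels = C of layer -1 + C of layer 61
[route]
layers = -1, 61
```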
`[yolo]` - detection layer for Yolo v3 / v4
- `mask = 3,4,5` - indexes of the `anchors` that are used in this [yolo]-layer
- `anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326` - initial sizes of bounding boxes that will be adjusted
- `num=9` - total number of anchors
- `classes=80` - number of classes of objects that can be detected
- `ignore_thresh = .7` - keeps duplicated detections if `IoU(detect, truth) > ignore_thresh`, which will be fused during NMS (used for training only)
- `truth_thresh = 1` - adjusts duplicated detections if `IoU(detect, truth) > truth_thresh`, which will be fused during NMS (used for training only)
- `jitter=.3` - randomly crops and resizes images, changing the aspect ratio from `x(1 - 2*jitter)` to `x(1 + 2*jitter)` (data-augmentation parameter, used only from the last layer)
- `random=1` - randomly resizes the network every 10 iterations from `1/1.4` to `1.4x` (data-augmentation parameter, used only from the last layer)
- `resize=1.5` - randomly resizes the image in the range `1/1.5x - 1.5x`
- `max=200` - maximum number of objects per image during training
- `counters_per_class=100,10,1000` - number of objects per class in the training dataset, used to compensate for class imbalance
- `label_smooth_eps=0.1` - label smoothing
- `scale_x_y=1.05` - eliminates grid sensitivity
- `iou_thresh=0.2` - uses multiple anchors per object if `IoU(Obj, Anchor) > 0.2`
- `iou_loss=mse` - IoU loss: `mse`, `giou`, `diou`, `ciou`
- `iou_normalizer=0.07` - normalizer for delta-IoU
- `cls_normalizer=1.0` - normalizer for delta-Objectness
- `max_delta=5` - limits the delta for each entry
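A representative `[yolo]` head as a sketch: the preceding 1x1 `[convolutional]` must produce `(classes + 5) * number-of-masks` filters, here `(80 + 5) * 3 = 255` (values follow the common COCO configuration, not a specific file):

```ini
[convolutional]
size=1
stride=1
pad=1
filters=255          # (classes + 5) * number of masks = (80 + 5) * 3
activation=linear

[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=80
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
```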
`[crnn]` - convolutional RNN layer (recurrent)
- `batch_normalize=1` - if 1, batch normalization will be used; if 0, it will not (0 by default)
- `size=1` - convolutional kernel size of the filter (1 by default)
- `pad=0` - if 1, `padding = size/2` will be used; if 0, the `padding=` parameter will be used (0 by default)
- `output = 1024` - number of kernel filters in the one output convolutional layer (1 by default)
- `hidden=1024` - number of kernel filters in the two (input and hidden) convolutional layers (1 by default)
- `activation=leaky` - activation function for each of the 3 convolutional layers in the [crnn]-layer (logistic by default)
`[conv_lstm]` - convolutional LSTM layer (recurrent)
- `batch_normalize=1` - if 1, batch normalization will be used; if 0, it will not (0 by default)
- `size=3` - convolutional kernel size of the filter (1 by default)
- `padding=1` - convolutional size of padding (0 by default)
- `pad=1` - if 1, `padding = size/2` will be used; if 0, the `padding=` parameter will be used (0 by default)
- `stride=1` - convolutional stride (offset step) of the kernel filter (1 by default)
- `dilation=1` - convolutional size of dilation (1 by default)
- `output=256` - number of kernel filters in each of the 8 or 11 convolutional layers (1 by default)
- `groups=4` - number of groups for grouped (depth-wise) convolution (1 by default)
- `state_constrain=512` - constrains LSTM-state values to `[-512; +512]` after each inference (`time_steps*32` by default)
- `peephole=0` - if 1, peephole connections will be used (3 additional conv layers); if 0, they will not (1 by default)
- `bottleneck=0` - if 1, a reduced optimal version of the conv-lstm layer will be used
- `activation=leaky` - activation function for each of the 8 or 11 convolutional layers in the [conv_lstm]-layer (linear by default)
- `lstm_activation=tanh` - activation for G (gate: `g = tanh(wg + ug)`) and C (memory cell: `h = o * tanh(c)`)
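A minimal `[conv_lstm]` block as a sketch, using only the parameters documented above (for recurrent training, `time_steps` must also be set in the `[net]` section):

```ini
# convolutional LSTM for video / sequential feature maps
[conv_lstm]
batch_normalize=1
size=3
pad=1
output=256
peephole=0
activation=leaky
lstm_activation=tanh
```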

## Free-form data processing `[Inputs]`
`[connected]` - fully connected layer
- `output=256` - number of outputs (1 by default), so the number of connections equals `inputs*outputs`
- `activation=leaky` - activation after the layer (logistic by default)
`[dropout]` - dropout layer
- `probability=0.5` - dropout probability - the fraction of inputs that will be zeroed (0.5 = 50% by default)
- `dropblock=1` - use as DropBlock
- `dropblock_size_abs=7` - size of the DropBlock in pixels (7x7)
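The two modes side by side, as a sketch (probabilities are illustrative):

```ini
# plain dropout: zero 50% of individual activations
[dropout]
probability=0.5

# DropBlock variant: zero contiguous 7x7 regions instead of single activations
[dropout]
probability=0.1
dropblock=1
dropblock_size_abs=7
```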
`[softmax]` - SoftMax CE (cross-entropy) layer - categorical cross-entropy for multi-class classification
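A minimal classification tail as a sketch: a `[connected]` layer producing one output per class (1000 here is illustrative, e.g. for ImageNet), followed by `[softmax]`:

```ini
[connected]
output=1000
activation=linear

[softmax]
```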
`[contrastive]` - contrastive loss layer for supervised and unsupervised learning (requires `[net] contrastive=1` and optionally `[net] unsupervised=1`)
- `classes=1000` - number of classes
- `temperature=1.0` - temperature
`[cost]` - cost layer; calculates the (linear) Delta and the (squared) Loss
- `type=sse` - cost type: `sse` (L2), `masked`, `smooth` (smooth-L1) (SSE by default)
`[rnn]` - fully connected RNN layer (recurrent)
- `batch_normalize=1` - if 1, batch normalization will be used; if 0, it will not (0 by default)
- `output = 1024` - number of outputs in the one output connected layer (1 by default)
- `hidden=1024` - number of outputs in the two (input and hidden) connected layers (1 by default)
- `activation=leaky` - activation after the layer (logistic by default)

`[lstm]` - fully connected LSTM layer (recurrent)
- `batch_normalize=1` - if 1, batch normalization will be used; if 0, it will not (0 by default)
- `output = 1024` - number of outputs in all connected layers (1 by default)

`[gru]` - fully connected GRU layer (recurrent)
- `batch_normalize=1` - if 1, batch normalization will be used; if 0, it will not (0 by default)
- `output = 1024` - number of outputs in all connected layers (1 by default)