Code Review —— Relay IR

NNVMv1 vs Relay

  • Q: when we write a pass on the graph IR, what are the benefits of the Relay IR compared to the NNVM graph IR?
  • A: the type information is part of the AST.
  • We support more flexible dtype and shape inference.
  • We have explicit variable bindings, which is good for code generation and scoping issues.
  • We support abstraction (i.e. subgraphs/functions), and we have support for control operators baked into the IR.
  • We support recursion, so loops can be encoded in the IR.
  • We support shape polymorphism/dynamic shapes in the IR.
  • We also have a new AD algorithm (not yet merged) which can compute Nth-order gradients over things like map, fold, etc.
  • We also have support for algebraic data types coming up, allowing us to define networks over lists, trees, etc.
  • We also unified the attribute/parameter system from NNVM into TVM. We have well-defined semantics for the entire IR, in contrast to NNVM, which had a generic IR and semantics that were given to NNVM graphs by things like the NNVM compiler/executor.
  • The IR also supports inline constants, which can be arbitrary tvm.nd.NDArray values. This means we don't need specialized operators for the scalar/non-scalar/generic cases; for example, we can just use the Relay interpreter to do constant evaluation. A small sketch follows below.
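
To make a couple of these points concrete (explicit variable bindings, inline NDArray constants, and types attached to the AST), here is a minimal sketch using the tvm.relay Python API; the relay.ir_pass.infer_type entry point is the older pass API and is an assumption about the version in use.

    import numpy as np
    import tvm
    from tvm import relay

    x = relay.var("x", shape=(2, 2), dtype="float32")
    # An inline constant wrapping an arbitrary tvm.nd.NDArray.
    c = relay.Constant(tvm.nd.array(np.ones((2, 2), dtype="float32")))
    v = relay.var("v")
    # let v = x + c in v * v  -- an explicit binding in the AST
    expr = relay.Let(v, relay.add(x, c), relay.multiply(v, v))
    func = relay.Function([x], expr)
    # Type inference annotates every sub-expression of the AST.
    print(relay.ir_pass.infer_type(func))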

Relay IR

    1. flexible dtype and shape inference
    2. explicit variable binding
    3. abstraction (i.e. subgraphs/functions)
    4. control operators baked into the IR
    5. recursion, so loops can be encoded in the IR
    6. shape polymorphism/dynamic shapes
    7. Nth-order gradients over things like map, fold, etc.
    8. upcoming algebraic data types, allowing networks over lists, trees
    9. inline constants

Relay RFC

https://github.com/dmlc/tvm/issues/1673

Relay introduction

https://github.com/dmlc/tvm/pull/2324/files

Relay Code

Python part

https://github.com/dmlc/tvm/tree/master/python/tvm/relay

Header files

https://github.com/dmlc/tvm/tree/master/include/tvm/relay

C++ files

https://github.com/dmlc/tvm/tree/master/src/relay

Test cases

https://github.com/dmlc/tvm/tree/master/tests/python/relay

Front end

  • How is the front end implemented? NNVM has multiple front ends that support different frameworks and first convert into NNVM's own IR; how does Relay handle this?
  • The front-end code is in the Python part. It still follows the NNVM approach: first convert to Relay's own ops (sym), then build the IR with these ops.

Importing models from other frameworks

    from tvm import relay
    from tvm.contrib import graph_runtime

    def get_tvm_output(symbol, x, args, auxs, target, ctx,
                       gluon_impl=False, dtype='float32'):
        shape_dict = {"data": x.shape}
        # Convert the MXNet symbol (or Gluon block) into a Relay expression.
        if gluon_impl:
            new_sym, params = relay.frontend.from_mxnet(symbol, shape_dict)
        else:
            new_sym, params = relay.frontend.from_mxnet(symbol,
                                                        shape_dict,
                                                        arg_params=args,
                                                        aux_params=auxs)
        # Compile the Relay expression to a graph, kernel library and params.
        with relay.build_config(opt_level=3):
            graph, lib, params = relay.build(new_sym, target, params=params)
        # Run the compiled module on the target context.
        m = graph_runtime.create(graph, lib, ctx)
        m.set_input("data", x.astype(dtype))
        m.set_input(**params)
        m.run()
        return m.get_output(0).asnumpy()

The MXNet front end is currently supported; PRs for the ONNX and TF front ends already exist and are waiting to be merged.

Loading directly from a proto is also supported:

    class GraphProto(object):
        """A helper class for handling nnvm graph copying from pb2.GraphProto.
        Definition: https://github.com/onnx/onnx/blob/master/onnx/onnx.proto
        """

        def __init__(self):
            self._nodes = {}
            self._params = {}
            self._renames = {}
            self._num_input = 0
            self._num_param = 0

        def from_onnx(self, graph, opset):
            """Construct nnvm nodes from onnx graph.
            The inputs from onnx graph is vague, only providing "1", "2"...
            For convenience, we rename the `real` input names to "input_0",
            "input_1"... And renaming parameters to "param_0", "param_1"...

Front-end support for Relay syntax

Core op support comes from TVM:

# _convert_map defines maps of name to converter functor(callable)
# for 1 to 1 mapping, use Renamer if nothing but name is different
# use AttrCvt if attributes need to be converted
# for 1 to N mapping(composed), use custom callable functions
# for N to 1 mapping, currently not supported(?)
def _get_convert_map(opset):
    return {
        # defs/experimental
        'Identity': Renamer('copy'),
        # 'Affine'
        'ThresholdedRelu': ThresholdedRelu.get_converter(opset),
        'ScaledTanh': ScaledTanh.get_converter(opset),
        'ParametricSoftplus': ParametricSoftPlus.get_converter(opset),
        'ConstantFill': ConstantFill.get_converter(opset),
        # 'GivenTensorFill'
        'FC': AttrCvt('dense', ignores=['axis', 'axis_w']),
        'Scale': Scale.get_converter(opset),
        # 'GRUUnit'
        # 'ATen'
        'ImageScaler': ImageScaler.get_converter(opset),
        # 'MeanVarianceNormalization'
        # 'Crop'
        # 'Embedding'
        'Upsample' : Upsample.get_converter(opset),
        'SpatialBN': BatchNorm.get_converter(opset),

        # defs/generator
        # 'Constant' # Implemented
        # 'RandomUniform'
        # 'RandomNormal'
        # 'RandomUniformLike'
        # 'RandomNormalLike'

        # defs/logical

        # defs/math
        'Add': Add.get_converter(opset),
        'Sub': Sub.get_converter(opset),
        'Mul': Mul.get_converter(opset),
        'Div': Div.get_converter(opset),
        'Neg': Renamer('negative'),
        'Abs': Absolute.get_converter(opset),
        'Reciprocal': Reciprocal.get_converter(opset),
        'Floor': Renamer('floor'),
        'Ceil': Renamer('ceil'),
        'Sqrt': Renamer('sqrt'),
        'Relu': Renamer('relu'),
        'LeakyRelu': Renamer('leaky_relu'),
        'Selu': Selu.get_converter(opset),
        'Elu': Elu.get_converter(opset),
        'Exp': Renamer('exp'),
        'Log': Renamer('log'),
        'Tanh': Renamer('tanh'),
        'Pow': Renamer('broadcast_pow'),
        'PRelu': Prelu.get_converter(opset),
        'Sigmoid': Renamer('sigmoid'),
        'HardSigmoid': HardSigmoid.get_converter(opset),
        'Max': Maximum.get_converter(opset),
        'Min': Minimum.get_converter(opset),
        'Sum': Sum.get_converter(opset),
        'Mean': Mean.get_converter(opset),
        'Clip': AttrCvt('clip', transforms={'min': 'a_min', 'max': 'a_max'}),
        # softmax default axis is different in onnx
        'Softmax': Softmax.get_converter(opset),
        'LogSoftmax': AttrCvt('log_softmax', {'axis': ('axis', 1)}),
        # 'Hardmax'
        'Softsign': Softsign.get_converter(opset),
        'SoftPlus': SoftPlus.get_converter(opset),
        'Gemm': Gemm.get_converter(opset),
        'MatMul': Renamer('matmul'),

        # defs/nn
        'AveragePool': AveragePool.get_converter(opset),
        'MaxPool': MaxPool.get_converter(opset),
        'Conv': Conv.get_converter(opset),
        'ConvTranspose': ConvTranspose.get_converter(opset),
        'GlobalAveragePool': Renamer('global_avg_pool2d'),
        'GlobalMaxPool': Renamer('global_max_pool2d'),
        'BatchNormalization': BatchNorm.get_converter(opset),
        # 'InstanceNormalization'
        # 'LpNormalization'
        'Dropout': AttrCvt('dropout', {'ratio': 'rate'}, ignores=['is_test']),
        'Flatten': Renamer('flatten'),
        'LRN': LRN.get_converter(opset),

        # defs/reduction
        'ReduceMax': AttrCvt('max', {'axes': 'axis'}),
        'ReduceMin': AttrCvt('min', {'axes': 'axis'}),
        'ReduceSum': AttrCvt('sum', {'axes': 'axis'}),
        'ReduceMean': AttrCvt('mean', {'axes': 'axis'}),
        # 'ReduceProd'
        # 'ReduceLogSumExp'
        'ArgMax': ArgMax.get_converter(opset),
        'ArgMin': ArgMin.get_converter(opset),

        # defs/tensor
        'Cast': Cast.get_converter(opset),
        'Reshape': Reshape.get_converter(opset),
        'Concat': Renamer('concatenate'),
        'Split': Split.get_converter(opset),
        'Slice': Slice.get_converter(opset),
        'Transpose': AttrCvt('transpose', {'perm': 'axes'}),
        'Gather': Gather.get_converter(opset),
        'Squeeze': AttrCvt('squeeze', {'axes': 'axis'}),
        'Unsqueeze': Unsqueeze.get_converter(opset),
        'Pad': Pad.get_converter(opset),
        'Shape': Shape.get_converter(opset),
    }

IR definitions

class ExprNode : public RelayNode
class ConstantNode : public ExprNode
class TupleNode : public ExprNode
class VarNode : public ExprNode
class GlobalVarNode : public ExprNode
class FunctionNode : public ExprNode
class CallNode : public ExprNode
class LetNode : public ExprNode
class IfNode : public ExprNode
class TupleGetItemNode : public ExprNode
class TempExprNode : public ExprNode
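
A minimal sketch (assuming the tvm.relay Python API): each constructor below produces an instance of one of the ExprNode subclasses listed above.

    from tvm import relay

    x = relay.var("x", shape=(3,))        # VarNode
    one = relay.const(1.0)                # ConstantNode
    call = relay.add(x, one)              # CallNode (a call to the "add" op)
    tup = relay.Tuple([x, call])          # TupleNode
    item = relay.TupleGetItem(tup, 1)     # TupleGetItemNode
    func = relay.Function([x], item)      # FunctionNode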

Back-end compilation

Operator registration

  • op.cc
  • op.h

Supported passes

alter_op_layout.cc
canonicalize_ops.cc
combine_parallel_conv2d.cc
dead_code.cc
device_annotation.cc
expr_subst.cc
fold_constant.cc
fold_scale_axis.cc
forward_rewrite.cc
fuse_ops.cc
kind_check.cc
let_list.h
pass_util.h
pattern_util.h
simplify_inference.cc
type_infer.cc
type_solver.cc
util.cc
well_formed.cc
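
A minimal sketch of running one of the passes listed above (constant folding); the relay.ir_pass.fold_constant binding is the older Python pass API and is an assumption about the version in use.

    from tvm import relay

    x = relay.var("x", shape=(2,), dtype="float32")
    c = relay.multiply(relay.const(2.0), relay.const(3.0))  # foldable subexpression
    expr = relay.add(x, c)
    folded = relay.ir_pass.fold_constant(expr)
    print(folded)  # the multiply has been evaluated into an inline constant 6.0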

Type IR definitions

In this paradigm, knowing that all values are tensors allows compiler writers to design and implement optimizations over the AST in a uniform manner. For example, if a user of Relay wants to write an optimization that lifts a computation up one dimension, they can uniformly add a dimension without needing to handle scalar cases. This is very useful for optimizations that change dimensions (e.g. auto-batching, spatial packing, or layout changes).
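
A minimal sketch (assuming the tvm.relay Python API): a "scalar" is just a rank-0 tensor, so the same broadcasting add handles both cases uniformly.

    from tvm import relay

    s = relay.var("s", shape=(), dtype="float32")          # rank-0 tensor ("scalar")
    x = relay.var("x", shape=(8, 5, 5), dtype="float32")   # rank-3 tensor
    f = relay.Function([s, x], relay.add(x, s))            # broadcasting add, no scalar special case
    print(f.params[0].type_annotation)                     # TensorType([], float32)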

Traverse Function(Graph) in Relay

For the NNVM index graph, we can easily fetch the input index, attrs, ..., of each node. The way I can figure out to do a similar task in Relay is to use ir_pass.post_order_visit. I have two questions:

I can create a dictionary to record index information for each node. How can I fetch attrs for a CallNode? What is TupleGetItemNode for? In the resnet50 expr, all TupleGetItemNodes just pick index 0. If a TupleGetItemNode is an input node of a CallNode, can I replace this TupleGetItemNode with the node it selects? Or is there an easier way to do this?

For normal traversal, we can use post-order visit. Since Relay's Node is part of TVM's node system, you can access all the fields, so you can directly do call_node.attrs to get the attributes, and attrs.field_name to get the corresponding value.

TupleGetItemNode is made for calls with multiple outputs; this is one difference between NNVMv1 and Relay. NNVMv1 assumes every op has multiple outputs. Every op in Relay has a single output, but that output can be a tuple, and we can then use TupleGetItem to get the specific field of the tuple.
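
A minimal sketch of the traversal described above, assuming the older relay.ir_pass.post_order_visit binding: collect every CallNode and read its attrs.

    from tvm import relay
    from tvm.relay import ir_pass

    x = relay.var("x", shape=(1, 3, 224, 224))
    w = relay.var("w", shape=(16, 3, 3, 3))
    y = relay.nn.conv2d(x, w, kernel_size=(3, 3), channels=16, padding=(1, 1))

    calls = []

    def fvisit(node):
        # fvisit is invoked once per sub-expression, children before parents
        if isinstance(node, relay.expr.Call):
            calls.append(node)

    ir_pass.post_order_visit(y, fvisit)
    for call in calls:
        print(call.op.name, call.attrs.kernel_size)  # e.g. nn.conv2d [3, 3]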

How does Relay handle while loop

  • https://discuss.tvm.ai/t/how-does-relay-handle-while-loop/832/5

  • Our previous front-end layer supported translating forms of loops directly into the IR, and I plan on adding some helpers to the IR builder for building loops.

  • The only thing that we can't support currently is type-changing updates to a variable, i.e. if you start with a variable of type x : Tensor<(10, 5, 5), f32> you can't then assign y : Tensor<(5), f32> to it.

  • EDIT:

  • I noticed that the cases are flipped in the example program (which is only used for testing type checking):

       def f(n: i32, data: f32) -> f32 {
          if (n == 0) {
              return data;
          } else {
              return f(n - 1, log(data));
          }
       }
       # Main
       f(2, 10000);
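
As the discussion above suggests, a while loop is encoded as a tail-recursive function. A minimal sketch mirroring the example program, using the tvm.relay Python API; the relay.Module constructor name is the older spelling and is an assumption about the version in use.

    from tvm import relay

    mod = relay.Module()
    f = relay.GlobalVar("f")
    n = relay.var("n", shape=(), dtype="int32")
    data = relay.var("data", shape=(), dtype="float32")

    # if n == 0: data else: f(n - 1, log(data))
    cond = relay.equal(n, relay.const(0, "int32"))
    body = relay.If(cond,
                    data,
                    f(relay.subtract(n, relay.const(1, "int32")), relay.log(data)))
    mod[f] = relay.Function([n, data], body)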

Community status of training and automatic differentiation

We have a new version of it with a more flexible method for AD over Relay programs, but the branch needs to be polished before it can be upstreamed to the TVM repository. We have been focused on inference but could prioritize upstreaming it. Let me sync with the other people working on Relay and get back to you.

Auto Diff
