Code Review —— Relay IR

NNVMv1 vs Relay

  • Q: when we write a pass on the graph IR, what are the benefits of the Relay IR compared to the NNVM graph IR?
  • A: the type information is part of the AST.
  • We support more flexible dtype and shape inference.
  • We have explicit variable bindings, which is good for code generation and scoping issues.
  • We support abstraction (i.e. subgraphs/functions), and we have support for control operators baked into the IR.
  • We support recursion, so loops can be encoded in the IR.
  • We support shape polymorphism/dynamic shapes in the IR.
  • We also have a new AD algorithm (not yet merged) which can compute Nth-order gradients over things like map, fold, etc.
  • We also have support for algebraic data types coming up, allowing us to define networks over lists, trees, etc.
  • We also unified the attribute/parameter system from NNVM into TVM. We have well-defined semantics for the entire IR, in contrast to NNVM, which had a generic IR and semantics that were given to NNVM graphs by things like the NNVM compiler/executor.
  • The IR also supports inline constants, which can be arbitrary tvm.nd.NDArray values. This means we don't need specialized operators for the scalar/non-scalar/generic cases; for example, we can just use the Relay interpreter to do constant evaluation. A small sketch follows below.
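
To make a couple of these points concrete (explicit variable bindings, inline NDArray constants, and types attached to the AST), here is a minimal sketch using the tvm.relay Python API; the relay.ir_pass.infer_type entry point is the older pass API and is an assumption about the version in use.

    import numpy as np
    import tvm
    from tvm import relay

    x = relay.var("x", shape=(2, 2), dtype="float32")
    # An inline constant wrapping an arbitrary tvm.nd.NDArray.
    c = relay.Constant(tvm.nd.array(np.ones((2, 2), dtype="float32")))
    v = relay.var("v")
    # let v = x + c in v * v  -- an explicit binding in the AST
    expr = relay.Let(v, relay.add(x, c), relay.multiply(v, v))
    func = relay.Function([x], expr)
    # Type inference annotates every sub-expression of the AST.
    print(relay.ir_pass.infer_type(func))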

Relay IR

    1. flexible dtype and shape inference
    2. explicit variable binding
    3. abstraction (i.e. subgraphs/functions)
    4. control operators baked into the IR
    5. recursion, so loops can be encoded in the IR
    6. shape polymorphism/dynamic shapes
    7. Nth-order gradients over things like map, fold, etc.
    8. upcoming algebraic data types, allowing networks over lists, trees
    9. inline constants

Relay RFC

https://github.com/dmlc/tvm/issues/1673

Relay introduction

https://github.com/dmlc/tvm/pull/2324/files

Relay Code

Python part

https://github.com/dmlc/tvm/tree/master/python/tvm/relay

Header files

https://github.com/dmlc/tvm/tree/master/include/tvm/relay

C++ files

https://github.com/dmlc/tvm/tree/master/src/relay

Test cases

https://github.com/dmlc/tvm/tree/master/tests/python/relay

Front end

  • How is the front end implemented? NNVM has multiple front ends that support different frameworks and first convert into NNVM's own IR; how does Relay handle this?
  • The front-end code is in the Python part. It still follows the NNVM approach: first convert to Relay's own ops (sym), then build the IR with these ops.

Importing models from other frameworks

    from tvm import relay
    from tvm.contrib import graph_runtime

    def get_tvm_output(symbol, x, args, auxs, target, ctx,
                       gluon_impl=False, dtype='float32'):
        shape_dict = {"data": x.shape}
        # Convert the MXNet symbol (or Gluon block) into a Relay expression.
        if gluon_impl:
            new_sym, params = relay.frontend.from_mxnet(symbol, shape_dict)
        else:
            new_sym, params = relay.frontend.from_mxnet(symbol,
                                                        shape_dict,
                                                        arg_params=args,
                                                        aux_params=auxs)
        # Compile the Relay expression to a graph, kernel library and params.
        with relay.build_config(opt_level=3):
            graph, lib, params = relay.build(new_sym, target, params=params)
        # Run the compiled module on the target context.
        m = graph_runtime.create(graph, lib, ctx)
        m.set_input("data", x.astype(dtype))
        m.set_input(**params)
        m.run()
        return m.get_output(0).asnumpy()

The MXNet front end is currently supported; PRs for the ONNX and TF front ends already exist and are waiting to be merged.

Loading directly from a proto is also supported:

    class GraphProto(object):
        """A helper class for handling nnvm graph copying from pb2.GraphProto.
        Definition: https://github.com/onnx/onnx/blob/master/onnx/onnx.proto
        """

        def __init__(self):
            self._nodes = {}
            self._params = {}
            self._renames = {}
            self._num_input = 0
            self._num_param = 0

        def from_onnx(self, graph, opset):
            """Construct nnvm nodes from onnx graph.
            The inputs from onnx graph is vague, only providing "1", "2"...
            For convenience, we rename the `real` input names to "input_0",
            "input_1"... And renaming parameters to "param_0", "param_1"...

Front-end support for Relay syntax

Core op support comes from TVM:

# _convert_map defines maps of name to converter functor(callable)
# for 1 to 1 mapping, use Renamer if nothing but name is different
# use AttrCvt if attributes need to be converted
# for 1 to N mapping(composed), use custom callable functions
# for N to 1 mapping, currently not supported(?)
def _get_convert_map(opset):
    return {
        # defs/experimental
        'Identity': Renamer('copy'),
        # 'Affine'
        'ThresholdedRelu': ThresholdedRelu.get_converter(opset),
        'ScaledTanh': ScaledTanh.get_converter(opset),
        'ParametricSoftplus': ParametricSoftPlus.get_converter(opset),
        'ConstantFill': ConstantFill.get_converter(opset),
        # 'GivenTensorFill'
        'FC': AttrCvt('dense', ignores=['axis', 'axis_w']),
        'Scale': Scale.get_converter(opset),
        # 'GRUUnit'
        # 'ATen'
        'ImageScaler': ImageScaler.get_converter(opset),
        # 'MeanVarianceNormalization'
        # 'Crop'
        # 'Embedding'
        'Upsample' : Upsample.get_converter(opset),
        'SpatialBN': BatchNorm.get_converter(opset),

        # defs/generator
        # 'Constant' # Implemented
        # 'RandomUniform'
        # 'RandomNormal'
        # 'RandomUniformLike'
        # 'RandomNormalLike'

        # defs/logical

        # defs/math
        'Add': Add.get_converter(opset),
        'Sub': Sub.get_converter(opset),
        'Mul': Mul.get_converter(opset),
        'Div': Div.get_converter(opset),
        'Neg': Renamer('negative'),
        'Abs': Absolute.get_converter(opset),
        'Reciprocal': Reciprocal.get_converter(opset),
        'Floor': Renamer('floor'),
        'Ceil': Renamer('ceil'),
        'Sqrt': Renamer('sqrt'),
        'Relu': Renamer('relu'),
        'LeakyRelu': Renamer('leaky_relu'),
        'Selu': Selu.get_converter(opset),
        'Elu': Elu.get_converter(opset),
        'Exp': Renamer('exp'),
        'Log': Renamer('log'),
        'Tanh': Renamer('tanh'),
        'Pow': Renamer('broadcast_pow'),
        'PRelu': Prelu.get_converter(opset),
        'Sigmoid': Renamer('sigmoid'),
        'HardSigmoid': HardSigmoid.get_converter(opset),
        'Max': Maximum.get_converter(opset),
        'Min': Minimum.get_converter(opset),
        'Sum': Sum.get_converter(opset),
        'Mean': Mean.get_converter(opset),
        'Clip': AttrCvt('clip', transforms={'min': 'a_min', 'max': 'a_max'}),
        # softmax default axis is different in onnx
        'Softmax': Softmax.get_converter(opset),
        'LogSoftmax': AttrCvt('log_softmax', {'axis': ('axis', 1)}),
        # 'Hardmax'
        'Softsign': Softsign.get_converter(opset),
        'SoftPlus': SoftPlus.get_converter(opset),
        'Gemm': Gemm.get_converter(opset),
        'MatMul': Renamer('matmul'),

        # defs/nn
        'AveragePool': AveragePool.get_converter(opset),
        'MaxPool': MaxPool.get_converter(opset),
        'Conv': Conv.get_converter(opset),
        'ConvTranspose': ConvTranspose.get_converter(opset),
        'GlobalAveragePool': Renamer('global_avg_pool2d'),
        'GlobalMaxPool': Renamer('global_max_pool2d'),
        'BatchNormalization': BatchNorm.get_converter(opset),
        # 'InstanceNormalization'
        # 'LpNormalization'
        'Dropout': AttrCvt('dropout', {'ratio': 'rate'}, ignores=['is_test']),
        'Flatten': Renamer('flatten'),
        'LRN': LRN.get_converter(opset),

        # defs/reduction
        'ReduceMax': AttrCvt('max', {'axes': 'axis'}),
        'ReduceMin': AttrCvt('min', {'axes': 'axis'}),
        'ReduceSum': AttrCvt('sum', {'axes': 'axis'}),
        'ReduceMean': AttrCvt('mean', {'axes': 'axis'}),
        # 'ReduceProd'
        # 'ReduceLogSumExp'
        'ArgMax': ArgMax.get_converter(opset),
        'ArgMin': ArgMin.get_converter(opset),

        # defs/tensor
        'Cast': Cast.get_converter(opset),
        'Reshape': Reshape.get_converter(opset),
        'Concat': Renamer('concatenate'),
        'Split': Split.get_converter(opset),
        'Slice': Slice.get_converter(opset),
        'Transpose': AttrCvt('transpose', {'perm': 'axes'}),
        'Gather': Gather.get_converter(opset),
        'Squeeze': AttrCvt('squeeze', {'axes': 'axis'}),
        'Unsqueeze': Unsqueeze.get_converter(opset),
        'Pad': Pad.get_converter(opset),
        'Shape': Shape.get_converter(opset),
    }

IR definitions

class ExprNode : public RelayNode
class ConstantNode : public ExprNode
class TupleNode : public ExprNode
class VarNode : public ExprNode
class GlobalVarNode : public ExprNode
class FunctionNode : public ExprNode
class CallNode : public ExprNode
class LetNode : public ExprNode
class IfNode : public ExprNode
class TupleGetItemNode : public ExprNode
class TempExprNode : public ExprNode
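
A minimal sketch (assuming the tvm.relay Python API): each constructor below produces an instance of one of the ExprNode subclasses listed above.

    from tvm import relay

    x = relay.var("x", shape=(3,))        # VarNode
    one = relay.const(1.0)                # ConstantNode
    call = relay.add(x, one)              # CallNode (a call to the "add" op)
    tup = relay.Tuple([x, call])          # TupleNode
    item = relay.TupleGetItem(tup, 1)     # TupleGetItemNode
    func = relay.Function([x], item)      # FunctionNode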

Back-end compilation

Operator registration

  • op.cc
  • op.h

Supported passes

alter_op_layout.cc
canonicalize_ops.cc
combine_parallel_conv2d.cc
dead_code.cc
device_annotation.cc
expr_subst.cc
fold_constant.cc
fold_scale_axis.cc
forward_rewrite.cc
fuse_ops.cc
kind_check.cc
let_list.h
pass_util.h
pattern_util.h
simplify_inference.cc
type_infer.cc
type_solver.cc
util.cc
well_formed.cc
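
A minimal sketch of running one of the passes listed above (constant folding); the relay.ir_pass.fold_constant binding is the older Python pass API and is an assumption about the version in use.

    from tvm import relay

    x = relay.var("x", shape=(2,), dtype="float32")
    c = relay.multiply(relay.const(2.0), relay.const(3.0))  # foldable subexpression
    expr = relay.add(x, c)
    folded = relay.ir_pass.fold_constant(expr)
    print(folded)  # the multiply has been evaluated into an inline constant 6.0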

Type IR definitions

In this paradigm, knowing that all values are tensors allows compiler writers to design and implement optimizations over the AST in a uniform manner. For example, if a user of Relay wants to write an optimization that lifts a computation up one dimension, they can uniformly add a dimension without needing to handle scalar cases. This is very useful for optimizations that change dimensions (e.g. auto-batching, spatial packing, or layout changes).
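
A minimal sketch (assuming the tvm.relay Python API): a "scalar" is just a rank-0 tensor, so the same broadcasting add handles both cases uniformly.

    from tvm import relay

    s = relay.var("s", shape=(), dtype="float32")          # rank-0 tensor ("scalar")
    x = relay.var("x", shape=(8, 5, 5), dtype="float32")   # rank-3 tensor
    f = relay.Function([s, x], relay.add(x, s))            # broadcasting add, no scalar special case
    print(f.params[0].type_annotation)                     # TensorType([], float32)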

Traverse Function(Graph) in Relay

For the NNVM index graph, we can easily fetch the input index, attrs, ..., of each node. The way I can figure out to do a similar task in Relay is to use ir_pass.post_order_visit. I have two questions:

I can create a dictionary to record index information for each node. How can I fetch attrs for a CallNode? What is TupleGetItemNode for? In the resnet50 expr, all TupleGetItemNodes just pick index 0. If a TupleGetItemNode is an input node of a CallNode, can I replace this TupleGetItemNode with the node it selects? Or is there an easier way to do this?

For normal traversal, we can use post-order visit. Since Relay's Node is part of TVM's node system, you can access all the fields, so you can directly do call_node.attrs to get the attributes, and attrs.field_name to get the corresponding value.

TupleGetItemNode is made for calls with multiple outputs; this is one difference between NNVMv1 and Relay. NNVMv1 assumes every op has multiple outputs. Every op in Relay has a single output, but that output can be a tuple, and we can then use TupleGetItem to get the specific field of the tuple.
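
A minimal sketch of the traversal described above, assuming the older relay.ir_pass.post_order_visit binding: collect every CallNode and read its attrs.

    from tvm import relay
    from tvm.relay import ir_pass

    x = relay.var("x", shape=(1, 3, 224, 224))
    w = relay.var("w", shape=(16, 3, 3, 3))
    y = relay.nn.conv2d(x, w, kernel_size=(3, 3), channels=16, padding=(1, 1))

    calls = []

    def fvisit(node):
        # fvisit is invoked once per sub-expression, children before parents
        if isinstance(node, relay.expr.Call):
            calls.append(node)

    ir_pass.post_order_visit(y, fvisit)
    for call in calls:
        print(call.op.name, call.attrs.kernel_size)  # e.g. nn.conv2d [3, 3]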

How does Relay handle while loop

  • https://discuss.tvm.ai/t/how-does-relay-handle-while-loop/832/5

  • Our previous front-end layer supported translating forms of loops directly into the IR, and I plan on adding some helpers to the IR builder for building loops.

  • The only thing that we can't support currently is type-changing updates to a variable, i.e. if you start with a variable of type x : Tensor<(10, 5, 5), f32> you can't then assign y : Tensor<(5), f32> to it.

  • EDIT:

  • I noticed that the cases are flipped in the example program (which is only used for testing type checking):

       def f(n: i32, data: f32) -> f32 {
          if (n == 0) {
              return data;
          } else {
              return f(n - 1, log(data));
          }
       }
       # Main
       f(2, 10000);
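
As the discussion above suggests, a while loop is encoded as a tail-recursive function. A minimal sketch mirroring the example program, using the tvm.relay Python API; the relay.Module constructor name is the older spelling and is an assumption about the version in use.

    from tvm import relay

    mod = relay.Module()
    f = relay.GlobalVar("f")
    n = relay.var("n", shape=(), dtype="int32")
    data = relay.var("data", shape=(), dtype="float32")

    # if n == 0: data else: f(n - 1, log(data))
    cond = relay.equal(n, relay.const(0, "int32"))
    body = relay.If(cond,
                    data,
                    f(relay.subtract(n, relay.const(1, "int32")), relay.log(data)))
    mod[f] = relay.Function([n, data], body)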

Community status of training and automatic differentiation

We have a new version of it with a more flexible method for AD over Relay programs, but the branch needs to be polished before it can be upstreamed to the TVM repository. We have been focused on inference but could prioritize upstreaming it. Let me sync with the other people working on Relay and get back to you.

Auto Diff
