DNN_Creation

Notes about automatically creating DNNs for learning problems: a place to keep notes on how we try to auto-generate the NN architectures from the meta file we get from the LF (LearningFramework).

Learning problems

Currently we have the following fundamentally different learning problems:

  1. Regression: feature vector and numeric target; ignored for now
  2. Classification: feature vector and nominal target
  3. Sequence tagging: vector of feature vectors and a vector (of the same length) of classification targets

Each feature vector is a list of fixed length (the same for all instances in the data) where each list element is a "feature". The following kinds of features are possible (see the example after this list):

  • numeric
  • boolean
  • nominal (see below for how to deal with the encoding of nominal values)
  • ngram: this is a sequence of nominal values
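
For illustration, here is what one classification instance mixing these feature kinds might look like. This is a hypothetical sketch for readability, not the actual data format produced by the LF:

```python
# Hypothetical example (NOT the actual LF data format): one classification
# instance with a fixed-length feature vector mixing the four feature kinds.
instance = {
    "features": [
        0.73,                       # numeric
        True,                       # boolean
        "NNP",                      # nominal (e.g. a POS tag)
        ["the", "white", "house"],  # ngram: a sequence of nominal values
    ],
    "target": "Location",           # nominal classification target
}
```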

Sequences

Sequences can occur in two different ways:

  • for sequence tagging problems: in this case the data file we get from the LF contains a list of feature vectors as the independent part and a list of targets as the dependent part, and the isSequence parameter is set to 1. The gate-lf-python-data library converts the independent part, which is a sequence of feature vectors, into a list of feature sequences (see the sketch after this list).
  • for ngram features within a feature vector: in that case the feature itself represents a sequence of nominal values.
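
A minimal sketch of the conversion mentioned in the first bullet, assuming it amounts to transposing per-token feature vectors into per-feature sequences; the actual logic lives in gate-lf-python-data and the function name here is made up:

```python
# Hypothetical helper: turn a sequence-tagging instance's independent part
# ([[f1, f2, ...] per token]) into per-feature sequences
# ([[f1 per token], [f2 per token], ...]); essentially a transpose.
def feature_vectors_to_feature_sequences(vectors):
    return [list(column) for column in zip(*vectors)]

# Three tokens, each with a (word, POS) feature vector:
tokens = [["the", "DT"], ["white", "JJ"], ["house", "NN"]]
print(feature_vectors_to_feature_sequences(tokens))
# [['the', 'white', 'house'], ['DT', 'JJ', 'NN']]
```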

Classification

For classification, each feature gets mapped to its own input layer (a sketch of the resulting network follows the list):

  • A nominal value encoded as an embedding gets mapped to an embedding layer, or, if we want to both train embeddings and use pre-calculated ones, to an embeddings-mapping layer
  • A nominal value encoded as onehot gets mapped to a linear layer
  • A boolean value gets mapped to ??
  • A numeric value gets mapped to a one-unit linear layer
  • An ngram (which is represented as a list) gets mapped to one of these:
    • a convolutional layer: the sequence of indices is mapped to a sequence of embeddings, then a convolution produces the hidden units
    • an RNN layer: the sequence of indices is mapped to a sequence of embeddings and fed through an LSTM/GRU, with the final sequence states as the hidden units
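
To make the mapping concrete, here is a minimal PyTorch sketch of a classification network with one embedding-encoded nominal feature, one numeric feature, and one ngram feature handled by an LSTM. All layer sizes, vocabulary sizes, and names are illustrative assumptions, not what the plugin actually generates; boolean features are left out since their mapping is still open:

```python
import torch
import torch.nn as nn

class ClassificationNet(nn.Module):
    """Illustrative sketch only: one input layer per feature, concatenated."""
    def __init__(self, nominal_vocab=100, ngram_vocab=5000,
                 emb_dim=50, ngram_hidden=32, n_classes=3):
        super().__init__()
        # nominal feature encoded as embedding -> embedding layer
        self.nominal_emb = nn.Embedding(nominal_vocab, emb_dim)
        # numeric feature -> one-unit linear layer
        self.numeric_lin = nn.Linear(1, 1)
        # ngram feature -> embeddings, then an LSTM whose final hidden
        # state represents the whole ngram
        self.ngram_emb = nn.Embedding(ngram_vocab, emb_dim)
        self.ngram_rnn = nn.LSTM(emb_dim, ngram_hidden, batch_first=True)
        # concatenated per-feature representations -> class scores
        self.out = nn.Linear(emb_dim + 1 + ngram_hidden, n_classes)

    def forward(self, nominal_idx, numeric_val, ngram_idxs):
        nom = self.nominal_emb(nominal_idx)        # (batch, emb_dim)
        num = self.numeric_lin(numeric_val)        # (batch, 1)
        _, (h_n, _) = self.ngram_rnn(self.ngram_emb(ngram_idxs))
        ng = h_n[-1]                               # (batch, ngram_hidden)
        return self.out(torch.cat([nom, num, ng], dim=1))

net = ClassificationNet()
scores = net(torch.tensor([7]),             # one nominal index
             torch.tensor([[0.5]]),         # one numeric value
             torch.tensor([[3, 42, 17]]))   # one ngram of three indices
print(scores.shape)  # torch.Size([1, 3])
```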

Sequence tagging

TBD