Tagger_SyntaxNet - GateNLP/gateplugin-Tagger_SyntaxNet GitHub Wiki
Processing Resource Tagger_SyntaxNet
Runtime parameters
containingAnnotationType
(String, no default): if this is specified, annotations of this type from the input annotation set identify the spans of the document that should be annotated. The PR creates one request for each span and exchanges it with the server. This can be used, for example, to annotate only the text without the boilerplate, or only the text in a specific language in a mixed-language document.
inputAnnotationSet
(String, default is empty for the default annotation set): only relevant if the containingAnnotationType parameter is specified, in which case it is the annotation set that contains the containing annotations.
outputAnnotationSet
(String, default is empty for the default annotation set): the annotation set to which the new annotations are added.
serverAddress
(String, default is 127.0.0.1): the address or hostname of the host where the SyntaxNet server is running.
serverPort
(Integer, default is 9000): the port number the SyntaxNet server uses.
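As a rough illustration of how containingAnnotationType affects what is sent to the server, the sketch below builds one request text per containing span. It is not the plugin's code: the function name, the (start, end) tuple representation of annotations, and the fallback to a single whole-document request when no containing spans are given are assumptions made for this example.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # (start offset, end offset), hypothetical representation


def texts_to_send(text: str, containing: Optional[List[Span]] = None) -> List[Tuple[int, str]]:
    """Return (start offset, text) pairs, one per request to the server.

    Assumption: without containing annotations the whole document is sent
    as a single request; the documentation only describes the per-span case.
    """
    if not containing:
        return [(0, text)]
    # One request per containing span, in document order.
    return [(start, text[start:end]) for start, end in sorted(containing)]


doc = "MENU | Home\nThe cat sat. It slept.\nCopyright 2017"
body = (12, 34)  # hypothetical span covering only the main text
requests = texts_to_send(doc, [body])
```

Here only the main text is sent and the boilerplate lines before and after it are skipped. The start offset is kept alongside each text so that results reported relative to a span could be mapped back to document offsets.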
Output
The PR creates the following annotation types in the output annotation set:
Sentence
: for each sentence found by the server.
Token
: for each token found by the server. Note that SyntaxNet only finds tokens and completely ignores white space, so unlike other GATE tokenisers, no "SpaceToken" annotations are created.
The Token annotations contain the following features (NOTE: the fields category and tag have their content switched with respect to the fields created by SyntaxNet, for better compatibility with GATE conventions!):
breaklevel
: how the preceding token is separated from the current token. This is one of:
NO_BREAK
SPACE_BREAK
LINE_BREAK
SENTENCE_BREAK
category
: the content of the field tag that SyntaxNet returns for each token; it contains the language-specific POS tag.
tag
: the content of the field category that SyntaxNet returns for each token; it contains the universal POS tag (http://universaldependencies.org/u/pos/).
headId
: the annotation id of the annotation that is the head of this token. For a ROOT token, this is the annotation id of the containing Sentence annotation.
label
: the label of the dependency parse arc.
word
: the original word string.
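The feature mapping described above, including the tag/category swap and the headId rule for ROOT tokens, can be sketched as follows. This is an illustration, not the plugin's implementation: the dictionary keys used for the server's tokens (word, tag, category, label, head, break_level), the convention that a root token has head -1, and the annotation ids are all assumptions made for the example.

```python
from typing import Dict, List


def to_gate_features(tokens: List[dict], token_ids: List[int],
                     sentence_id: int) -> List[Dict[str, object]]:
    """Map SyntaxNet-style token dicts to GATE-style Token features.

    tokens[i]["head"] is assumed to be the index of the head token within
    the sentence, or -1 for the root (hypothetical input format).
    token_ids[i] is the annotation id given to the i-th Token annotation.
    """
    features = []
    for tok in tokens:
        head = tok["head"]
        features.append({
            "word": tok["word"],
            # Swapped on purpose: category holds the language-specific POS tag.
            "category": tok["tag"],
            # ...and tag holds the universal POS tag.
            "tag": tok["category"],
            "label": tok["label"],
            "breaklevel": tok["break_level"],
            # A ROOT token points at the containing Sentence annotation.
            "headId": sentence_id if head < 0 else token_ids[head],
        })
    return features


sentence = [
    {"word": "Dogs", "tag": "NNS", "category": "NOUN", "label": "nsubj",
     "head": 1, "break_level": "NO_BREAK"},
    {"word": "bark", "tag": "VBP", "category": "VERB", "label": "ROOT",
     "head": -1, "break_level": "SPACE_BREAK"},
]
feats = to_gate_features(sentence, token_ids=[101, 102], sentence_id=100)
```

In this example the Token for "Dogs" ends up with category NNS, tag NOUN and a headId pointing at the Token for "bark", while the root Token "bark" gets the id of the Sentence annotation as its headId.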