Engine - langdoc/FRechdoc GitHub Wiki

This page documents the conventions, standards and workflows used for annotating our corpus data with the Giellatekno toolkit, specifically an annotation engine that covers preprocessing and tokenization, morphological analysis, and disambiguation.

Intro

FST is…

Workflows

ELAN-->FST-->ELAN

Scripts

  • sending ELAN data to the analyzer
  • sending analyzed data back into ELAN
  • creating new tiers and annotations on them based on search results on a higher-level tier
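
The scripts themselves are not reproduced on this page. As a rough illustration of the ELAN-->FST-->ELAN round trip, the following Python sketch reads annotation values from a tier in an EAF (ELAN XML) document, passes them through a placeholder analyzer, and writes the results to a new dependent tier. The tier names (`word`, `morph`), the linguistic type `morphT`, the sample token and its analysis string, and the `analyze` function are all assumptions made for this example; `analyze` only stands in for the actual FST lookup, which is not shown here.

```python
import xml.etree.ElementTree as ET

# Minimal in-memory EAF fragment with made-up data, so the sketch runs stand-alone.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1000"/>
  </TIME_ORDER>
  <TIER TIER_ID="word" LINGUISTIC_TYPE_REF="wordT">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>foo</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""

# Hypothetical lookup table standing in for the morphological analyzer.
FAKE_LEXICON = {"foo": "foo+N+Sg+Nom"}


def read_tier(root, tier_id):
    """Return (annotation_id, value) pairs for every annotation on a tier."""
    tier = root.find(f"TIER[@TIER_ID='{tier_id}']")
    if tier is None:
        raise KeyError(f"no tier named {tier_id!r}")
    pairs = []
    # Each ANNOTATION wraps either an ALIGNABLE_ANNOTATION or a REF_ANNOTATION.
    for ann in tier.findall("ANNOTATION/*"):
        value = ann.findtext("ANNOTATION_VALUE", default="")
        pairs.append((ann.get("ANNOTATION_ID"), value))
    return pairs


def analyze(token):
    """Placeholder for the FST lookup: unknown tokens get a '+?' suffix."""
    return FAKE_LEXICON.get(token, token + "+?")


def add_dependent_tier(root, parent_id, new_tier_id, ling_type, results):
    """Append a new tier whose annotations refer to annotations on the parent tier."""
    tier = ET.SubElement(root, "TIER", {
        "TIER_ID": new_tier_id,
        "PARENT_REF": parent_id,
        "LINGUISTIC_TYPE_REF": ling_type,
    })
    for n, (parent_ann_id, value) in enumerate(results, start=1):
        ann = ET.SubElement(tier, "ANNOTATION")
        ref = ET.SubElement(ann, "REF_ANNOTATION", {
            "ANNOTATION_ID": f"{new_tier_id}_{n}",  # simplistic ID scheme
            "ANNOTATION_REF": parent_ann_id,
        })
        ET.SubElement(ref, "ANNOTATION_VALUE").text = value
    return tier


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_EAF)
    tokens = read_tier(root, "word")                        # ELAN --> ...
    results = [(aid, analyze(tok)) for aid, tok in tokens]  # ... --> FST --> ...
    add_dependent_tier(root, "word", "morph", "morphT", results)  # ... --> ELAN
    print(ET.tostring(root, encoding="unicode"))
```

A real script would additionally need to declare the linguistic type for the new tier, place the `TIER` element where the EAF schema expects it, and keep annotation IDs unique across the whole document.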