Feature Linking - PPilger/text-detection GitHub Wiki

The goal of feature linking is to group features (letters, word fragments, or whole words) into larger features (words or groups of words).

General

A graph is built with all features as vertices and all valid links as edges. Whether a link between two features is valid is decided by LinkingRule objects:

  • AreaGrowthLinkingRule
  • BoxDistanceLinkingRule
  • CenterDistanceLinkingRule
  • FixedDirectionLinkingRule
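The graph construction can be sketched as follows. This is only an illustration in Python, not the project's code: features are modelled as bounding boxes `(x, y, width, height)`, a rule is modelled as a plain function of two features, and `center_distance_rule`, `box_distance_rule` and `build_link_graph` are hypothetical names whose behaviour is guessed from the rule names listed above. An edge is kept only if every rule accepts it.

```python
from itertools import combinations
from math import hypot

# Assumption: a feature is a bounding box (x, y, width, height).

def center_distance_rule(max_dist):
    """Hypothetical rule: box centers must be at most max_dist apart."""
    def rule(a, b):
        ax, ay = a[0] + a[2] / 2, a[1] + a[3] / 2
        bx, by = b[0] + b[2] / 2, b[1] + b[3] / 2
        return hypot(ax - bx, ay - by) <= max_dist
    return rule

def box_distance_rule(max_gap):
    """Hypothetical rule: the gap between the two boxes must be at most max_gap."""
    def rule(a, b):
        gap_x = max(0, max(a[0], b[0]) - min(a[0] + a[2], b[0] + b[2]))
        gap_y = max(0, max(a[1], b[1]) - min(a[1] + a[3], b[1] + b[3]))
        return hypot(gap_x, gap_y) <= max_gap
    return rule

def build_link_graph(features, rules):
    """Return the valid links as index pairs (i, j) with i < j."""
    edges = []
    for i, j in combinations(range(len(features)), 2):
        # A link is valid only if all rules accept the pair.
        if all(rule(features[i], features[j]) for rule in rules):
            edges.append((i, j))
    return edges
```

For example, with two adjacent letter boxes and one far-away box, `build_link_graph` with a center distance of 20 and a box gap of 5 would link only the two adjacent boxes.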

Approach 1: Establish all valid links

The connected components of the graph are used as the new features.
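A minimal sketch of this step, assuming the features are numbered `0..n-1` and the valid links are given as index pairs. It uses a union-find structure to collect the connected components; the function name `connected_components` is chosen for illustration and is not taken from the project.

```python
def connected_components(n, edges):
    """Group vertex indices 0..n-1 into connected components."""
    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the components of the two endpoints of every edge.
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the members of each component under its root.
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Each returned group corresponds to one new, larger feature. A feature without any valid link ends up in a group of its own.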

Evaluation

This approach only works well if the graph can be reduced so that its connected components correspond to the words or word groups in the image, for example when the distance between labels is large enough.

Implementation

  • SimpleFeatureLinker

Approach 2: Find the best angle to connect features

  1. For every feature, the best possible linkage is determined. This is done by scanning through a set of angles and taking the angle at which the most features can be linked.

  2. Remove duplicate features. If two LinkedFeature objects have a (sub-)feature in common, that feature is removed from the one with the lower ranking (the ranking determines how "good" a linked feature is).
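The two steps above can be sketched as follows. This is a simplified illustration under several assumptions, not the BestDirectionFeatureLinker itself: features are reduced to center points, linking along an angle is approximated by a perpendicular tolerance `tol` and a maximum gap `max_gap` between neighbours, and the ranking is assumed to be the number of linked features. All function and parameter names are hypothetical.

```python
from math import cos, sin, radians

def link_along(seed, points, angle_deg, max_gap, tol):
    """Indices reachable from seed along a line through it at angle_deg."""
    dx, dy = cos(radians(angle_deg)), sin(radians(angle_deg))
    sx, sy = points[seed]
    on_line = []
    for i, (px, py) in enumerate(points):
        along = (px - sx) * dx + (py - sy) * dy        # position along the line
        across = abs((px - sx) * dy - (py - sy) * dx)  # distance from the line
        if across <= tol:
            on_line.append((along, i))
    on_line.sort()
    pos = next(k for k, (_, i) in enumerate(on_line) if i == seed)
    # Extend in both directions while neighbouring points stay close enough.
    lo = hi = pos
    while lo > 0 and on_line[lo][0] - on_line[lo - 1][0] <= max_gap:
        lo -= 1
    while hi + 1 < len(on_line) and on_line[hi + 1][0] - on_line[hi][0] <= max_gap:
        hi += 1
    return frozenset(i for _, i in on_line[lo:hi + 1])

def best_direction_links(points, angles=range(0, 180, 15), max_gap=15.0, tol=2.0):
    """Step 1: for every feature, keep the angle that links the most features."""
    return [
        max((link_along(seed, points, a, max_gap, tol) for a in angles), key=len)
        for seed in range(len(points))
    ]

def remove_duplicates(candidates):
    """Step 2: a shared feature stays only in the higher-ranked group.

    Assumption: a group's ranking is its size.
    """
    claimed = set()
    result = []
    for group in sorted(set(candidates), key=lambda g: (-len(g), sorted(g))):
        rest = group - claimed
        if rest:
            result.append(sorted(rest))
            claimed |= rest
    return result
```

With three points on a horizontal line and two on a vertical line, `remove_duplicates(best_direction_links(points))` would yield one group per line. The angle scan runs once per feature and per angle, which is consistent with this approach taking longer than simply taking connected components.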

Evaluation

This approach works quite well in general, provided that reasonable linking rules are used.

The only disadvantage is that it takes longer to run than approach 1.

Implementation

  • BestDirectionFeatureLinker