Feature Linking - PPilger/text-detection GitHub Wiki
The goal of feature linking is to group features (letters, word fragments, or whole words) into larger features (words or groups of words).
General
A graph is built with the features as its vertices and all valid links as its edges. Whether an edge is valid is determined by LinkingRule objects:
AreaGrowthLinkingRule
BoxDistanceLinkingRule
CenterDistanceLinkingRule
FixedDirectionLinkingRule
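The graph construction described above can be sketched as follows. This is a minimal illustration, not the project's code: features are modelled as axis-aligned boxes `(x, y, width, height)`, the two rule functions are hypothetical stand-ins loosely named after the classes listed above, and requiring every rule to accept a link is an assumption.

```python
from itertools import combinations
from math import hypot

def center(box):
    """Center point of a box given as (x, y, width, height)."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2)

def center_distance_rule(max_distance):
    """Hypothetical rule: accept a link if the box centers are close enough."""
    def rule(a, b):
        (ax, ay), (bx, by) = center(a), center(b)
        return hypot(ax - bx, ay - by) <= max_distance
    return rule

def area_growth_rule(max_growth):
    """Hypothetical rule: accept a link if the joint bounding box is at most
    max_growth times the summed area of the two boxes."""
    def rule(a, b):
        left, top = min(a[0], b[0]), min(a[1], b[1])
        right = max(a[0] + a[2], b[0] + b[2])
        bottom = max(a[1] + a[3], b[1] + b[3])
        joint_area = (right - left) * (bottom - top)
        return joint_area <= max_growth * (a[2] * a[3] + b[2] * b[3])
    return rule

def build_link_graph(features, rules):
    """Return the edges (i, j) with i < j that every rule accepts.
    Combining the rules with a logical AND is an assumption."""
    return [(i, j) for i, j in combinations(range(len(features)), 2)
            if all(rule(features[i], features[j]) for rule in rules)]

features = [(0, 0, 10, 10), (12, 0, 10, 10), (100, 0, 10, 10)]
rules = [center_distance_rule(20), area_growth_rule(1.5)]
edges = build_link_graph(features, rules)  # only the two neighbouring boxes are linked
```

Testing every pair of features is quadratic in the number of features, which is acceptable for a sketch but may need a spatial index for large images.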
Approach 1: Establish all valid links
The connected components of the graph are used as the new features.
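Extracting connected components can be sketched with a union-find structure. The function below is an illustration only; it assumes features are identified by the indices 0..n-1 and that the valid links are given as index pairs, which need not match how SimpleFeatureLinker represents them.

```python
def connected_components(n, edges):
    """Group the vertices 0..n-1 into connected components.

    edges is an iterable of (a, b) index pairs, one per valid link.
    Returns a sorted list of sorted index lists, one per component.
    """
    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Five features, links 0-1, 1-2 and 3-4: two new features result.
components = connected_components(5, [(0, 1), (1, 2), (3, 4)])
```

Each returned index list corresponds to one new, larger feature. A feature without any valid link forms a component of its own.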
Evaluation
This approach works well only if the linking rules reduce the graph far enough that its connected components correspond to words or word groups in the image, for example when the distance between labels is large enough.
Implementation
SimpleFeatureLinker
Approach 2: Find the best angle to connect features
- For every feature, the best possible linkage is determined. This is done by scanning through a set of angles and keeping the angle at which the most features can be linked.
- Duplicate features are removed. If two LinkedFeature objects have a (sub-)feature in common, it is removed from the one with the lower ranking (the ranking determines how "good" the feature is).
Evaluation
Given reasonable linking rules, this approach works quite well across all situations.
Its only disadvantage is that it takes longer to run than approach 1.
Implementation
BestDirectionFeatureLinker