Limitations of user-defined features

Defining and using feature extractors comes with a few limitations, which are described here:

Self-containedness

Feature extractors are assumed to be self-contained, in the sense that they do not depend on program code that exists only in the user's classpath.

Violating this assumption has consequences when a model is saved. FlexTag serialises the feature extractor when saving a model, but not the user-defined program code on which the feature extractor might depend. The model, and the feature, therefore work only as long as this user-defined code is available somewhere in the current classpath. FlexTag assumes that all required dependencies are available in the classpath of the user who runs FlexTag with the trained model.
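The effect can be illustrated with a minimal sketch that assumes plain Java object serialization; FlexTag's actual mechanism may differ. All class names here (`SerializationDemo`, `SuffixFeature`, `ThirdPartyHelper`) are hypothetical stand-ins, and the missing classpath entry is only simulated by overriding `resolveClass`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerializationDemo {
    /** Hypothetical stand-in for code that lives only in the model author's classpath. */
    static class ThirdPartyHelper implements Serializable {
        private static final long serialVersionUID = 1L;
        String suffix = "ing";
    }

    /** Hypothetical stand-in for a user-defined feature extractor. */
    static class SuffixFeature implements Serializable {
        private static final long serialVersionUID = 1L;
        ThirdPartyHelper helper = new ThirdPartyHelper();
    }

    /** Serialises object state and class names; the class bytecode itself is not written. */
    static byte[] save(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    /** Deserialises while pretending that the class named {@code missing} is not on the classpath. */
    static Object loadWithout(byte[] data, String missing)
            throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data)) {
            @Override
            protected Class<?> resolveClass(ObjectStreamClass desc)
                    throws IOException, ClassNotFoundException {
                if (desc.getName().equals(missing)) {
                    throw new ClassNotFoundException(desc.getName());
                }
                return super.resolveClass(desc);
            }
        };
        try {
            return in.readObject();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] model = save(new SuffixFeature());
        try {
            loadWithout(model, ThirdPartyHelper.class.getName());
        } catch (ClassNotFoundException e) {
            // The saved bytes alone are not enough to restore the extractor.
            System.out.println("Loading failed: " + e);
        }
    }
}
```

The saved bytes only name the helper class, so loading fails as soon as that class cannot be resolved on the receiving side.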

This is most easily ensured by not creating dependencies on third-party code. Many popular utilities (e.g. apache-commons) are already available in FlexTag, or in DKPro TC / Core, on which FlexTag is based.

If external third-party code must be used nonetheless, one option is to copy the required code snippets into the respective feature extractor. This way the code is shipped as part of the feature extractor, although this causes code redundancy.
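A minimal sketch of this copy-in approach might look as follows. The class `WordShapeFeature` and its `extract` method are hypothetical and deliberately simplified; the sketch does not implement the actual DKPro TC feature extractor interface, and the word-shape helper merely stands in for whatever snippet would otherwise come from an external library.

```java
/** Hypothetical, simplified feature extractor; the real DKPro TC interfaces are omitted. */
class WordShapeFeature {

    /** Returns the feature value for a single token. */
    String extract(String token) {
        return wordShape(token);
    }

    // Helper copied into the extractor instead of being imported from an
    // external library, so the extractor has no dependency outside the JDK.
    private static String wordShape(String token) {
        StringBuilder shape = new StringBuilder();
        for (char c : token.toCharArray()) {
            if (Character.isUpperCase(c)) {
                shape.append('X');
            } else if (Character.isLowerCase(c)) {
                shape.append('x');
            } else if (Character.isDigit(c)) {
                shape.append('d');
            } else {
                shape.append(c); // keep punctuation and other characters as they are
            }
        }
        return shape.toString();
    }
}
```

Because the helper is a private method of the extractor class, it travels with that class; the price is that the same logic may now exist in several places.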

Last but not least, if third-party code must be incorporated, it should at least be available as a Maven artifact so that the dependency can be added to a project's classpath. Otherwise the trained model will depend on the user's local configuration and cannot be shared.

If you do not intend to build models that will be shared with others, you are not affected by this limitation.