module__org.bibliome.alvisnlp.modules.tika.TikaReader - Bibliome/alvisnlp GitHub Wiki
# org.bibliome.alvisnlp.modules.tika.TikaReader
Reads PDF or DOC files and adds one document to the corpus for each file.
This module is experimental.
Optional
Type: SourceStream
Path to the source directory or source file.
Optional
Type: Mapping
UNDOCUMENTED
Optional
Type: Mapping
Constant features to add to each document created by this module.
Optional
Type: Mapping
Constant features to add to each section created by this module.
Default value: html
Type: String
Default value: text
Type: String
Name of the single section containing the whole contents of a file.
Default value: tag
Type: String