DocIdUtil - apache/ctakes GitHub Wiki
Utility class for fetching a document id.
Check the jcas for a document id. Unlike getDeepDocumentId(JCas), this method does not progress into deeper jcas layers/views.
Parameters:
- jcas: ye olde ...
Returns: the document id contained in the type "DocumentID" or NO_DOCUMENT_ID
Gets the document Id by progressing through 3 layers until an Id is found: starting JCas, Initial View, Plaintext View
Parameters:
- startingJcas: initial JCas to start the checking
Returns: Document Id from the starting JCas, the Initial View, the Plaintext View, or NO_DOCUMENT_ID
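The layered lookup described above can be sketched in plain Java. This is an illustration only, not the cTAKES implementation: the class DeepIdSketch, the method firstId, and the value given to NO_DOCUMENT_ID are hypothetical, and each layer (starting JCas, Initial View, Plaintext View) is modeled as a Supplier that returns an id or nothing usable.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the layered lookup; not the cTAKES code.
final class DeepIdSketch {
    // Assumed placeholder value, used here only for illustration.
    static final String NO_DOCUMENT_ID = "UnknownDocument";

    // Walks the layers in order and returns the first usable id,
    // or NO_DOCUMENT_ID when no layer provides one.
    static String firstId(List<Supplier<String>> layers) {
        for (Supplier<String> layer : layers) {
            String id = layer.get();
            if (id != null && !id.isEmpty() && !NO_DOCUMENT_ID.equals(id)) {
                return id;
            }
        }
        return NO_DOCUMENT_ID;
    }
}
```

In this sketch the order of the list is the order of preference, so a caller would pass the lookups for the starting JCas, the Initial View, and the Plaintext View in that sequence.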
Create a unique id for the document that can be used for an output filename or url. Will be the source document file name if possible, otherwise the first 10 characters of the text plus text hashcode, or "Unknown_" and the current millis if there is no text. Non-alphanumeric characters are replaced with '_'.
Parameters:
- jcas
Returns: an ok document id
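The fallback rules for the output id can be sketched as follows. This is a minimal illustration under stated assumptions, not the cTAKES code: the class OutputIdSketch and the method toOutputId are hypothetical, the source file name and document text are passed in as plain strings instead of being read from a JCas, and the '_' replacement is assumed to apply in every case.

```java
// Hypothetical sketch of the documented fallback rules; not the cTAKES code.
final class OutputIdSketch {
    // sourceFileName and text stand in for values that would come from the JCas.
    static String toOutputId(String sourceFileName, String text) {
        String raw;
        if (sourceFileName != null && !sourceFileName.isEmpty()) {
            // Preferred: the source document file name.
            raw = sourceFileName;
        } else if (text != null && !text.isEmpty()) {
            // Otherwise: first 10 characters of the text plus the text hashcode.
            String head = text.substring(0, Math.min(10, text.length()));
            raw = head + text.hashCode();
        } else {
            // No text at all: "Unknown_" plus the current millis.
            raw = "Unknown_" + System.currentTimeMillis();
        }
        // Assumption: the replacement is applied whichever branch was taken.
        return raw.replaceAll("[^A-Za-z0-9]", "_");
    }
}
```

A negative hashcode contributes a '-' sign, which the final replacement also turns into '_', so the result stays safe to use in a filename or url.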
Parameters:
- jCas
Returns: NO_DOCUMENT_ID plus an index based upon the number of documents without IDs fetched with this class.
This may lead to documents having ids indexed out of order with respect to the order in which they were run.
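One way such an index could behave is sketched below. The class IndexedIdSketch, the method idOrIndexed, the value of NO_DOCUMENT_ID, and the use of a shared counter are all assumptions made for illustration; the actual class may track the count differently.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of an indexed fallback id; not the cTAKES code.
final class IndexedIdSketch {
    // Assumed placeholder value, used here only for illustration.
    static final String NO_DOCUMENT_ID = "UnknownDocument";

    // Counts how many documents without an id have been fetched so far.
    private static final AtomicLong MISSING = new AtomicLong();

    // Returns the found id when it is usable, otherwise NO_DOCUMENT_ID plus an index.
    static String idOrIndexed(String foundId) {
        if (foundId != null && !foundId.isEmpty() && !NO_DOCUMENT_ID.equals(foundId)) {
            return foundId;
        }
        return NO_DOCUMENT_ID + MISSING.incrementAndGet();
    }
}
```

Because the counter advances when an id is fetched rather than when a document enters the pipeline, two documents processed concurrently can receive indices in the opposite order from the one in which they were run.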
Check the jcas for a document id prefix. Unlike getDeepDocumentId(JCas), this method does not progress into deeper jcas layers/views.
Parameters:
- jcas: ye olde ...
Returns: the document id prefix contained in the type "DocumentIdPrefix" or NO_DOCUMENT_ID