Package: util - 11791-04/project-team04 GitHub Wiki

Util Package

A general package of utility classes for tasks outside the main UIMA pipeline, such as text processing, language modeling, named-entity recognition (NER), stemming, counting, and querying the GoPubMed Web API.

Sub-Packages:

  • datastructure
  • text
  • text.counter
  • text.lm
  • text.ner
  • webservice

Package util


Class: QueryExpander

Method: expandQuery(String question, Stemmer stemmer) returns String
Returns an expanded query string for the given question, using the supplied stemmer.

Method: expandQuery(String question, Set<String> stopwords, Stemmer stemmer) returns String
Returns an expanded query string for the given question, using the supplied stopword set and stemmer.


Class: TypeConstants

A collection of static constants, provided by the archetype.


Class: TypeFactory

Static factory methods for various UIMA types, provided by the archetype.


Class: UIMA_Utils

Additional utility methods, provided by the archetype.


Package datastructure


Class: BetterMap

Method: addItem(K k, E q)
Called by some other thing to do stuff str.toString().


Class: Pair (implements Map.Entry)

Implements the Map.Entry interface and overrides its methods.
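
As a rough illustration of what implementing Map.Entry involves, here is a minimal sketch. The class name SimplePair and its fields are hypothetical and stand in for Pair, whose actual source is not shown on this page.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for Pair; the project's real class may differ.
final class SimplePair<K, V> implements Map.Entry<K, V> {
  private final K key;
  private V value;

  SimplePair(K key, V value) {
    this.key = key;
    this.value = value;
  }

  @Override
  public K getKey() {
    return key;
  }

  @Override
  public V getValue() {
    return value;
  }

  // Map.Entry requires setValue to return the previous value.
  @Override
  public V setValue(V newValue) {
    V old = value;
    value = newValue;
    return old;
  }

  // equals and hashCode follow the contract documented on Map.Entry,
  // so a SimplePair compares equal to any other entry with the same key and value.
  @Override
  public boolean equals(Object o) {
    if (!(o instanceof Map.Entry)) {
      return false;
    }
    Map.Entry<?, ?> other = (Map.Entry<?, ?>) o;
    return Objects.equals(key, other.getKey()) && Objects.equals(value, other.getValue());
  }

  @Override
  public int hashCode() {
    return Objects.hashCode(key) ^ Objects.hashCode(value);
  }
}
```

Following the Map.Entry contract for equals and hashCode lets such a pair interoperate with entries produced by the standard collections.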


Package text


Class: TextProcessingTools

Method: getTFMap(String[] termArray) returns Map<String, Integer>
Builds a term-frequency map from the term array, mapping each term to the number of times it occurs.
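
The method name and return type suggest a term-frequency map. A minimal sketch of that idea follows, assuming each term maps to its occurrence count; the class name TermFrequencySketch is hypothetical and the project's implementation may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the project's TextProcessingTools.
final class TermFrequencySketch {
  // Counts how many times each term occurs in the array.
  static Map<String, Integer> getTFMap(String[] termArray) {
    Map<String, Integer> tf = new HashMap<>();
    for (String term : termArray) {
      // merge inserts 1 for a new term and adds 1 to an existing count.
      tf.merge(term, 1, Integer::sum);
    }
    return tf;
  }
}
```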

Method: getFormattedTermArray(String rawText, KrovetzStemmer stemmer) returns String[]
Converts the raw text into an array of formatted terms, using the supplied Krovetz stemmer.


Class: TextUtils

Method: spaceTokenizer(String doc) returns List<String>
Tokenizes the document on spaces and returns the tokens as a list.

Method: minimalStem(String word) returns String
Returns the minimal stem of the word.

Method: porterStem(String word) returns String
Returns the Porter stem of the word.

Method: stanfordTokenizer(String doc) returns List<String>
Uses Stanford Tokenizer to tokenize the document and return a list of String tokens.

Method: stanfordSentenceTokenizer(String doc) returns List<SentenceInfo>
Uses Stanford Tokenizer to tokenize the document by sentences and returns a list of SentenceInfo tokens.


Class: TypeUtil

From the archetype. Contains many useful static methods.


Package text.counter


Class: FrequencyCounter (extends HashMap<String, Integer>)

Method: put(String key, Integer value) returns Integer
Overrides HashMap.put(String, Integer).

Method: putAll(Iterable<String> document)
Called by some other thing to do stuff str.toString().

Method: empty()
Empties the hash map.

Method: tokenizeAndPutAll(String doc, String delimiter)
Splits the document on the delimiter and adds the resulting tokens to the counter.
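
A counter built on HashMap<String, Integer> could look roughly like the sketch below. This is only an assumption about how such a class might behave: the name CountingMap, the increment helper, and the choice to skip empty tokens and treat the delimiter literally are all hypothetical, not taken from the project.

```java
import java.util.HashMap;
import java.util.regex.Pattern;

// Hypothetical sketch; not the project's FrequencyCounter.
class CountingMap extends HashMap<String, Integer> {
  // Adds one occurrence of the token.
  void increment(String token) {
    merge(token, 1, Integer::sum);
  }

  // Counts every token of an already tokenized document.
  // This overloads, rather than overrides, HashMap.putAll(Map).
  void putAll(Iterable<String> document) {
    for (String token : document) {
      increment(token);
    }
  }

  // Splits on the literal delimiter, skips empty tokens, and counts the rest.
  void tokenizeAndPutAll(String doc, String delimiter) {
    for (String token : doc.split(Pattern.quote(delimiter))) {
      if (!token.isEmpty()) {
        increment(token);
      }
    }
  }
}
```

Subclasses such as a stemming or stopword-filtering counter could then override tokenizeAndPutAll to normalize or drop tokens before counting.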


Class: FrequencyCounterFactory

Method: getNewFrequencyCounter(String type) returns FrequencyCounter
Returns a new FrequencyCounter of the requested type.


Class: CleanStemCounter (extends FrequencyCounter)

Method: tokenizeAndPutAll(String doc, String delimiter)
Overrides FrequencyCounter.tokenizeAndPutAll(String, String).


Class: CleanStemStopWordCounter (extends FrequencyCounter)

Method: tokenizeAndPutAll(String doc, String delimiter)
Overrides FrequencyCounter.tokenizeAndPutAll(String, String).


Class: LemmatizeCounter (extends FrequencyCounter)

Method: tokenizeAndPutAll(String doc, String delimiter)
Overrides FrequencyCounter.tokenizeAndPutAll(String, String).


Class: StanfordLemmatizer

Method: doStuff(String str) Throws: Exception
Called by some other thing to do stuff str.toString().


Class: StemCounter (extends FrequencyCounter)

Method: tokenizeAndPutAll(String doc, String delimiter)
Overrides FrequencyCounter.tokenizeAndPutAll(String, String).


Class: StopWordCounter (extends FrequencyCounter)

Method: tokenizeAndPutAll(String doc, String delimiter)
Overrides FrequencyCounter.tokenizeAndPutAll(String, String).


Package text.lm


Class: Ngram

Method: getUnigram(String term) returns double
Called by some other thing to do stuff str.toString().


Package text.ner


Class: BioNER

Method: getBioTags(String content) returns Set<String>
Called by some other thing to do stuff str.toString().

Method: getUnigramBioTags(String content) returns Set<String>
Called by some other thing to do stuff str.toString().

Method: getBigramBioTags(String content) returns Set<String>
Called by some other thing to do stuff str.toString().

Method: getAbnerNER(String content) returns Set<GeneMentionTag>
Called by some other thing to do stuff str.toString().

Method: getLingPipeStatNER(String content) returns Set<GeneMentionTag>
Called by some other thing to do stuff str.toString().

Method: getLingPipeDictNER(String content) returns Set<GeneMentionTag>
Called by some other thing to do stuff str.toString().

Method: getPOSNER(String content) returns Set<GeneMentionTag>
Called by some other thing to do stuff str.toString().


Class: PosTagNamedEntityRecognizer

Method: getGeneSpans(String text) returns Map<Integer, Integer>
Called by some other thing to do stuff str.toString().


Package webservice


Class: WebAPIServiceProxyFactory

Method: getInstance() returns WebAPIServiceProxy
This method first checks if a GoPubMedService object has been instantiated. If so, it returns the already-instantiated object. If not, it creates a new GoPubMedService object and returns it. If a ConfigurationException occurs, this method throws a new UIMA_IllegalStateException. This method is synchronized.
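
The lazy, synchronized initialization described above can be sketched as follows. ServiceProxy, ProxyFactorySketch, and createService are hypothetical stand-ins, and a plain IllegalStateException is used in place of UIMA_IllegalStateException so that the example compiles on its own.

```java
// Hypothetical stand-in for WebAPIServiceProxy.
interface ServiceProxy {}

// Hypothetical sketch; not the project's WebAPIServiceProxyFactory.
final class ProxyFactorySketch {
  private static ServiceProxy instance;

  private ProxyFactorySketch() {}

  // synchronized ensures that concurrent callers cannot create two instances.
  static synchronized ServiceProxy getInstance() {
    if (instance == null) {
      try {
        instance = createService();
      } catch (Exception e) {
        // The real factory is described as rethrowing ConfigurationException
        // as UIMA_IllegalStateException; this sketch uses the JDK exception.
        throw new IllegalStateException("Could not configure the service", e);
      }
    }
    return instance;
  }

  // Placeholder for constructing the real service object.
  private static ServiceProxy createService() throws Exception {
    return new ServiceProxy() {};
  }
}
```

Every call after the first returns the same object, so callers share a single service instance.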


Class: WebAPIServiceProxy

Method: getFindingsFromQuery(String query) returns List<OntologyServiceResponse.Finding>
Given a query string, this method makes all API calls to GoPubMedService that return Finding lists. The included calls are Disease Ontology, Gene Ontology, Jochem, MeSH, and UniProt.

Method: getPubMedDocumentsFromQuery(String query) returns List<PubMedSearchServiceResponse.Document>
Given a query string, this method makes all API calls to GoPubMedService that return Document lists. The only included call is PubMed citations.

Method: getEntitiesFromQuery(String query) returns List<LinkedLifeDataServiceResponse.Entity>
Given a query string, this method makes all API calls to GoPubMedService that return Entity lists. Each Entity also contains a list of Relation objects, which can be retrieved by calling its getRelations() method.


Class: CachedWebAPIServiceProxy

Behaves like WebAPIServiceProxy but caches results. The cache is transparent, so users of the factory should never need to know they are using it.
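
A transparent cache of this kind is often written as a wrapper that implements the same interface as the object it wraps. The sketch below shows that pattern under stated assumptions: QueryService, findingsFor, and CachingQueryService are hypothetical names, and the real proxy's interface and cache policy may differ.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface standing in for the proxy's query methods.
interface QueryService {
  List<String> findingsFor(String query);
}

// Hypothetical caching wrapper; callers only ever see QueryService.
final class CachingQueryService implements QueryService {
  private final QueryService delegate;
  private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

  CachingQueryService(QueryService delegate) {
    this.delegate = delegate;
  }

  @Override
  public List<String> findingsFor(String query) {
    // The delegate is only called on a cache miss.
    return cache.computeIfAbsent(query, delegate::findingsFor);
  }
}
```

Because the wrapper and the wrapped object share an interface, a factory can return either one without any change to calling code.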

Class: GoPubMedServiceFactory

Method: doStuff(String str) Throws: Exception
Called by some other thing to do stuff str.toString().

Class: MetalWebService

Method: doStuff(String str) Throws: Exception
Called by some other thing to do stuff str.toString().

