Outline Second version - mantono/syno GitHub Wiki

Synonym Graph (undirected)

Create a graph for finding synonyms, or words with similar meaning, using distributional semantics: words that occur in similar contexts are assumed to have similar meanings.
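A minimal sketch of how such a graph could be built, assuming that "close" means "within a fixed window of words in the same sentence". The function name `build_synonym_graph`, the `window` parameter and the sample sentences are made up for illustration and are not taken from the project.

```python
from collections import defaultdict

def build_synonym_graph(sentences, window=2):
    """Count how often two words occur within `window` words of each other.

    The graph is undirected: graph[a][b] always equals graph[b][a].
    """
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            # Only look forward; the symmetric update below covers the other direction.
            for other in words[i + 1 : i + 1 + window]:
                if other != word:
                    graph[word][other] += 1
                    graph[other][word] += 1
    return graph

# Hypothetical toy corpus: "black" and "dark" share the neighbours "the" and "cat".
synonym_graph = build_synonym_graph(["the black cat sleeps", "the dark cat sleeps"])
```

Words that share many neighbours in this graph (here "black" and "dark") are the candidates for similar meaning. A real corpus would likely also need tokenization and stop-word handling, which this sketch leaves out.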

Sequence Graph (directed)

Create a graph of word sequences without storing complete n-grams. Only keep track of the direct succession between two words: an edge from A to B means that B was observed immediately after A.
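As a rough sketch, the sequence graph can be represented as a mapping from each word to a counter of the words that directly follow it. The name `build_sequence_graph` and the sample sentences are assumptions for illustration only.

```python
from collections import Counter, defaultdict

def build_sequence_graph(sentences):
    """Directed graph: graph[a][b] is how often b directly follows a."""
    graph = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair every word with its immediate successor.
        for current, following in zip(words, words[1:]):
            graph[current][following] += 1
    return graph

# Hypothetical toy corpus.
sequence_graph = build_sequence_graph(["the black cat", "the black dog", "a black cat"])
most_common_after_black = sequence_graph["black"].most_common(1)
```

Because the edges are directed, `sequence_graph["black"]["cat"]` and `sequence_graph["cat"]["black"]` are counted separately.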

Combine The Graphs

Combine the synonym graph and the sequence graph to build (generate or identify) longer sequences from the data in both graphs.

Example

  1. Choose a word, for example "black".
  2. Look up the word "black" in the sequence graph. Which words are found to follow directly after it? Get the N most common.
  3. Look up the word "black" in the synonym graph. Which words most commonly occur close to "black" in sentences? Get the N most common.
  4. Take the union of these two sets from the two graphs. Choose one of the frequently occurring words.
  5. Repeat steps 2 to 4 with the new word. Open question: should some (or all) of the words found in the synonym graph for the previous word be kept?
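The steps above can be sketched roughly as follows. This is only one possible reading of the outline: both graphs are assumed to be plain mappings from a word to a `Counter` of related words, the next word is picked at random from the union, and the open question in step 5 is sidestepped by not carrying over any words from the previous iteration. The names `top_n` and `generate`, and the hand-written toy graphs, are hypothetical.

```python
import random
from collections import Counter

def top_n(graph, word, n):
    """Return the n most common neighbours of `word` as a set."""
    return {w for w, _ in graph.get(word, Counter()).most_common(n)}

def generate(start, sequence_graph, synonym_graph, length=4, n=3, rng=None):
    """Walk both graphs, starting from `start`, to build a word sequence."""
    rng = rng or random.Random()
    words = [start]                                      # step 1
    while len(words) < length:
        current = words[-1]
        followers = top_n(sequence_graph, current, n)    # step 2
        neighbours = top_n(synonym_graph, current, n)    # step 3
        candidates = sorted(followers | neighbours)      # step 4: union
        if not candidates:
            break                                        # dead end in both graphs
        words.append(rng.choice(candidates))             # step 4: pick one
    return words                                         # the loop is step 5

# Hand-written toy graphs, for illustration only.
sequence_graph = {
    "black": Counter({"cat": 2, "dog": 1}),
    "cat": Counter({"sleeps": 2}),
    "dog": Counter({"barks": 1}),
}
synonym_graph = {
    "black": Counter({"dark": 2, "cat": 1}),
    "dark": Counter({"night": 1}),
}

result = generate("black", sequence_graph, synonym_graph, rng=random.Random(0))
```

Choosing uniformly from the union is the simplest option; weighting the choice by the counts, or preferring words that appear in both sets, would be natural variations.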