Outline Second version - mantono/syno GitHub Wiki
Synonym Graph (undirected)
Create a graph for finding synonyms, or words with similar meaning, using distributional semantics.
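A minimal sketch of how such an undirected graph could be built, assuming that "similar meaning" is approximated by counting how often two words occur within a small window of each other. The function name `build_synonym_graph`, the `window` parameter and the whitespace tokenisation are illustrative assumptions, not part of the project. A fuller distributional approach would go on to compare the context counts of two words rather than use the raw counts directly.

```python
from collections import Counter, defaultdict


def build_synonym_graph(sentences, window=2):
    """Build an undirected co-occurrence graph (hypothetical helper).

    graph[a][b] is the number of times a and b appear within
    `window` positions of each other; graph[a][b] == graph[b][a].
    """
    graph = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()  # assumption: naive tokenisation
        for i, word in enumerate(words):
            # Look ahead only; the symmetric update covers the other direction.
            for other in words[i + 1 : i + 1 + window]:
                if other != word:
                    graph[word][other] += 1
                    graph[other][word] += 1
    return graph


synonym_graph = build_synonym_graph(["the black cat sat", "the black dog sat"])
```

Storing every edge in both directions keeps lookups simple at the cost of doubling the memory used per edge.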
Sequence Graph (directed)
Create a graph for word sequences, but not complete n-grams. Instead, only keep track of the direct sequences between two words.
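As a rough illustration, the directed graph can be held as a mapping from each word to a counter of the words that directly follow it. The name `build_sequence_graph` and the whitespace tokenisation are assumptions made for this sketch.

```python
from collections import Counter, defaultdict


def build_sequence_graph(sentences):
    """Build a directed graph of direct word-to-word sequences (hypothetical helper).

    graph[a][b] is the number of times b was seen directly after a.
    """
    graph = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()  # assumption: naive tokenisation
        # Pair every word with its immediate successor.
        for current, following in zip(words, words[1:]):
            graph[current][following] += 1
    return graph


sequence_graph = build_sequence_graph(
    ["the black cat", "the black dog", "a black cat"]
)
print(sequence_graph["black"].most_common(1))  # [('cat', 2)]
```

Unlike the synonym graph, an edge here is only stored in one direction, so `sequence_graph["cat"]["black"]` stays at zero.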
Combine The Graphs
Combine the synonym graph and the sequence graph to build/generate/identify longer sequences through the combination of data in both graphs.
Example
- Choose a word, for example black.
- Look up the word black in the sequence graph. Which words directly follow it? Get the N most common ones.
- Look up the word black in the synonym graph. Which words most commonly occur close to black in sentences? Get the N most common ones.
- Take the union of the two sets from the two graphs. Choose one of the frequently occurring words.
- Repeat steps 2 to 4 with the new word, but keep some (or all) of the words that were found in the synonym graph for the previous word?
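The steps above can be sketched as follows, with both graphs given as plain dictionaries of counts. The names `top_n`, `next_word` and `generate` are hypothetical. Scoring a candidate by the sum of its counts in both graphs, and breaking ties alphabetically, are assumptions chosen to make the sketch deterministic; the outline only says to choose one of the frequently occurring words. The open question in the last step (carrying synonym-graph words over to the next iteration) is not implemented here.

```python
from collections import Counter


def top_n(graph, word, n):
    """Return the n highest-count neighbours of `word` as a set."""
    return {w for w, _ in Counter(graph.get(word, {})).most_common(n)}


def next_word(word, sequence_graph, synonym_graph, n=3):
    """Pick a next word from the union of both graphs' top-n sets."""
    followers = top_n(sequence_graph, word, n)  # step 2
    related = top_n(synonym_graph, word, n)     # step 3
    candidates = followers | related            # step 4: union
    if not candidates:
        return None

    def score(candidate):
        # Assumption: combine the evidence by adding the two counts.
        return (sequence_graph.get(word, {}).get(candidate, 0)
                + synonym_graph.get(word, {}).get(candidate, 0))

    # Highest score wins; alphabetical order breaks ties.
    return min(candidates, key=lambda c: (-score(c), c))


def generate(start, sequence_graph, synonym_graph, length=3, n=3):
    """Repeat the lookup, starting from `start`, until `length` words exist."""
    words = [start]
    while len(words) < length:
        following = next_word(words[-1], sequence_graph, synonym_graph, n)
        if following is None:
            break
        words.append(following)
    return words


# Toy data, invented for illustration only.
sequence_graph = {"black": {"cat": 2, "dog": 1}, "cat": {"sat": 2}}
synonym_graph = {
    "black": {"cat": 2, "dark": 1, "the": 1},
    "cat": {"black": 2, "sat": 1},
}
result = generate("black", sequence_graph, synonym_graph)
print(result)  # ['black', 'cat', 'sat']
```

Because the union also admits words that only appear in the synonym graph, a chosen word is not guaranteed to have been seen directly after the previous one. Weighting the sequence count more heavily would be one way to favour grammatical sequences.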