How dictionary data and index files work

Just some notes on the data files, from what I learned while cleaning up data.ts.

  • The data.dat file is just an alphabetical list of entries. It's read in as text, and entries are extracted using the string substring method.
  • The index file contains a single line for each headword, with a character offset for each of its entries in the dat file.
  • Words are found by doing a binary search in the index, getting the offset, and looking the offset up in the dat file (with deinflection).
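
The lookup described above can be sketched roughly as follows. This is not the code in data.ts, just a minimal illustration under assumptions: the index line format (`headword,offset[,offset...]`), the newline-terminated dat entries, and the names `findOffsets`, `readEntry`, and `lookup` are all hypothetical, and deinflection is left out entirely.

```typescript
// Hypothetical formats, assumed for illustration only:
//   index line: "<headword>,<offset>[,<offset>...]", sorted by headword
//   dat entry:  one entry per line, terminated by "\n"

/** Binary-search the sorted index lines for `word`; return its offsets, or []. */
function findOffsets(indexLines: string[], word: string): number[] {
  let lo = 0;
  let hi = indexLines.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const line = indexLines[mid];
    const comma = line.indexOf(',');
    const head = line.substring(0, comma);
    if (head === word) {
      return line.substring(comma + 1).split(',').map(Number);
    }
    // Plain string comparison, so the index must be sorted the same way.
    if (head < word) lo = mid + 1;
    else hi = mid - 1;
  }
  return [];
}

/** Extract the entry that starts at character `offset`, up to the next newline. */
function readEntry(dat: string, offset: number): string {
  const end = dat.indexOf('\n', offset);
  return dat.substring(offset, end === -1 ? dat.length : end);
}

/** Look up every dat entry listed for `word` in the index. */
function lookup(dat: string, indexLines: string[], word: string): string[] {
  return findOffsets(indexLines, word).map((offset) => readEntry(dat, offset));
}

// Tiny made-up sample: two entries, each reachable by kana or kanji headword.
const sampleDat = '犬 [いぬ] /dog/\n猫 [ねこ] /cat/\n';
const sampleIndex = ['いぬ,0', 'ねこ,13', '犬,0', '猫,13'];

console.log(lookup(sampleDat, sampleIndex, 'ねこ'));
```

Keeping the dat file as one big string means an entry is only sliced out when it is actually looked up, which is presumably why offsets are stored rather than the entries themselves.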

Should consider just using a map if the memory usage is the same.
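
For comparison, a map-based index might look something like the sketch below. It assumes the same hypothetical `headword,offset[,offset...]` line format, and `buildIndexMap` is an illustrative name, not an existing function. Whether its memory usage is actually comparable would need to be measured.

```typescript
/** Parse the index text once into a Map from headword to dat offsets. */
function buildIndexMap(indexText: string): Map<string, number[]> {
  const map = new Map<string, number[]>();
  for (const line of indexText.split('\n')) {
    if (line === '') continue; // skip the empty piece after a trailing newline
    const [head, ...offsets] = line.split(',');
    map.set(head, offsets.map(Number));
  }
  return map;
}

// Made-up sample index; a lookup is then a single Map.get call.
const indexMap = buildIndexMap('いぬ,0\nねこ,13\n');
console.log(indexMap.get('ねこ') ?? []);
```

The trade-off is an up-front parse of the whole index in exchange for simpler lookups and no requirement that the index stay sorted.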