MS–DPD Definition Files - sc-voice/ms-dpd GitHub Wiki

MS-DPD Definition Files

DPD headword entries are stored in MS-DPD definition JSON files. For example, the EN definition file is dpd/en/definition-en.mjs. The EN definition file is directly driven by DPD updates as described in MS–DPD Update Process.

Since MS-DPD is multilingual, it also has definition files for other languages, such as dpd/de/definition-de.mjs. These are language-specific definition files.

MS-DPD has both language-specific and language-independent definition files. For example, the pos (part-of-speech) field is language-independent, whereas meaning_1 is language-specific headword information. The language-independent definitions are stored in dpd/definition-pali.mjs.

Definitions are stored in JSON format. Each definition file holds a single JavaScript object, formatted with one row per line so that changes are easy to track and review. Within each row, the field values are concatenated into a single string, separated by the vertical line character "|".
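As a rough illustration of the row format described above, the following sketch splits one row into named fields and joins it back. The keys, the values, and the field order are all made up for illustration; the actual field order in the MS-DPD definition files may differ.

```javascript
// ASSUMPTION: this field order is hypothetical, not the actual MS-DPD layout.
const FIELDS = ['meaning_1', 'meaning_raw', 'meaning_lit'];

// Hypothetical excerpt of a language-specific definition object.
// One row per line keeps version-control diffs small.
const DEF_EN = {
  "abc": "first reviewed meaning||literal sense",
  "abd": "|unreviewed meaning|",
};

// Split one row string into an object keyed by field name.
// Missing trailing values become empty strings.
function parseRow(row, fields = FIELDS) {
  const values = row.split('|');
  return Object.fromEntries(
    fields.map((name, i) => [name, values[i] ?? '']),
  );
}

// Join field values back into a single "|"-separated row string.
function formatRow(entry, fields = FIELDS) {
  return fields.map((name) => entry[name] ?? '').join('|');
}

console.log(parseRow(DEF_EN.abc));
```

Because an empty field is simply an empty string between two "|" characters, a row keeps the same number of separators whether or not every field has a value.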

See DPD headwords table for information on the semantics of individual fields.

Headword keys

DPD uses numeric headword ids such as 12345. MS-DPD uses src/headword-key.mjs to encode and decode headword ids as compact 3-character headword keys. Even as DPD grows, headword keys will remain 3 characters long for headword ids as large as 238327. Headword keys are easier to remember than headword ids and take up less room in definition files.
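The limit of 238327 equals 62³ − 1, which is what a 3-character key over a 62-symbol alphabet can hold. The sketch below shows one way such an encoding could work. It is not the code in src/headword-key.mjs: the symbol set, its ordering, and the function names encodeId and decodeKey are assumptions made for illustration.

```javascript
// ASSUMPTION: a 62-symbol alphabet (digits, uppercase, lowercase).
// The real src/headword-key.mjs may use different symbols or ordering.
const SYMBOLS =
  '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
const BASE = SYMBOLS.length; // 62
const KEY_LENGTH = 3;
const MAX_ID = BASE ** KEY_LENGTH - 1; // 238327

// Encode a numeric headword id as a fixed-width, zero-padded base-62 key.
function encodeId(id) {
  if (!Number.isInteger(id) || id < 0 || id > MAX_ID) {
    throw new RangeError(`id must be an integer in 0..${MAX_ID}: ${id}`);
  }
  let key = '';
  let rest = id;
  for (let i = 0; i < KEY_LENGTH; i++) {
    key = SYMBOLS[rest % BASE] + key; // least significant digit first
    rest = Math.floor(rest / BASE);
  }
  return key;
}

// Decode a 3-character key back into its numeric headword id.
function decodeKey(key) {
  if (typeof key !== 'string' || key.length !== KEY_LENGTH) {
    throw new RangeError(`key must be ${KEY_LENGTH} characters: ${key}`);
  }
  let id = 0;
  for (const ch of key) {
    const digit = SYMBOLS.indexOf(ch);
    if (digit < 0) throw new RangeError(`invalid key character: ${ch}`);
    id = id * BASE + digit;
  }
  return id;
}

console.log(encodeId(12345)); // "3D7" under this assumed alphabet
console.log(decodeKey(encodeId(238327))); // 238327
```

A fixed width keeps every key the same length, at the cost of rejecting ids above the 3-character limit rather than growing the key.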

Language-independent definitions

Language-independent headword information that is common to all contemporary languages is stored in dpd/definition-pali.mjs.

Definition fields are:

  • pattern
  • pos
  • construction
  • stem
  • lemma_1

Language-specific definitions

Language-specific headword information is stored in dpd/LANGCODE/definition-LANGCODE.mjs files, where LANGCODE is the two-letter ISO 639-1 code for a language (e.g., en, pt, es, fr, de, ru).

Definition fields are:

  • meaning_1: reviewed meaning
  • meaning_raw: unreviewed meaning from sources such as Buddhadatta or AI language translators. No value if meaning_1 is present.
  • meaning_lit: literal meaning
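Since meaning_raw has no value when meaning_1 is present, a consumer can treat meaning_raw as a fallback. The following sketch combines a language-independent record with a language-specific record for the same headword key and picks the meaning to display. The function name mergeDefinition, the derived fields meaning and reviewed, and all sample values are hypothetical, not part of MS-DPD.

```javascript
// Combine language-independent and language-specific fields for one headword.
// ASSUMPTION: both inputs are already parsed into plain objects.
function mergeDefinition(paliEntry, langEntry) {
  const merged = { ...paliEntry, ...langEntry };
  // Prefer the reviewed meaning; fall back to the unreviewed one.
  merged.meaning = langEntry.meaning_1 || langEntry.meaning_raw || '';
  // Hypothetical flag so callers can mark unreviewed meanings in the UI.
  merged.reviewed = Boolean(langEntry.meaning_1);
  return merged;
}

// Made-up sample records for illustration only.
const paliEntry = {
  pattern: 'sample pattern',
  pos: 'sample pos',
  construction: '',
  stem: 'sample stem',
  lemma_1: 'sample lemma',
};
const reviewedEntry = {
  meaning_1: 'reviewed text',
  meaning_raw: '',
  meaning_lit: 'literal text',
};
const unreviewedEntry = {
  meaning_1: '',
  meaning_raw: 'unreviewed text',
  meaning_lit: '',
};

console.log(mergeDefinition(paliEntry, reviewedEntry).meaning);
console.log(mergeDefinition(paliEntry, unreviewedEntry).reviewed);
```

Keeping the two kinds of record separate until lookup time means the language-independent fields are stored once, regardless of how many languages are supported.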