Contributing - hexgrad/misaki GitHub Wiki

Although misaki is a powerful G2P toolkit, there is a lot of room for improvement. PRs are welcome.

AI-generated code is acceptable if and only if it has been tested.

Adding a new language

  • To add G2P for a new language, define a callable Python class that implements __init__ and __call__.
  • __init__ should set up heavy resources such as dictionaries.
  • __call__ takes a string of graphemes and returns a pair: a string of phonemes (required), and a sequence of MToken (optional).
  • Refer to the README to see how first-class G2P solutions have been implemented in other languages.
  • Because each language gets its own Python G2P class and its own language-specific dependencies, you can develop freely for your language without affecting the functionality of other languages. As a result, if you are the sole maintainer for your language, PRs will likely be rubber-stamp approved.
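
The interface described above can be sketched roughly as follows. This is a minimal illustration, not misaki's actual code: `ToyG2P`, its two-word lexicon, the unknown-word marker, and the `Token` stand-in class (including its field names) are all hypothetical. A real implementation would load a full dictionary in `__init__` and return misaki's own `MToken` objects.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Token:
    """Hypothetical stand-in for MToken; field names are assumptions."""
    text: str
    phonemes: str
    whitespace: str = " "


class ToyG2P:
    def __init__(self) -> None:
        # Heavy resources (dictionaries, models) are set up once here.
        # This tiny lexicon is made up for illustration only.
        self.lexicon = {"hello": "həlˈoʊ", "world": "wˈɜɹld"}
        self.unk = "?"  # hypothetical marker for out-of-vocabulary words

    def __call__(self, text: str) -> Tuple[str, Optional[List[Token]]]:
        # Graphemes in; (phoneme string, optional token sequence) out.
        tokens = [
            Token(text=word, phonemes=self.lexicon.get(word.lower(), self.unk))
            for word in text.split()
        ]
        phoneme_string = " ".join(t.phonemes for t in tokens)
        return phoneme_string, tokens
```

Calling `ToyG2P()("Hello world")` would return the joined phoneme string together with one token per word. A class that does not support alignments could return `None` in the second position instead.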

Token-level alignments

  • Some languages (English, Japanese, Chinese) implement token-level alignments, but many still do not. With alignments, the G2P class goes beyond string-in, string-out: it also returns a sequence of MToken on call.
  • This enables quality-of-life improvements often associated with TTS, such as smarter chunking, auto-scrolling, and word highlighting.
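
As one illustration of why alignments help, the sketch below groups tokens into chunks that stay under a phoneme budget without ever splitting a word, and maps each chunk back to its source text. Everything here is an assumption made for the example: the `Token` class is a hypothetical stand-in for MToken, and `chunk_tokens`, `chunk_text`, and the `max_phonemes` budget are not part of misaki.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    """Hypothetical stand-in for MToken; field names are assumptions."""
    text: str
    phonemes: str
    whitespace: str = " "


def chunk_tokens(tokens: List[Token], max_phonemes: int) -> List[List[Token]]:
    """Greedily pack tokens into chunks of at most max_phonemes characters.

    A single token longer than the budget still gets its own chunk,
    so words are never split.
    """
    chunks: List[List[Token]] = []
    current: List[Token] = []
    length = 0
    for tok in tokens:
        size = len(tok.phonemes) + len(tok.whitespace)
        if current and length + size > max_phonemes:
            chunks.append(current)
            current, length = [], 0
        current.append(tok)
        length += size
    if current:
        chunks.append(current)
    return chunks


def chunk_text(chunk: List[Token]) -> str:
    """Recover the graphemes covered by a chunk, e.g. for highlighting."""
    return "".join(t.text + t.whitespace for t in chunk).strip()
```

With plain string-in, string-out G2P, a chunker can only cut the phoneme string at arbitrary positions. With tokens, each chunk carries the words it came from, which is what makes highlighting and scrolling possible.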

Performance and code cleanup

  • Unify the G2P interface across languages.
  • English tokenizer can probably be made much faster and more memory efficient.
  • Add unit tests.

Cross-platform ports

  • Python allows for fast development, but porting to other languages may enable easier deployment to edge devices.

…to be continued…