Contributing - hexgrad/misaki GitHub Wiki
Although misaki is a powerful G2P toolkit, there is a lot of room for improvement. PRs are welcome.
AI-generated code is acceptable if and only if it has been tested.
Adding a new language
- To add G2P for a new language, you would define a callable Python class that implements __init__ and __call__. __init__ should set up heavy resources such as dictionaries. __call__ takes a string of graphemes and returns a pair: a string of phonemes (required), and a sequence of MToken (optional).
- Refer to the README to see how first-class G2P solutions have been implemented in other languages.
- Because each language gets its own Python G2P class and language-specific dependencies, you can freely develop in your specific language without impacting the functionality of other languages. As a result, if you are the sole maintainer for your language, PRs will likely be rubber-stamp approved.
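The class shape described above can be sketched as follows. This is a minimal illustration under stated assumptions, not misaki's actual code: ToyG2P, its two-word lexicon, and the phoneme strings are made up, and the MToken defined here is a hypothetical stand-in whose fields may differ from the real class.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MToken:
    """Hypothetical stand-in for misaki's MToken; the real class may differ."""
    text: str
    phonemes: Optional[str] = None
    whitespace: str = ""


class ToyG2P:
    """Illustrative G2P for an imaginary language; not part of misaki."""

    def __init__(self, unk: str = "?"):
        # Heavy resources (dictionaries, models) are set up once here.
        # A real implementation would likely load a lexicon from disk;
        # these two entries are invented for the example.
        self.lexicon = {"hello": "həlˈoʊ", "world": "wˈɜːld"}
        self.unk = unk

    def __call__(self, text: str) -> Tuple[str, Optional[List[MToken]]]:
        tokens = []
        for word in text.split():
            # Unknown words fall back to the placeholder symbol.
            phonemes = self.lexicon.get(word.lower(), self.unk)
            tokens.append(MToken(text=word, phonemes=phonemes, whitespace=" "))
        if tokens:
            tokens[-1].whitespace = ""  # no trailing space after the last word
        phoneme_string = "".join(t.phonemes + t.whitespace for t in tokens)
        # The phoneme string is required; the token sequence is optional.
        return phoneme_string, tokens
```

Calling `ToyG2P()("Hello world")` returns the phoneme string together with one token per word. A language without token-level alignment could return `None` in the second position instead.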
Token-level alignments
- Some languages (English, Japanese, Chinese) implement token-level alignments, but many still do not. With token-level alignment, instead of just string-in-string-out, the G2P class also returns a sequence of MToken on call.
- This enables quality of life improvements often associated with TTS, such as smarter chunking, auto-scrolling, and word highlighting.
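As one illustration of what token-level output makes possible, here is a rough sketch of chunking at token boundaries. The Token class, its field names, and chunk_tokens are hypothetical and written only for this example; they are not misaki's MToken or any function misaki provides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    """Hypothetical token; field names are assumptions, not misaki's MToken."""
    text: str
    phonemes: str
    whitespace: str = ""


def chunk_tokens(tokens: List[Token], max_phonemes: int) -> List[List[Token]]:
    """Group tokens into chunks that stay within a phoneme budget,
    splitting only at token boundaries so no word is cut in half."""
    chunks: List[List[Token]] = []
    current: List[Token] = []
    size = 0
    for tok in tokens:
        cost = len(tok.phonemes) + len(tok.whitespace)
        # Start a new chunk when adding this token would exceed the budget.
        if current and size + cost > max_phonemes:
            chunks.append(current)
            current, size = [], 0
        # A single token larger than the budget still gets its own chunk.
        current.append(tok)
        size += cost
    if current:
        chunks.append(current)
    return chunks
```

With plain string-in-string-out, a splitter has no reliable way to know where one word's phonemes end. With tokens, each chunk maps back to whole words in the source text, which is also what word highlighting and auto-scrolling rely on.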
Performance and code cleanup
- Unified G2P interface.
- English tokenizer can probably be made much faster and more memory efficient.
- Add unit tests.
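One possible direction for the first and last items above, sketched under assumptions: a shared interface expressed with typing.Protocol, plus a language-agnostic contract check that a unit test could run against any backend. The names G2P, EchoG2P, and check_g2p_contract are hypothetical and do not exist in misaki.

```python
from typing import Optional, Protocol, Sequence, Tuple, runtime_checkable


@runtime_checkable
class G2P(Protocol):
    """Hypothetical shared interface; misaki does not necessarily define this."""

    def __call__(self, text: str) -> Tuple[str, Optional[Sequence[object]]]:
        ...


class EchoG2P:
    """Trivial stand-in backend used only to exercise the interface."""

    def __call__(self, text: str) -> Tuple[str, Optional[Sequence[object]]]:
        # Returns the lowercased input as "phonemes" and no tokens.
        return text.lower(), None


def check_g2p_contract(g2p: G2P, sample: str) -> None:
    """Check the (phonemes, optional tokens) contract for any backend."""
    result = g2p(sample)
    assert isinstance(result, tuple) and len(result) == 2
    phonemes, tokens = result
    assert isinstance(phonemes, str)
    assert tokens is None or all(t is not None for t in tokens)
```

A test suite could call `check_g2p_contract` once per language with a short sample sentence. Note that a runtime `isinstance` check against this protocol only confirms that the object is callable, not that its return type matches, which is why the contract check inspects the result directly.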
Cross-platform ports
- Python allows for fast development, but porting to other languages may enable easier deployment to edge devices.
…to be continued…