3.9. Setup MMIR for Internationalization - dfki-flpe/sandbox-test GitHub Wiki

Setup the MMIR Applications for Internationalization

Only a few steps are necessary to add a new dictionary to your application and set up MMIR to use a new language. These steps are described below.

Configuration

To configure the language of your application, open the config/configuration.json file and edit the value for language. For example, use en for English or de for German.
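As a rough illustration of this setting, the sketch below parses a configuration text and reads its language entry. The JSON content and the helper readLanguage are hypothetical; only the language key and the codes en and de come from the description above, and MMIR's own loading code may work differently.

```javascript
// Hypothetical excerpt of config/configuration.json; a real file may contain more entries.
const configText = '{ "language": "de" }';

// Returns the configured language code, or the fallback if none is set.
// Illustration only: this is not MMIR's configuration loader.
function readLanguage(jsonText, fallback) {
  const config = JSON.parse(jsonText);
  const language = config.language;
  return typeof language === "string" && language.length > 0 ? language : fallback;
}

console.log(readLanguage(configText, "en"));
```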

Add a new Dictionary

The first step in supporting a new language in your application is to add a new dictionary for it. MMIR searches for dictionaries in config/languages/… and automatically loads all dictionaries it finds. MSK provides two example dictionaries, one for English and one for German (config/languages/en/dictionary.dic, config/languages/de/dictionary.dic).

Dictionaries are well-formed JSON files. The example below shows the English dictionary of MSK:

{
	"mmig": "MMIG",
	"login_header": "Please login",
	"password_place_holder": "password",
	"user_name_place_holder": "user name",
	"login_label": "Login",
	"registration_text": "or register yourself",
	"registration_label": "Sign Up",
	"mainPanelAudioConfirmation": "Audio Confirmation",
	"buttonOk": "Ok",
	"buttonCancel": "Cancel",
	"buttonBack": "back",
	"ratingStar": "star",
	"ratingStars": "stars",
	"dialogCapture": "Capture",
	"dialogPlay": "Play",
	"welcome_header": "MMIG",
	"welcome_date": "some Date",
	"welcome_text":"Welcome to MMIG"
}

To add support for French, for example, create a new dictionary file in the appropriate folder, i.e. config/languages/fr/dictionary.dic. The new dictionary is then available under its folder name, in this case fr. You can use this name e.g. in the config/configuration.json file or with the LanguageManager.

For this example, first copy the contents of the English dictionary into the newly created file (or just copy the English dictionary file into the new sub-folder) and replace the English values with French translations:

{
  "mmig": "MMIG",
  "login_header": "S'il vous plaît connectez-vous",
  "password_place_holder": "mot de passe"
}
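When translating a copied dictionary, it is easy to lose track of keys that still need a translation. The following sketch compares two dictionaries and lists the keys that exist in the reference but not in the translation. The helper missingKeys and the shortened dictionary objects are hypothetical and not part of MMIR.

```javascript
// Shortened excerpts of the English and French dictionaries (not the complete files).
const english = {
  "mmig": "MMIG",
  "login_header": "Please login",
  "password_place_holder": "password",
  "login_label": "Login"
};
const french = {
  "mmig": "MMIG",
  "login_header": "S'il vous plaît connectez-vous",
  "password_place_holder": "mot de passe"
};

// Lists the keys of the reference dictionary that the translation lacks.
function missingKeys(reference, translation) {
  return Object.keys(reference).filter(
    key => !Object.prototype.hasOwnProperty.call(translation, key)
  );
}

// Logs the keys that still need a French value (here: login_label).
console.log(missingKeys(english, french));
```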

Adding Translations

Translations are looked up by the keys defined in the dictionaries. In a template, a key is referenced with the @localize statement, for example:

  <label for="login">
    @localize("login_label")
  </label>

On rendering, the statement is replaced by the value for the currently set language. With en set as the language and the example dictionary from above, the result would be:

  <label for="login">
     Login
  </label>
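The replacement step can be pictured with a small sketch. It substitutes every @localize("key") statement in a template string with the matching dictionary value and leaves statements with unknown keys untouched. The function renderLocalized is hypothetical and only mimics the behaviour described above; MMIR's actual template rendering is more involved.

```javascript
// Replaces @localize("key") statements with values from the given dictionary.
// Illustration only: this is not MMIR's rendering code.
function renderLocalized(template, dictionary) {
  return template.replace(/@localize\("([^"]+)"\)/g, (statement, key) =>
    Object.prototype.hasOwnProperty.call(dictionary, key) ? dictionary[key] : statement
  );
}

// Excerpt of the English example dictionary.
const en = { "login_label": "Login" };
const view = '<label for="login">@localize("login_label")</label>';

console.log(renderLocalized(view, en)); // <label for="login">Login</label>
```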

Add a new Grammar

If you want to use speech interactions in your application, you should also provide grammars for all languages. Grammars are used to process the ASR result from a speech recognizer: a grammar "translates" natural language input into "programmatic instructions", e.g. for the input phrase please find movie XY, the result of executing the grammar could be something like

  {
    search: "XY",
    displayResult: true
  }

This is only an example. The concrete result is defined within the grammar and generally depends on the application, i.e. you will have to encode the mapping phrase → result by specifying the grammar.

Attention: After creating a new language directory (in `config/languages/`) or a new file in a language directory, you need to re-generate the file list `directories.json`. Use the ANT task `generateFileListJSONFile` to generate the `directories.json` file automatically.

Grammar definitions in MMIR are similar to context-free grammars: input sentences are matched against grammar rules (in MMIR: utterances/phrases). The grammar rules (MMIR: phrases) may refer to other rules and/or to token definitions. The following example shows a pseudo grammar for illustration:

TOKEN1:	"some","few"
TOKEN2:	"thing","things","object","objects"
TOKEN3:	"else"
...
RULE1:  TOKEN1 TOKEN3 TOKEN1
RULE2:  RULE1 TOKEN2

MMIR searches for grammars in config/languages/… and automatically loads all found grammars. MSK provides an example grammar for German (config/languages/de/grammar.json).

Grammars are defined as well-formed JSON files. The following example is an excerpt from the German grammar (de/grammar.json) of MSK. It illustrates a possible grammar definition for "translating" natural language phrases like "bitte abspielen" (please play), "spiele ab" (play) etc. into an event object Play, i.e. the result returned when the grammar matches the phrase would be {semantic: {Play: {}}}. This event object can then be used for further processing:

{
   "stop_word": [
        "bitte",
        "doch",
        "der",
        "m__oe__chte",
        ...
    ],
    "tokens": {
        "PREPOSITION": [
            "an",
            "um",
            "am",
            "ab",
            ...
        ],
        "V_PLAY_IMP": [
            "spiel",
            "spiele",
            ...
        ],
        "V_PLAY_INF": [
            "spielen",
            "abspielen",
            "h__oe__ren",
            ...
        ],
        ...
    },
    "utterances": {
        "PLAY": {
            "phrases": [
                "V_PLAY_INF",
                "V_PLAY_IMP PREPOSITION"
            ],
            "semantic": {
                "Play": {}
            }
        },
        ...
    }
}
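The way such a grammar maps a phrase to a semantic result can be sketched in a few lines. The example below uses only the entries visible in the excerpt (elided entries are left out), drops stop words, replaces each remaining word with the name of its token list, and compares the resulting sequence with the phrases of each utterance. The function interpret is hypothetical; MMIR compiles the grammar with JS/CC instead of interpreting it like this.

```javascript
// Entries visible in the grammar excerpt; everything elided with "..." is omitted.
const grammar = {
  stop_word: ["bitte", "doch", "der", "m__oe__chte"],
  tokens: {
    PREPOSITION: ["an", "um", "am", "ab"],
    V_PLAY_IMP: ["spiel", "spiele"],
    V_PLAY_INF: ["spielen", "abspielen", "h__oe__ren"]
  },
  utterances: {
    PLAY: {
      phrases: ["V_PLAY_INF", "V_PLAY_IMP PREPOSITION"],
      semantic: { Play: {} }
    }
  }
};

// Illustration only: returns {semantic: ...} for a matching phrase, otherwise null.
function interpret(grammar, phrase) {
  // Split into lower-case words and drop the stop words.
  const words = phrase.toLowerCase().split(/\s+/)
    .filter(word => word && !grammar.stop_word.includes(word));
  // Replace each word with the name of the first token list that contains it.
  const tokenNames = words.map(word => {
    const entry = Object.entries(grammar.tokens)
      .find(([, values]) => values.includes(word));
    return entry ? entry[0] : null;
  });
  if (tokenNames.includes(null)) return null; // unknown word
  const sequence = tokenNames.join(" ");
  for (const utterance of Object.values(grammar.utterances)) {
    if (utterance.phrases.includes(sequence)) return { semantic: utterance.semantic };
  }
  return null;
}

console.log(interpret(grammar, "bitte abspielen"));
console.log(interpret(grammar, "spiele ab"));
```

Both calls return an object whose semantic entry contains Play, while a phrase with unknown words, such as "hallo welt", yields null.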

TBD: descriptions for the following details

  • grammar definition details:
    • stopwords, tokens, utterances, semantic results
    • lower-case "restriction": token values should all be lower case
    • umlaut encoding (e.g. ä → ae)
  • grammar loading/selection mechanism
  • grammar compilation (compile grammar.json with build.xml)
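Until these details are documented, the lower-case restriction and the umlaut encoding can be approximated as follows. The sketch assumes that the `__oe__` pattern visible in the grammar excerpt (e.g. m__oe__chte) applies to ä and ü in the same way. This mapping and the helper normalizeTokenValue are assumptions made for illustration, not confirmed MMIR behaviour.

```javascript
// Assumed encoding, derived from the "__oe__" pattern in the grammar excerpt.
const UMLAUTS = { "ä": "__ae__", "ö": "__oe__", "ü": "__ue__" };

// Lower-cases a token value and encodes its umlauts.
function normalizeTokenValue(value) {
  return value
    .normalize("NFC")   // compose letters such as "o" + combining diaeresis into "ö"
    .toLowerCase()
    .replace(/[äöü]/g, ch => UMLAUTS[ch]);
}

console.log(normalizeTokenValue("Möchte")); // m__oe__chte
console.log(normalizeTokenValue("Hören"));  // h__oe__ren
```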

The current version of MMIR uses the JS/CC parser. You can find detailed instructions on how to define your JS/CC grammar here. You can also define and test your grammar online at http://jscc.jmksf.com/jscc/jscc.html. An HTML file for testing the grammar is located at /assets/www/testSemanticInterpreter.html. The Ant build file /build.xml provides tasks for compiling an executable grammar from the JSON grammar file.
