Tag match condition - adamb924/mortal-engine GitHub Wiki

You can constrain morphemes using tags. One reason to do this is to capture morphosyntactic facts: for instance, only verbs marked as transitive can receive the passive morpheme. Another is to capture phonological facts, for example when an orthography doesn't provide enough detail for the pattern of allomorphy to be predictable. You can also use tags to flag various exceptional forms. Tags are very powerful, but they are somewhat onerous to use, because they (usually) have to be specified manually.

This example, examples/04-Tag-Allomorphy.xml, shows a Turkic vowel harmony system, using tags to determine which allomorph appears.

<?xml version="1.0" encoding="UTF-8"?>
<morphology
    xmlns="https://www.adambaker.org/mortal-engine"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="https://www.adambaker.org/mortal-engine morphology.xsd">
    <writing-systems src="writing-systems.xml"/>
    <model label="Nouns">
        <stem-list label="Stem">
            <filename>01-stems.xml</filename>
            <matching-tag>noun</matching-tag>
        </stem-list>
        <morpheme label="Plural">
            <optional/>
            <!-- Here we have two allomorphs for one morpheme. One will
                occur after front words, the other after back words. In 
                this example this is done with tags, rather than with
                a regular expression, but both are possible. -->
            <allomorph>
                <!-- This means, “this allomorph will match if the immediately preceding morpheme has all of the tags in the set {front}” -->
                <tag-match scope="immediately-preceding" type="all">
                    <match-tag>front</match-tag>
                </tag-match>
                <form lang="wk-AR">لر</form>
                <form lang="wk-LA">ler</form>
            </allomorph>
            <allomorph>
                <!-- This means, “this allomorph will match if the immediately preceding morpheme has all of the tags in the set {back}” -->
                <tag-match scope="immediately-preceding" type="all">
                    <match-tag>back</match-tag>
                </tag-match>
                <form lang="wk-AR">لار</form>
                <form lang="wk-LA">lar</form>
            </allomorph>
        </morpheme>
    </model>
</morphology>

In this case, the tags that are being matched are in the stem allomorphs; these have been specified manually in examples/01-stems.xml.

The tests in examples/all-examples.xml show that it works:

Success: The input ata (wk-LA) was accepted by the model, which is correct. [Stem]
Success: The input atalar (wk-LA) was accepted by the model, which is correct. [Stem][Plural]
Success: The input ataler (wk-LA) was rejected by the model, which is correct. 
Success: The input gözler (wk-LA) was accepted by the model, which is correct. [Stem][Plural]
Success: The input gözlar (wk-LA) was rejected by the model, which is correct. 

Options

The <tag-match> element has several options.

The scope attribute has these options:

  • immediately-preceding Match the tags on the morpheme that comes before this one in the parsing.
  • any-preceding Match the tags on any morpheme that comes before this one in the parsing.
  • immediately-following Match the tags on the morpheme that comes after this one in the parsing.
  • any-following Match the tags on any morpheme that comes after this one in the parsing.

The any-* options are useful for long-distance relationships.
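
As a rough sketch of a scope other than immediately-preceding, the fragment below conditions an allomorph on the morpheme that follows it. The tag name vowel-initial and the form are hypothetical, chosen only for illustration, and the sketch assumes that the following morpheme's allomorph carries that tag. The elements and attributes are the ones used in the example above.

<allomorph>
    <!-- Hypothetical: use this allomorph only when the next morpheme is tagged vowel-initial -->
    <tag-match scope="immediately-following" type="all">
        <match-tag>vowel-initial</match-tag>
    </tag-match>
    <!-- Hypothetical form, for illustration only -->
    <form lang="wk-LA">l</form>
</allomorph>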

The type attribute has these options:

  • all Match all of the tags that I specify below with <match-tag> elements.
  • any Match any of the tags that I specify below with <match-tag> elements.

The type attribute makes no difference, of course, if there is only one <match-tag> element.
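
A minimal sketch of the difference, assuming a hypothetical tag, neutral, alongside front: with type="any" the condition holds if the preceding morpheme carries either tag, whereas type="all" would require it to carry both.

<tag-match scope="immediately-preceding" type="any">
    <match-tag>front</match-tag>
    <!-- "neutral" is a hypothetical tag, used only for illustration -->
    <match-tag>neutral</match-tag>
</tag-match>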

One final option with tags is the <interrupted-by> element, which occurs after the <match-tag> elements. (This is relevant only to the any-* options.) The code below says, “this constraint is satisfied if any morpheme before me has a ‘back’ tag, as long as there is no intervening ‘front’ tag.”

<tag-match scope="any-preceding" type="all">
    <match-tag>back</match-tag>
    <interrupted-by>front</interrupted-by>
</tag-match>

The point of this is that you might have a vowel harmony language where not every morpheme is guaranteed to carry a ‘front’ or ‘back’ tag. With this option you can search the preceding morphemes for a ‘back’ tag, stopping at the first ‘front’ tag.
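
Applied to the plural morpheme from the first example, a long-distance version might look like the sketch below. It reuses the forms and tags from examples/04-Tag-Allomorphy.xml. Pairing the two allomorphs this way is an illustration, not something taken from the project's example files.

<morpheme label="Plural">
    <optional/>
    <allomorph>
        <!-- Matches if some earlier morpheme is tagged front, with no intervening back tag -->
        <tag-match scope="any-preceding" type="all">
            <match-tag>front</match-tag>
            <interrupted-by>back</interrupted-by>
        </tag-match>
        <form lang="wk-AR">لر</form>
        <form lang="wk-LA">ler</form>
    </allomorph>
    <allomorph>
        <!-- Matches if some earlier morpheme is tagged back, with no intervening front tag -->
        <tag-match scope="any-preceding" type="all">
            <match-tag>back</match-tag>
            <interrupted-by>front</interrupted-by>
        </tag-match>
        <form lang="wk-AR">لار</form>
        <form lang="wk-LA">lar</form>
    </allomorph>
</morpheme>

With this arrangement, a morpheme that carries neither tag is skipped over, so the plural still agrees with the nearest tagged morpheme before it.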
