MGM Forced Alignment - AudiovisualMetadataPlatform/amp_documentation GitHub Wiki

MGM - Forced Alignment

Category description and use cases

Workflow example:

Speech-to-Text > Transcript Editor > Forced Aligner

Output standard

Summary: 

JSON Schema

Schema

{
 
}

Sample output

Sample Output

{

}

Recommended tool(s)

Gentle

Official documentation: Gentle on Github

Language: Python (command line), or any language via the REST API

**Description:**

Cost:  Free (open source)

Social impact: 

Notes: 

Installation & requirements

Two options for installation:

  1. Run the Docker image to start the web server, then call the REST API
  2. Download the source code and run the bash installation script, then use Gentle as a command-line Python program

Parameters

Input formats

Audio (mp3, wav, possibly other formats) and transcript (plain text).

Example Usage

Gentle Example

curl -F "audio=@audio.mp3" -F "transcript=@words.txt" "http://localhost:8765/transcriptions?async=false"
# OR
python3 align.py audio.mp3 words.txt
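
The REST request can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes a Gentle server is listening on localhost:8765 as in the curl command, and that the multipart form fields are named `audio` and `transcript`. The helper names `build_multipart` and `align` are hypothetical and not part of Gentle.

```python
import json
import urllib.request
import uuid
from pathlib import Path

# Assumed endpoint, taken from the curl example on this page.
GENTLE_URL = "http://localhost:8765/transcriptions?async=false"


def build_multipart(fields):
    """Encode {field_name: (filename, bytes)} as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, (filename, data) in fields.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + data + b"\r\n")
    body = b"".join(parts) + f"--{boundary}--\r\n".encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def align(audio_path, transcript_path, url=GENTLE_URL):
    """POST an audio file and a plain-text transcript; return the parsed JSON."""
    audio = Path(audio_path)
    transcript = Path(transcript_path)
    body, content_type = build_multipart({
        # Field names "audio" and "transcript" are assumptions.
        "audio": (audio.name, audio.read_bytes()),
        "transcript": (transcript.name, transcript.read_bytes()),
    })
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running, `align("audio.mp3", "words.txt")` would return the alignment as a Python dictionary. The `async=false` query parameter is kept from the curl example so that the call waits for the result.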

Example Output

Gentle Output

{
"transcript": "Now, let me looking at the Congress, uh, as one of the institutions in trouble, uh, to some degree, not the same degree as others, perhaps, but still part of the whole mail.",
"words": [
    {
        "alignedWord": "now",
        "case": "success",
        "end": 38.29,
        "endOffset": 3,
        "phones": [
            {
                "duration": 0.12,
                "phone": "n_B"
            },
            {
                "duration": 0.01,
                "phone": "aw_E"
            }
        ],
        "start": 38.16,
        "startOffset": 0,
        "word": "Now"
    },
    {
        "alignedWord": "let",
        "case": "success",
        "end": 38.65,
        "endOffset": 8,
        "phones": [
            {
                "duration": 0.05,
                "phone": "l_B"
            },

    ...

]
}
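
Output in this shape can be reduced to word-level timings with a few lines of Python. The sketch below is illustrative only: the function names `aligned_words` and `phone_total` are hypothetical, the first sample word is copied from the output above, and the second sample word (with a `case` other than `"success"`) is an invented entry that shows how unaligned words might be skipped.

```python
def aligned_words(gentle_output):
    """Return (word, start, end) tuples for words whose case is "success"."""
    return [
        (w["word"], w["start"], w["end"])
        for w in gentle_output.get("words", [])
        if w.get("case") == "success"
    ]


def phone_total(word):
    """Sum the phone durations of one word entry, rounded to 2 decimals."""
    return round(sum(p["duration"] for p in word.get("phones", [])), 2)


sample = {
    "transcript": "Now, let",
    "words": [
        {   # copied from the example output on this page
            "alignedWord": "now",
            "case": "success",
            "end": 38.29,
            "endOffset": 3,
            "phones": [
                {"duration": 0.12, "phone": "n_B"},
                {"duration": 0.01, "phone": "aw_E"},
            ],
            "start": 38.16,
            "startOffset": 0,
            "word": "Now",
        },
        {   # hypothetical unaligned entry, for illustration only
            "case": "not-found-in-audio",
            "endOffset": 8,
            "startOffset": 5,
            "word": "let",
        },
    ],
}

# Print one tab-separated line per successfully aligned word.
for word, start, end in aligned_words(sample):
    print(f"{word}\t{start:.2f}\t{end:.2f}")
```

In the sample, the phone durations of "Now" add up to the word's own duration (`end` minus `start`), which can serve as a quick sanity check on parsed output.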

Other evaluated tools

Tool Name

Official documentation: <link>

Language: 

**Description:**

Cost: <$ OR Free (open source)>

Social impact: 

Notes: 

Installation & requirements

Parameters

Input formats

Example Usage

<tool name> Example

Example Output

<tool name> Output

Evaluation summary

Attachments:

segmentation-workflow.png (image/png)

Document generated by Confluence on Feb 25, 2025 10:39