MGM Forced Alignment - AudiovisualMetadataPlatform/amp_documentation GitHub Wiki
- Category description and use cases
- Output standard
- Recommended tool(s)
- Other evaluated tools
- Evaluation summary
Speech-to-Text > Transcript Editor > Forced Aligner
Summary:
Schema
{
}
Sample Output
{
}
Official documentation: Gentle on GitHub
Language: REST API or Python on command line
**Description:**
Cost: Free (open source)
Social impact:
Notes:
Two options for installation:
- Install the Docker image to run the web server, then use the REST API
- Download the source code and run the bash installation script, then use Gentle as a command-line Python program
Input formats:
Audio (mp3, wav, possibly other formats) and transcript (plain text).
Gentle Example
curl -F "audio=@audio.mp3" -F "transcript=@words.txt" "http://localhost:8765/transcriptions?async=false"
# OR
python3 align.py audio.mp3 words.txt
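The same REST call can be sketched in Python using only the standard library. This is a minimal sketch, not Gentle's own client: it assumes a Gentle server is already listening on localhost:8765 and that the upload form fields are named `audio` and `transcript`. The helper names `build_multipart` and `align` are hypothetical and introduced only for illustration.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(files):
    """Encode {field_name: (filename, bytes)} as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for field, (filename, data) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def align(audio_path, transcript_path, base_url="http://localhost:8765"):
    """POST audio and transcript to a running Gentle server; return parsed JSON.

    Assumption: the form field names are "audio" and "transcript".
    """
    with open(audio_path, "rb") as audio, open(transcript_path, "rb") as text:
        body, content_type = build_multipart({
            "audio": (os.path.basename(audio_path), audio.read()),
            "transcript": (os.path.basename(transcript_path), text.read()),
        })
    request = urllib.request.Request(
        f"{base_url}/transcriptions?async=false",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running, `align("audio.mp3", "words.txt")` would return the alignment as a Python dictionary; `async=false` makes the request block until alignment finishes.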
Gentle Output
{
"transcript": "Now, let me looking at the Congress, uh, as one of the institutions in trouble, uh, to some degree, not the same degree as others, perhaps, but still part of the whole mail.",
"words": [
{
"alignedWord": "now",
"case": "success",
"end": 38.29,
"endOffset": 3,
"phones": [
{
"duration": 0.12,
"phone": "n_B"
},
{
"duration": 0.01,
"phone": "aw_E"
}
],
"start": 38.16,
"startOffset": 0,
"word": "Now"
},
{
"alignedWord": "let",
"case": "success",
"end": 38.65,
"endOffset": 8,
"phones": [
{
"duration": 0.05,
"phone": "l_B"
},
...
]
}
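To show how this output might be consumed, here is a small Python sketch that pulls word timings out of a result and derives per-phone intervals. It assumes that phones within a word are contiguous and begin at the word's `start` time, which is consistent with the sample above (0.12 + 0.01 = 0.13 s, matching 38.16 to 38.29). The function names are hypothetical, and the second entry in `sample` is an invented stand-in for a word that failed to align.

```python
def successful_words(result):
    """Return (word, start, end) for every word whose case is "success"."""
    return [
        (w["word"], w["start"], w["end"])
        for w in result.get("words", [])
        if w.get("case") == "success"
    ]


def phone_intervals(word):
    """Turn a word's phone durations into (phone, start, end) intervals.

    Assumption: phones are back to back and the first one starts at the
    word's start time. Times are rounded to 2 decimals, like the sample.
    """
    intervals = []
    cursor = word["start"]
    for p in word.get("phones", []):
        end = round(cursor + p["duration"], 2)
        intervals.append((p["phone"], cursor, end))
        cursor = end
    return intervals


# First entry copied from the sample output; second entry is illustrative only.
sample = {
    "words": [
        {
            "alignedWord": "now", "case": "success",
            "start": 38.16, "end": 38.29, "word": "Now",
            "phones": [
                {"duration": 0.12, "phone": "n_B"},
                {"duration": 0.01, "phone": "aw_E"},
            ],
        },
        {"case": "not-found-in-audio", "word": "uh"},
    ]
}

if __name__ == "__main__":
    print(successful_words(sample))
    print(phone_intervals(sample["words"][0]))
```

Filtering on `case` first matters because entries that were not aligned carry no `start` or `end` values to read.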
Official documentation: <link>
Language:
**Description:**
Cost: <$ OR Free (open source)>
Social impact:
Notes:
<tool name> Example
<tool name> Output
segmentation-workflow.png (image/png)
Document generated by Confluence on Feb 25, 2025 10:39