Forced Alignment - AudiovisualMetadataPlatform/amp_documentation GitHub Wiki
Forced Alignment
Forced Alignment is the process of adding timestamps to a text transcript, based on the transcript and the audio it transcribes.
Inputs
- Audio (mp3, wav, possibly other formats)
- Corresponding transcript in a text format with no time codes.
Output Formats
- Gentle Transcript (json) - Transcript with time codes in the Gentle delivered json format.
- AMP Transcript Aligned (json) - Aligned transcript in the AMP JSON format.
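As a rough illustration of working with the Gentle Transcript output, the sketch below pulls per-word time codes out of a Gentle-style JSON result. The field names (`words`, `word`, `case`, `start`, `end`) and the `"success"` value are assumptions based on Gentle's commonly documented output and should be checked against real output before use. The function name `extract_word_timings` and the sample data are hypothetical, and this is not the conversion AMP itself performs.

```python
import json


def extract_word_timings(gentle_json):
    """Split a Gentle-style result into aligned and unaligned words.

    Assumes each entry in "words" has a "word" and a "case" field, and
    that entries with case == "success" also carry "start" and "end"
    times in seconds. These field names are assumptions.
    """
    result = json.loads(gentle_json)
    aligned = []    # (word, start_seconds, end_seconds)
    unaligned = []  # words the aligner could not place in the audio
    for entry in result.get("words", []):
        if entry.get("case") == "success":
            aligned.append((entry["word"], entry["start"], entry["end"]))
        else:
            unaligned.append(entry["word"])
    return aligned, unaligned


# Hypothetical sample shaped like a Gentle result, for illustration only.
sample = json.dumps({
    "transcript": "hello big world",
    "words": [
        {"word": "hello", "case": "success", "start": 0.12, "end": 0.48},
        {"word": "big", "case": "not-found-in-audio"},
        {"word": "world", "case": "success", "start": 0.50, "end": 0.97},
    ],
})

aligned, unaligned = extract_word_timings(sample)
```

Keeping unaligned words in a separate list makes it easy to see which parts of the transcript the aligner could not match to the audio.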
MGMs in AMP
Gentle Forced Alignment
The Gentle Forced Alignment MGM takes an AMP transcript and the item's audio file as input and generates an AMP transcript with updated time codes.
Parameters:
- Audio (mp3, wav, possibly other formats) and transcript (plain text).
Notes on Use
- In AMP, this tool was created to correct the time codes of a transcript that has gone through the Human MGM for correction, because the BBC transcript editor used in that process produces corrected speech with wrong time codes.
Use Cases and Example Workflows
Realigning a transcript with misaligned time codes
An item's transcript was corrected by a human using the BBC transcript editor. During correction, the editor had to add several chunks of speech, which the BBC editor did not align with time codes. The Collection Manager (CM) wants the resulting transcript to go through Forced Alignment to correct these problems.
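To make the problem in this use case concrete, the sketch below locates runs of words that have no time codes, which is the kind of gap a corrected transcript might contain before it is realigned. The word structure used here (`text`, `start`, `end`, with `None` for a missing time code) is a hypothetical simplification, not the AMP transcript schema, and `find_unaligned_runs` is an illustrative helper rather than part of AMP.

```python
def find_unaligned_runs(words):
    """Return (first_index, last_index) pairs for runs of untimed words.

    `words` is a list of dicts with hypothetical keys "text", "start",
    and "end"; a word counts as untimed if either time code is None.
    """
    runs = []
    run_start = None
    for i, word in enumerate(words):
        missing = word["start"] is None or word["end"] is None
        if missing and run_start is None:
            run_start = i                    # a new untimed run begins
        elif not missing and run_start is not None:
            runs.append((run_start, i - 1))  # the run ended at the previous word
            run_start = None
    if run_start is not None:
        runs.append((run_start, len(words) - 1))  # run reaches the end
    return runs


# Hypothetical corrected transcript: "quick brown" and "jumps" were added
# by hand and therefore carry no time codes.
corrected = [
    {"text": "the", "start": 0.0, "end": 0.2},
    {"text": "quick", "start": None, "end": None},
    {"text": "brown", "start": None, "end": None},
    {"text": "fox", "start": 0.9, "end": 1.2},
    {"text": "jumps", "start": None, "end": None},
]

runs = find_unaligned_runs(corrected)
```

A non-empty result suggests the transcript is a candidate for the Forced Alignment workflow.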
AMP JSON Output
Schema
Sample output
Attachments:
Forced alignment workflow.png (image/png)
Document generated by Confluence on Feb 25, 2025 10:39