Gentle Forced Alignment - AudiovisualMetadataPlatform/amp_documentation GitHub Wiki
- About
- The Forced Alignment tool is a wrapper around Gentle, a Python tool and API.
- Source Code
- AMP's fork of the Gentle repo: https://github.com/AudiovisualMetadataPlatform/gentle
- Only a slight change was made to the Kaldi installation bash script.
- Singularity container to build the Gentle tool: https://github.com/AudiovisualMetadataPlatform/gentle-singularity
- The Singularity wrapper around Gentle packages the numerous dependencies of the build process. It also removes the need to keep a port open for the API.
- /srv/amp/gentle-singularity/gentle-singularity.sif: Singularity container file for the forced alignment code, with all required dependencies built in
- galaxy/tools/amp_stt/gentle_forced_alignment.py: Python wrapper script that runs forced alignment in the Singularity container
- Dependencies
- All dependencies are included in the Singularity .sif file; no extra installation is needed.
- Usage: See https://github.com/AudiovisualMetadataPlatform/gentle for details on how to install, build, and run the tool.
- Parameters
- $input_audio_file: Input audio file in WAV format
- $input_transcript_file: Input transcript file in AMP STT JSON format
- Output
- amp_transcript: JSON file in AMP Transcript format
- Notes
- In some instances, a word in the input transcript cannot be found in the audio. Gentle then produces a JSON node like this:
  { "case": "not-found-in-audio", "endOffset": 60941, "startOffset": 60937, "word": "type" }
- To accommodate this, with input from the MGM team, we implemented an algorithm that checks how far ahead in the transcript the next "successful" match is.
- We then add the average time per word, (Next Success Start - Last Success End) / (# of words away), to the previous word's time to get the new start/end time for each unfound word.
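
The interpolation described above can be sketched in Python as follows. This is a minimal illustration, not the code in gentle_forced_alignment.py. It rests on several assumptions: each word is a dict with a "case" field and, for successful matches, "start" and "end" times in seconds; "# of words away" is read as the number of unfound words in the run; a run with no earlier match is measured from 0 seconds; and a run with no later match is left unchanged. The function name `fill_unfound_times` is hypothetical.

```python
def fill_unfound_times(words):
    """Return a copy of `words` with estimated start/end times for unfound words.

    Assumed input shape (hypothetical): a list of dicts with a "case" key and,
    when case == "success", numeric "start" and "end" keys in seconds.
    """
    result = [dict(w) for w in words]  # copy so the input is not mutated
    last_end = 0.0  # assumption: measure from 0 s when no earlier match exists
    i = 0
    while i < len(result):
        if result[i].get("case") == "success":
            last_end = result[i]["end"]
            i += 1
            continue
        # Find the end of this run of unfound words.
        j = i
        while j < len(result) and result[j].get("case") != "success":
            j += 1
        if j == len(result):
            break  # no later match to interpolate towards; leave the run as-is
        # Average time per word: (Next Success Start - Last Success End) / count.
        step = (result[j]["start"] - last_end) / (j - i)
        for k in range(i, j):
            result[k]["start"] = last_end + (k - i) * step
            result[k]["end"] = last_end + (k - i + 1) * step
        i = j
    return result


if __name__ == "__main__":
    sample = [
        {"word": "my", "case": "success", "start": 1.0, "end": 2.0},
        {"word": "type", "case": "not-found-in-audio"},
        {"word": "of", "case": "not-found-in-audio"},
        {"word": "thing", "case": "success", "start": 4.0, "end": 4.5},
    ]
    for w in fill_unfound_times(sample):
        print(w["word"], w.get("start"), w.get("end"))
```

With this reading, the gap between the two successful matches is split evenly among the unfound words, so the estimated times never overlap the neighbouring matches.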
Document generated by Confluence on Feb 25, 2025 10:39