INA Speech Segmenter - AudiovisualMetadataPlatform/amp_documentation GitHub Wiki
- About
- The InaSpeechSegmenter is a Python-based tool that takes a binary file as input and produces a list of audio segments. Each segment contains a start, an end, and a label.
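As a rough illustration of the segment structure described above, the sketch below models one segment as a small Python class. It assumes the segmenter yields `(label, start, end)` tuples with times in seconds, as the inaSpeechSegmenter README describes; the `Segment` class, the `to_segments` helper, and the sample values are hypothetical and are not taken from the AMP code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """One labeled stretch of audio (times assumed to be in seconds)."""
    label: str
    start: float
    end: float

    @property
    def duration(self) -> float:
        # Length of the segment, derived from its boundaries.
        return self.end - self.start


def to_segments(raw: List[Tuple[str, float, float]]) -> List[Segment]:
    """Convert (label, start, end) tuples into Segment objects."""
    return [Segment(label, start, end) for label, start, end in raw]


# Hypothetical values for illustration only, not real tool output.
example = to_segments([("music", 0.0, 4.5), ("speech", 4.5, 12.0)])
```

Keeping the three fields in a named structure, rather than passing bare tuples around, makes later steps such as filtering by label or serializing to JSON easier to read.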
- Source Code
- galaxy/tools/ina_speech_segmenter/ina_speech_segmenter.xml
Tool configuration detailing tool execution, input file, output file, and labeling.
- galaxy/tools/ina_speech_segmenter/ina_speech_segmenter_wrapper.py
Python script to call the speech segmenter via its API and conform the JSON to the schema.
- galaxy/tools/ina_speech_segmenter/segmentation_schema.py
Set of classes representing the segmentation schema.
- Dependencies
- The Python script uses the inaSpeechSegmenter tool. The source code, dependencies, and documentation can be found here: https://github.com/ina-foss/inaSpeechSegmenter
- Installation:
$ pip install tensorflow-gpu # for a GPU implementation (recommended if your host has a GPU)
$ pip install tensorflow # for a CPU implementation
# install framework and dependencies
$ pip install inaSpeechSegmenter
- Running the tool
- The tool can be invoked from the Galaxy UI like any other tool. Before running it, the user
needs to use the Get Data / Upload from computer tool to ingest the
input file into Galaxy.
When ingesting, choose binary (the default) as the file format. The file is then copied into a designated location in the Galaxy file system.
- Parameters
- $input_file: the audio file to run the segmentation on.
- $json_file: the output JSON file.
- Output
- JSON file conforming to the schema located here: https://wiki.dlib.indiana.edu/display/AMP/MGM---Segmentation
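To make the output step more concrete, here is a minimal sketch of how segmentation results might be written to the JSON file. It is not the AMP wrapper itself: the function names and the key names (`media`, `filename`, `segments`, `label`, `start`, `end`) are assumptions chosen for illustration, and the authoritative field names are the ones defined by the schema linked above. The input is assumed to be a list of `(label, start, end)` tuples.

```python
import json


def build_output(media_path, raw_segments):
    """Build a JSON-serializable dict from (label, start, end) tuples.

    The key names are hypothetical; the real ones are defined by the
    AMP segmentation schema.
    """
    return {
        "media": {"filename": media_path},
        "segments": [
            {"label": label, "start": start, "end": end}
            for label, start, end in raw_segments
        ],
    }


def write_output(json_file, media_path, raw_segments):
    """Serialize the segmentation result and write it to json_file."""
    with open(json_file, "w", encoding="utf-8") as handle:
        json.dump(build_output(media_path, raw_segments), handle, indent=2)
```

In a Galaxy run, `media_path` and `json_file` would correspond to the `$input_file` and `$json_file` parameters, for example `write_output(json_file, input_file, segments)`.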
Document generated by Confluence on Feb 25, 2025 10:39