Transcript Editor HMGM - AudiovisualMetadataPlatform/amp_documentation GitHub Wiki
Transcript Editor HMGM
The transcript editor is a web-based tool for correcting speech-to-text generated transcripts.
Inputs
- Video or audio file
- amp_transcript file (or one of its variations: amp_transcript_adjusted, amp_transcript_aligned)
Output Formats
- DraftJS_Uncorrected -- the input transcript, converted to the DraftJS format so that the BBC Transcript Editor can work with it.
- DraftJS_Corrected -- the corrected transcript that the BBC Transcript Editor delivers in the DraftJS format, which AMP then converts to the AMP JSON transcript format.
- amp_transcript_corrected -- the AMP JSON transcript with corrections made by the user.
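The conversions between these formats can be illustrated with a minimal Python sketch. It is not AMP's actual converter: the word-level input (`text`, `start`, `end`) is a simplified, hypothetical stand-in for the AMP JSON transcript, and the function names `words_to_draftjs` and `draftjs_to_words` are invented for illustration. Only the outer `blocks` / `entityMap` shape follows the Draft.js raw content layout; the `WORD` entity type and the contents of each entity's `data` are assumptions.

```python
def words_to_draftjs(words, words_per_block=5):
    """Group timed words into Draft.js-style raw blocks, one entity per word.

    `words` is a hypothetical simplified list of dicts with `text`, `start`
    and `end`; it is not the real AMP JSON transcript schema.
    """
    blocks, entity_map = [], {}
    entity_key = 0
    for i in range(0, len(words), words_per_block):
        chunk = words[i:i + words_per_block]
        text_parts, entity_ranges, offset = [], [], 0
        for word in chunk:
            # Timing lives in the entity so it survives text edits.
            entity_map[str(entity_key)] = {
                "type": "WORD",  # assumed entity type
                "mutability": "MUTABLE",
                "data": {"start": word["start"], "end": word["end"]},
            }
            entity_ranges.append(
                {"offset": offset, "length": len(word["text"]), "key": entity_key}
            )
            text_parts.append(word["text"])
            offset += len(word["text"]) + 1  # +1 for the joining space
            entity_key += 1
        blocks.append({
            "key": f"b{len(blocks)}",
            "text": " ".join(text_parts),
            "type": "unstyled",
            "depth": 0,
            "inlineStyleRanges": [],
            "entityRanges": entity_ranges,
            "data": {"start": chunk[0]["start"]},
        })
    return {"blocks": blocks, "entityMap": entity_map}


def draftjs_to_words(draft):
    """Read the (possibly corrected) text back out, keeping the original timings."""
    words = []
    for block in draft["blocks"]:
        for rng in block["entityRanges"]:
            data = draft["entityMap"][str(rng["key"])]["data"]
            text = block["text"][rng["offset"]:rng["offset"] + rng["length"]]
            words.append({"text": text, "start": data["start"], "end": data["end"]})
    return words
```

In this sketch the word text is read from the block while the timings are read from the entity, which is why a corrected word can keep its original timestamps.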
MGMs in AMP
BBC React Transcript Editor
The BBC React Transcript Editor enables users to correct automated transcriptions of audio and video files.
Notes on Use
- The user creates a workflow in Galaxy that includes HMGMs as some of its steps.
- The user invokes the workflow from the AMPPD UI.
- AMPPD checks whether the workflow involves any HMGM. If so, it generates context information for each HMGM as a parameter, then sends a request to Galaxy to run the workflow.
- Galaxy executes the workflow, and when it reaches an HMGM step, it invokes the HMGM tool.
- The HMGM tool then creates a task in the task management tool specified for this unit.
- The task URL is accessible to authorized users. The task page shows the task specification, such as the task type and description, which in turn includes the information passed down in the task context.
- The task assignee can then click the editor URL included in the task details, which opens the AMPPD UI page with the editor embedded. If the assignee has already authenticated into AMP, they should see the editor immediately. If not, they may be asked to enter an "Editor Password", which can be found in the task details.
- In the case of transcript correction, the BBC transcript editor will be presented with the loaded media file and transcript.
- The assignee can use the editor to edit the transcript and play the media as a reference. If AMP logs the assignee out in the background (for example, because the editor was left open in the browser for a long time), the editor prompts for the "Editor Password", which can be found in the task details in the task management tool.
- Upon completion, the assignee clicks the "complete" button in the transcript editor. The editor saves the changes, and AMP passes control back to the HMGM tool.
- The HMGM job then completes and the workflow continues to the next step. When the HMGM job completes, it automatically "closes" the task in the task management tool.
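The task lifecycle described in the steps above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: `InMemoryTaskManager`, `HmgmStep`, `run_once`, and the context keys are all hypothetical names, and the sketch does not reflect AMP's real HMGM tool or any real task management tool's API.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskState(Enum):
    OPEN = "open"
    CLOSED = "closed"


@dataclass
class Task:
    task_type: str
    description: str
    editor_url: str
    state: TaskState = TaskState.OPEN


@dataclass
class InMemoryTaskManager:
    """Hypothetical stand-in for the unit's task management tool."""
    tasks: list = field(default_factory=list)

    def create_task(self, context):
        task = Task(context["task_type"], context["description"], context["editor_url"])
        self.tasks.append(task)
        return task

    def close_task(self, task):
        task.state = TaskState.CLOSED


class HmgmStep:
    """Hypothetical model of one HMGM step in a workflow."""

    def __init__(self, task_manager):
        self.task_manager = task_manager
        self.task = None

    def run_once(self, context, editing_complete):
        """Create the task on first call; close it once editing is complete.

        Returns True when the step is done and the workflow may continue.
        """
        if self.task is None:
            self.task = self.task_manager.create_task(context)
        if not editing_complete:
            return False  # still waiting for the assignee
        self.task_manager.close_task(self.task)
        return True
```

The point of the sketch is that the task is created only once, the step keeps waiting while the assignee is still editing, and the task is closed at the moment the step completes.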
Use Cases and Example Workflows
Use Case: Accessibility
A collection manager wants to create video captions for the interview tapes in their collection for accessibility purposes. To do this efficiently, the collection manager can generate a transcript automatically, then review it as the video plays to ensure it is correct. They send each video through AMP's transcript generation tool, then use the BBC React Transcript Editor to follow along with each video and make the necessary corrections, producing more accurate captions.
AMP JSON Output
See the output for Speech-to-Text.
Attachments:
- Transcript_HMGM.png (image/png)
- TRANSCRIPT_NER_WORKFLOW.png (image/png)
- TranscriptEditorHMGMWF.png (image/png)
Document generated by Confluence on Feb 25, 2025 10:39