Supplemental Files in Workflows - AudiovisualMetadataPlatform/amp_documentation GitHub Wiki

Supplemental Files in Workflows

The Audiovisual Metadata Platform (AMP) facilitates the generation of metadata to support discovery and use of digitized and born-digital audio and moving image collections. While processing A/V content, however, one may have supplementary information about an A/V file that should be used together with its content to reach that goal.

AMP calls these additional files Supplemental Files (SFs). Several types are currently handled by AMP (or are on the AMP roadmap for inclusion) to support specific needs: Facial Recognition, Vocabulary Tagging, and Timecoding Transcripts.

Facial Recognition

AMP currently uses an open-source facial recognition tool that takes as training input a set of image files showing the face to be recognized. Once trained on those photos, the tool processes the video content to find instances of that face.

To provide the training images, gather the image files and zip them together, giving AMP a supplemental file in .zip format. The files should be in a folder named after the person whose photos it contains. When using photos of multiple people, each person should have their own folder, named after them, containing their photos. The zip file should include all of these folders and should not have a parent folder above them. For example, if you have photos of Abraham Lincoln and Queen Victoria and want them in the same file, create two folders: one named Abraham Lincoln, containing Lincoln's training photos, and another named Queen Victoria, containing the queen's training photos. These two folders should be the only folders in the zip file, both at its root level.
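The layout described above can be produced with a short script. The following is a minimal sketch, not part of AMP: the function name `build_training_zip` is hypothetical, and the set of accepted image extensions is an assumption, since this page does not say which image formats the tool accepts.

```python
import zipfile
from pathlib import Path

# Assumption: the page does not list accepted image formats.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_training_zip(people_dir, zip_path):
    """Zip each person's folder so that it sits at the root of the archive.

    `people_dir` contains one sub-folder per person. Entries are stored as
    "<person name>/<photo>", with no parent folder above them.
    Returns the list of archive entry names that were written.
    """
    people_dir = Path(people_dir)
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for person in sorted(p for p in people_dir.iterdir() if p.is_dir()):
            for photo in sorted(person.iterdir()):
                if photo.is_file() and photo.suffix.lower() in IMAGE_SUFFIXES:
                    # The entry name starts at the person's folder, so
                    # people_dir itself never appears inside the zip.
                    arcname = f"{person.name}/{photo.name}"
                    archive.write(photo, arcname)
                    written.append(arcname)
    return written
```

Calling `build_training_zip("faces", "faces.zip")` on a directory that holds the Abraham Lincoln and Queen Victoria folders would store entries such as `Abraham Lincoln/<photo name>` directly at the root of the archive. Zipping the enclosing directory from a file manager, by contrast, often adds the parent folder that AMP does not expect.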

Example of workflow using Facial Recognition (with a contact sheet MGM for a user-friendly output):

Vocabulary Tagging

Two use cases prompted the project team to build a tool that takes as input a transcript (often the output of a speech-to-text tool such as Kaldi or AWS Transcribe) and a list of terms to search for, and notifies the user of each occurrence of a listed term in the transcript:

Identifying harmful language that one does not want to mask in the text but does want to warn the user about.
Identifying content where certain entity names or specific words are used.

The file should be in .txt format and should contain one term to be identified per line.
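To make the file format concrete, here is a small sketch of how a one-term-per-line list could be parsed and matched against a transcript. It is an illustration only and not the AMP tool itself: `load_terms` and `find_terms` are hypothetical names, and whole-word, case-insensitive matching is an assumption about reasonable behavior rather than a description of what AMP does.

```python
import re


def load_terms(text):
    """Parse the contents of a vocabulary file: one term per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def find_terms(transcript, terms):
    """Return (term, character offset) pairs, ordered by position.

    Assumption: matches are whole-word and case-insensitive.
    """
    hits = []
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        hits.extend((term, m.start()) for m in pattern.finditer(transcript))
    return sorted(hits, key=lambda hit: hit[1])
```

A term may span several words (for example an entity name such as Queen Victoria), which is why the sketch treats each line as a single term instead of splitting it on spaces.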

Example of Workflow using Vocabulary Tagging:

Timecoding Transcripts

[Note: This use case is not yet fully supported in the AMP 1.0.0 release, because the Gentle forced alignment step does not recognize that the .txt supplement is in the appropriate format.]

An Item in your collection may already have a good transcript that lacks time codes. To synchronize the transcript with the A/V content during playback, the transcript needs time codes. AMP offers an MGM based on the Gentle Forced Alignment tool, which takes the audio file and the non-timecoded transcript as input, aligns the two, and generates a timecoded transcript as output.

Example of Workflow using Timecoding of Transcripts using Gentle Forced Alignment:

How to provide Supplemental Files to AMP

Supplemental files can be added to AMP either via the Batch Ingest process or through the UI provided for this purpose (currently in development; mock-ups are available).

When uploading a Supplemental File to AMP, one needs to specify which category of file it is: is it training photos for Facial Recognition? A list of terms for Vocabulary Tagging? A non-timecoded transcript to go through forced alignment and timecode generation?

Also, the user has the opportunity to tell AMP whether that particular Supplemental file is to be made available for a specific Content file, or for all content files within a specific Item, or for all content files within a given Collection, or for all content files within a certain Unit. AMP also allows multiple Supplemental files for the same category; when that is the case, the user will need to tell AMP which one to use when running the A/V file through a workflow that takes a Supplemental file of that category.

An SF must be uploaded to AMP before it can be used in a workflow.

How to use Supplemental Files in AMP

If a user wants to run a workflow (WF) that requires a Supplemental File as input, AMP will do its best to figure out which Supplemental File to use. For example, if the WF includes a Facial Recognition (FR) step, AMP will need the Supplemental File containing the images used to train the FR tool. If AMP finds exactly one SF of category FR available to this content file (that is, attached somewhere in its hierarchy: the content file itself (PF), its Item, its Collection, or its Unit), AMP will use that file.

If, however, AMP finds multiple SFs of that category available to the Content File being submitted to the WF, AMP will ask the user which one to use. When no SFs are available, AMP will not submit the PF to the WF and will report the failure to the user.
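The selection rule described above can be summarized in a few lines of code. This is a sketch of the logic as stated on this page, not AMP's actual implementation: the `Supplement` class, the `ancestry` mapping, the identifiers, and the `"use"` / `"ask"` / `"fail"` labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Supplement:
    name: str
    category: str   # e.g. "Face Recognition"
    scope: str      # "PF", "Item", "Collection" or "Unit"
    scope_id: str   # id of the entity the SF is attached to


def available_supplements(supplements, category, ancestry):
    """Return the SFs of `category` attached anywhere in the file's hierarchy.

    `ancestry` maps each scope to the id of the entity the content file
    belongs to, e.g. {"PF": "pf-1", "Item": "item-7", ...}.
    """
    return [
        sf for sf in supplements
        if sf.category == category and ancestry.get(sf.scope) == sf.scope_id
    ]


def resolve_supplement(supplements, category, ancestry):
    """Return ("use", sf), ("ask", candidates) or ("fail", None)."""
    candidates = available_supplements(supplements, category, ancestry)
    if len(candidates) == 1:
        return "use", candidates[0]      # exactly one match: use it
    if candidates:
        return "ask", candidates         # several matches: prompt the user
    return "fail", None                  # none: do not submit, report failure
```

The three outcomes correspond to the three cases in the text: a single match is used automatically, several matches lead to a prompt, and no match stops the submission.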

Suppose the user is submitting the Content File Lunchroom Manners to the WF Test Facial Recognition. The submission page will look like this:

When the user clicks on Submit to workflow, AMP will determine that the selected WF requires an SF of category Face Recognition and will check whether such an SF has already been uploaded and is available for Lunchroom Manners. If there is one and only one, AMP will submit the PF to the WF with that SF as input. If, however, there are multiple possible SFs, AMP will prompt the user to pick one of them:

Note that this modal page has a toggle at the bottom. The toggle helps when multiple PFs are submitted to the same WF: when it is on, AMP reuses the same selection for every PF for which that SF is a valid choice. For any PF where that SF is not a valid choice, AMP follows the logic described above to determine which SF to use.
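The effect of the toggle on a batch submission might be sketched as follows. This is an illustration of the behavior described above under stated assumptions, not AMP code: `plan_batch`, its parameters, and the outcome labels are hypothetical, and SFs are represented only by their names.

```python
def plan_batch(candidates_by_file, chosen, apply_to_all):
    """Decide, per content file, which SF to use or what remains to be done.

    `candidates_by_file` maps a content file name to the list of SF names
    available to it. `chosen` is the SF the user picked in the prompt, and
    `apply_to_all` is the state of the toggle.
    """
    plan = {}
    for content_file, candidates in candidates_by_file.items():
        if apply_to_all and chosen in candidates:
            plan[content_file] = ("use", chosen)          # reuse the selection
        elif len(candidates) == 1:
            plan[content_file] = ("use", candidates[0])   # only one option
        elif candidates:
            plan[content_file] = ("ask", None)            # prompt the user
        else:
            plan[content_file] = ("fail", None)           # nothing available
    return plan
```

With the toggle on, a file for which the chosen SF is not available still falls back to the ordinary rule: it uses its single available SF, triggers a prompt when it has several, or fails when it has none.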

Adding a Supplemental node to a Workflow

The Supplemental File node is found under the Get Data heading in the Workflow Editor:

Once you add that node to the workflow, you specify its attributes in the right-hand panel. For instance, for the Transcript, this is what it looks like:


Document generated by Confluence on Feb 25, 2025 10:39