Azure Video OCR - AudiovisualMetadataPlatform/amp_documentation GitHub Wiki

Azure Video OCR

  • About
    • Azure Video OCR takes the results produced by Azure Video Indexer, including the main indexer file and the video OCR artifacts, and generates the standard AMP Video OCR JSON.
    • It is available as a tool in AMP's Galaxy and performs video OCR on the input videos.
    • The output is a JSON file containing the recognized text and the corresponding bounding box information for each frame in the input.
    • If the "dedupe" option is checked, the tool also generates an AMP Video OCR JSON with duplicate frames removed, i.e. consecutive frames with the same texts within the specified period.
  • Source Code
    • galaxy/tools/amp_vocr/azure_video_ocr.xml: The configuration file that describes the tool's usage, inputs, outputs, version, and other settings.
    • galaxy/tools/amp_vocr/azure_video_ocr.py: A Python wrapper that generates the AMP Video OCR JSON from the Azure Video Index JSON and the Video OCR Artifacts JSON output by Azure Video Indexer.
    • galaxy/tools/amp_schema/video_ocr.py: Classes used to construct the AMP Video OCR JSON output.
  • Running
    • The tool can be invoked from the Galaxy UI like any other tool. It must be used as the next MGM after Azure Video Indexer, taking that tool's outputs, with the include_ocr flag set to true.
  • Parameters
    • input_video: The same input video file used by Azure Video Indexer
    • azure_video_index: Azure Video Index JSON output from Azure Video Indexer
    • azure_artifact_ocr: Azure Artifact OCR JSON output from Azure Video Indexer
    • dedupe: Whether to dedupe consecutive frames with the same texts
    • period: Period in seconds within which consecutive frames with the same texts are treated as duplicates
  • Outputs
    • amp_vocr: The standardized AMP Video OCR JSON
    • amp_vocr_dedupe: The AMP Video OCR JSON with duplicate frames removed
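
The dedupe and period parameters described above can be illustrated with a minimal sketch. This is not the code in azure_video_ocr.py: the function name dedupe_frames, the frame fields "start" and "texts", and the choice to measure the period from the last kept frame are all assumptions made for illustration only.

```python
from typing import Dict, List


def dedupe_frames(frames: List[Dict], period: float) -> List[Dict]:
    """Drop consecutive frames whose texts match the last kept frame
    and whose start time falls within `period` seconds of it.

    Hypothetical frame shape: {"start": <seconds>, "texts": [<str>, ...]}.
    """
    kept: List[Dict] = []
    for frame in frames:
        if kept:
            last = kept[-1]
            same_texts = frame["texts"] == last["texts"]
            # Assumption: the period is measured from the last kept frame.
            within_period = frame["start"] - last["start"] <= period
            if same_texts and within_period:
                continue  # duplicate of the last kept frame, skip it
        kept.append(frame)
    return kept


# Made-up sample frames, purely for demonstration.
frames = [
    {"start": 0.0, "texts": ["BREAKING NEWS"]},
    {"start": 1.0, "texts": ["BREAKING NEWS"]},
    {"start": 2.0, "texts": ["BREAKING NEWS"]},
    {"start": 3.0, "texts": ["WEATHER"]},
    {"start": 9.0, "texts": ["WEATHER"]},
]
deduped = dedupe_frames(frames, period=5.0)
print([f["start"] for f in deduped])
```

In this sketch a frame with unchanged text is kept again once the period has elapsed, so long-running on-screen text still appears at intervals in the deduplicated output. Whether the actual tool behaves this way should be checked against the source code listed above.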

Document generated by Confluence on Feb 25, 2025 10:39
