
MGM - Applause Detection

Category description and use cases

Workflow example: see the attached segmentation-workflow.png.

Output standard

Summary: An array of segments, each with a label, a start time, and an end time. Start and end are timestamps in seconds; the label is either "applause" or "non-applause".


| Element | Datatype | Obligation | Definition |
| --- | --- | --- | --- |
| media | object | required | Wrapper for metadata about the source media file. |
| media.filename | string | required | Filename of the source file. |
| media.duration | string | required | Duration of the source file audio. |
| segments | array | required | Wrapper for the applause and non-applause segments. |
| segments[*] | object | optional | A single applause or non-applause segment. |
| segments[*].label | string | required | The type of segment: "applause" or "non-applause". |
| segments[*].start | string | required | Start time in seconds. |
| segments[*].end | string | required | End time in seconds. |


JSON Schema


```json
{
    "$schema": "http://json-schema.org/schema#",
    "type": "object",
    "title": "Applause Detection Schema",
    "required": [
        "media",
        "segments"
    ],
    "properties": {
        "media": {
            "type": "object",
            "title": "Media",
            "description": "Wrapper for metadata about the source media file.",
            "required": [
                "filename",
                "duration"
            ],
            "properties": {
                "filename": {
                    "type": "string",
                    "title": "Filename",
                    "description": "Filename of the source file.",
                    "default": "",
                    "examples": [
                        "myfile.wav"
                    ]
                },
                "duration": {
                    "type": "string",
                    "title": "Duration",
                    "description": "Duration of the source file audio.",
                    "default": "",
                    "examples": [
                        "25.888"
                    ]
                }
            }
        },
        "segments": {
            "type": "array",
            "title": "Segments",
            "description": "Segments of silence, speech, or audio.",
            "items": {
                "type": "object",
                "required": [
                    "label",
                    "start",
                    "end"         
                ],
                "properties": {
                    "label": {
                      "type": "string",
                      "title": "Label",
                      "description": "The type of sound",
                      "enum": [
                          "applause",
                          "non-applause"
                      ]
                    }
                }
            }
        }
    }
}
```

Sample output


```json
{
  "media": {
    "filename": "name.wav",
    "duration": "300"
  },
  "segments":[
    {
        "label": "non-applause",
        "start": 0.0,
        "end": 198.37
    },
    {
        "label": "applause",
        "start": 198.38,
        "end": 206.04
    }
  ]
}
```
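
For reference, a minimal Python sketch (not part of the AMP tooling) that validates an output file against the schema above with the `jsonschema` package; the file names here are placeholders:

```python
import json

from jsonschema import validate  # pip install jsonschema

# Placeholder file names: the schema above saved to disk, plus one output file.
with open("applause_detection_schema.json") as f:
    schema = json.load(f)
with open("applause_output.json") as f:
    output = json.load(f)

# Raises jsonschema.exceptions.ValidationError if the output does not conform.
validate(instance=output, schema=schema)

# Example use of the segments: total seconds of applause in the file.
# float() keeps this working whether start/end are numbers or numeric strings.
total_applause = sum(
    float(seg["end"]) - float(seg["start"])
    for seg in output["segments"]
    if seg["label"] == "applause"
)
print(f"{output['media']['filename']}: {total_applause:.2f} s of applause")
```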

Recommended tool(s)

Acoustic Classification Segmentation (custom)

Official documentation: https://github.com/lizfischer/acoustic-classification-segmentation

Language: Python

**Description:** A TensorFlow implementation of speech, music, noise, silence, and applause segmentation for audio files; forked from the Brandeis Lab for Linguistics & Computation.

Cost: Free (open source)

Installation & requirements

Clone the repository (link above) and run `pip install -r requirements.txt`.

Requires ffmpeg, and Python 3 with the following libraries:

```
librosa==0.7.2
numpy==1.17.4
numpydoc==0.9.2
scipy==1.4.1
scikit-learn==0.22.1
ffmpeg-python==0.2.0
tensorflow>=2.0.1
```

Parameters

Input formats

mp3, wav, or mp4

Example Usage

Note: This tool runs over all mp3, mp4, or wav files in the input directory; it does not take a single file as input.

Acoustic Classification Segmentation Example

```
python run.py -s pretrained/applause-binary-20210203 /path/to/media/folder -o /path/to/output/folder -T 1000 -b
```
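
The same invocation can also be scripted. Below is a minimal sketch using Python's `subprocess`, assuming the repository has been cloned and the placeholder paths are replaced with real locations:

```python
import subprocess
from pathlib import Path

# Placeholder paths; the tool processes every mp3, mp4, or wav file in media_dir.
media_dir = Path("/path/to/media/folder")
output_dir = Path("/path/to/output/folder")
output_dir.mkdir(parents=True, exist_ok=True)

# Mirrors the command-line example above; run from the root of the cloned repository.
subprocess.run(
    [
        "python", "run.py",
        "-s", "pretrained/applause-binary-20210203",
        str(media_dir),
        "-o", str(output_dir),
        "-T", "1000",
        "-b",
    ],
    check=True,
)
```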

Example Output

Acoustic Classification Segmentation Output

```json
[
    {
        "label": "non-applause",
        "start": 0.0,
        "end": 0.64
    },
    {
        "label": "applause",
        "start": 0.65,
        "end": 6.78
    },
    {
        "label": "non-applause",
        "start": 6.79,
        "end": 373.83
    },
    {
        "label": "applause",
        "start": 373.84,
        "end": 379.55
    },
    {
        "label": "non-applause",
        "start": 379.56,
        "end": 384.52
    },
    {
        "label": "applause",
        "start": 384.53,
        "end": 390.34
    },
    {
        "label": "non-applause",
        "start": 390.35,
        "end": 430.69
    },
    {
        "label": "applause",
        "start": 430.7,
        "end": 433.98
    },
    {
        "label": "non-applause",
        "start": 433.99,
        "end": 963.03
    },
    {
        "label": "applause",
        "start": 963.04,
        "end": 982.04
    },
    {
        "label": "non-applause",
        "start": 982.05,
        "end": 1388.61
    },
    {
        "label": "applause",
        "start": 1388.62,
        "end": 1398.6
    },
    {
        "label": "non-applause",
        "start": 1398.61,
        "end": 1799.13
    },
    {
        "label": "applause",
        "start": 1799.14,
        "end": 1807.36
    },
    {
        "label": "non-applause",
        "start": 1807.37,
        "end": 1857.13
    },
    {
        "label": "applause",
        "start": 1857.14,
        "end": 1864.86
    },
    {
        "label": "non-applause",
        "start": 1864.87,
        "end": 1901.45
    }
]
```
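
Note that this raw tool output is only the segment list; to satisfy the output standard above it still needs the `media` wrapper. A rough sketch of that wrapping step (the function name and file paths are illustrative, and the duration is read with `librosa`, which is already in the tool's requirements):

```python
import json

import librosa  # pinned to 0.7.2 in the tool's requirements


def wrap_segments(audio_path: str, raw_segments_path: str, out_path: str) -> None:
    """Wrap a raw segment list in the applause detection output standard."""
    # librosa 0.7.x uses the `filename` keyword for get_duration.
    duration = librosa.get_duration(filename=audio_path)

    with open(raw_segments_path) as f:
        segments = json.load(f)

    result = {
        "media": {
            "filename": audio_path,
            # The standard stores duration as a string (see the table above).
            "duration": str(round(duration, 3)),
        },
        "segments": segments,
    }

    with open(out_path, "w") as f:
        json.dump(result, f, indent=2)


# Illustrative usage:
# wrap_segments("myfile.wav", "myfile.json", "myfile-amp.json")
```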

Other evaluated tools

YAMNet

Official documentation: <link>

Language: 

**Description:**

Cost: <$ OR Free (open source)>

Social impact: 

Notes: 

Installation & requirements

Parameters

Input formats

Example Usage

YAMNet Example

Example Output

YAMNet Output

Evaluation summary

Attachments:

segmentation-workflow.png (image/png)
