MGM Applause Detection
- Category description and use cases
- Output standard
- Recommended tool(s)
- Other evaluated tools
- Evaluation summary
Summary: An array of segments, each with a label, a start time, and an end time. Start and end are timestamps in seconds; the label is either "applause" or "non-applause."
| Element | Datatype | Obligation | Definition |
| --- | --- | --- | --- |
| media | object | required | Wrapper for metadata about the source media file. |
| media.filename | string | required | Filename of the source file. |
| media.duration | string | required | The duration of the source file audio. |
| segments | array | required | Wrapper for the applause and non-applause segments. |
| segments[*] | object | optional | A single applause or non-applause segment. |
| segments[*].label | string | required | The type of segment: applause or non-applause. |
| segments[*].start | string | required | Start time in seconds. |
| segments[*].end | string | required | End time in seconds. |
Schema
{
"$schema": "http://json-schema.org/schema#",
"type": "object",
"title": "Applause Detection Schema",
"required": [
"media",
"segments"
],
"properties": {
"media": {
"type": "object",
"title": "Media",
"description": "Wrapper for metadata about the source media file.",
"required": [
"filename",
"duration"
],
"properties": {
"filename": {
"type": "string",
"title": "Filename",
"description": "Filename of the source file.",
"default": "",
"examples": [
"myfile.wav"
]
},
"duration": {
"type": "string",
"title": "Duration",
"description": "Duration of the source file audio.",
"default": "",
"examples": [
"25.888"
]
}
}
},
"segments": {
"type": "array",
"title": "Segments",
"description": "Segments of silence, speech, or audio.",
"items": {
"type": "object",
"required": [
"label",
"start",
"end"
],
"properties": {
"label": {
"type": "string",
"title": "Label",
"description": "The type of sound",
"enum": [
"applause",
"non-applause"
]
}
}
}
}
}
}
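Output files can be checked against this schema programmatically. A minimal sketch, assuming the schema above has been saved to a local file and that the `jsonschema` package is installed (it is not part of the tool's requirements); both file names below are hypothetical:

```python
# Validation sketch: check an output file against the Applause Detection schema.
# Assumes the schema above is saved as "applause_detection.schema.json" and that
# the jsonschema package is installed; both file names are hypothetical.
import json

from jsonschema import validate

with open("applause_detection.schema.json") as f:
    schema = json.load(f)

with open("name_applause.json") as f:
    output = json.load(f)

# Raises jsonschema.exceptions.ValidationError if the output does not conform.
validate(instance=output, schema=schema)
```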
Sample Output
{
"media": {
"filename": "name.wav",
"duration": "300"
},
"segments":[
{
"label": "non-applause",
"start": 0.0,
"end": 198.37
},
{
"label": "applause",
"start": 198.38,
"end": 206.04
}
]
}
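For illustration, the segments in an output file that follows this standard can be summarized with a few lines of Python (the file name below is hypothetical):

```python
# Sketch: total the applause time in an output file that follows the standard above.
import json

with open("name_applause.json") as f:   # hypothetical output file name
    result = json.load(f)

applause = [s for s in result["segments"] if s["label"] == "applause"]
total_seconds = sum(float(s["end"]) - float(s["start"]) for s in applause)

print(f"{result['media']['filename']}: {len(applause)} applause segment(s), "
      f"{total_seconds:.2f} seconds of applause")
```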
Recommended tool: acoustic-classification-segmentation
Official documentation: https://github.com/lizfischer/acoustic-classification-segmentation
Language: Python
**Description:** A TensorFlow implementation of speech, music, noise, silence, and applause segmentation for audio files; forked from the Brandeis Lab for Linguistics & Computation.
Cost: Free (open source)
Clone the repository (link above) and run `pip install -r requirements.txt`.
Requires ffmpeg and Python 3 with the following libraries:
librosa==0.7.2
numpy==1.17.4
numpydoc==0.9.2
scipy==1.4.1
scikit-learn==0.22.1
ffmpeg-python==0.2.0
tensorflow>=2.0.1
Input formats
mp3, wav, or mp4
Note: This tool runs over all mp3, mp4, or wav files in the input directory; it does not take a single file as input (a single-file workaround is sketched after the example command below).
acoustic-classification-segmentation Example
python run.py -s pretrained/applause-binary-20210203 /path/to/media/folder -o /path/to/output/folder -T 1000 -b
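Because run.py only accepts a directory, one way to segment a single file is to stage it in a temporary folder before invoking the tool. A sketch, assuming it is run from the cloned repository root and reusing the flags from the command above (paths and file names are placeholders):

```python
# Sketch: run the segmenter on one file by staging it in a temporary directory.
# Assumes execution from the repository root and reuses the flags shown in the
# example command above; adjust paths and model name as needed.
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

def segment_single_file(media_file, output_dir):
    media_file = Path(media_file)
    with tempfile.TemporaryDirectory() as staging:
        # Copy the one file into an otherwise empty directory for run.py to scan.
        shutil.copy(media_file, staging)
        subprocess.run(
            [sys.executable, "run.py",
             "-s", "pretrained/applause-binary-20210203",
             staging,
             "-o", str(output_dir),
             "-T", "1000",
             "-b"],
            check=True,
        )

segment_single_file("/path/to/media/name.wav", "/path/to/output/folder")
```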
acoustic-classification-segmentation Output
[
{
"label": "non-applause",
"start": 0.0,
"end": 0.64
},
{
"label": "applause",
"start": 0.65,
"end": 6.78
},
{
"label": "non-applause",
"start": 6.79,
"end": 373.83
},
{
"label": "applause",
"start": 373.84,
"end": 379.55
},
{
"label": "non-applause",
"start": 379.56,
"end": 384.52
},
{
"label": "applause",
"start": 384.53,
"end": 390.34
},
{
"label": "non-applause",
"start": 390.35,
"end": 430.69
},
{
"label": "applause",
"start": 430.7,
"end": 433.98
},
{
"label": "non-applause",
"start": 433.99,
"end": 963.03
},
{
"label": "applause",
"start": 963.04,
"end": 982.04
},
{
"label": "non-applause",
"start": 982.05,
"end": 1388.61
},
{
"label": "applause",
"start": 1388.62,
"end": 1398.6
},
{
"label": "non-applause",
"start": 1398.61,
"end": 1799.13
},
{
"label": "applause",
"start": 1799.14,
"end": 1807.36
},
{
"label": "non-applause",
"start": 1807.37,
"end": 1857.13
},
{
"label": "applause",
"start": 1857.14,
"end": 1864.86
},
{
"label": "non-applause",
"start": 1864.87,
"end": 1901.45
}
]
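Note that this raw output is a bare array of segments; producing the output standard described above still requires wrapping it with the media block. A sketch of that step, using librosa 0.7's `get_duration(filename=...)` to obtain the duration (librosa is already in the requirements); the file names below are hypothetical:

```python
# Sketch: wrap the tool's raw segment array in the output standard shown above.
# File names are hypothetical; librosa 0.7.x is already in the requirements.
import json

import librosa

media_path = "name.wav"
with open("name_segments.json") as f:   # raw tool output (array of segments)
    segments = json.load(f)

amp_output = {
    "media": {
        "filename": media_path,
        # The standard stores duration as a string of seconds.
        "duration": str(librosa.get_duration(filename=media_path)),
    },
    "segments": segments,
}

with open("name_applause.json", "w") as f:
    json.dump(amp_output, f, indent=2)
```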
Official documentation: <link>
Language:
**Description:**
Cost: <$ OR Free (open source)>
Social impact:
Notes:
<tool name> Example
<tool name> Output