MGM Shot Detection - AudiovisualMetadataPlatform/amp_documentation GitHub Wiki
- Category description and use cases
- Output standard
- Recommended tool(s)
- Other evaluated tools
- Evaluation summary
Summary:
Element | Datatype | Obligation | Definition |
media | object | required | Wrapper for metadata about the source media file. |
media.filename | string | required | Filename of the source file. |
media.duration | string | required | The duration of the source file. |
shots | array | required | The list of shots in the video. |
shots[*] | object | optional | A shot in a video. |
shots[*].type | string | required | The type of shot, "scene" or "shot". |
shots[*].start | string | required | The start time in seconds (s.fff). |
shots[*].end | string | required | The end time in seconds (s.fff). |
Schema
{
  "$schema": "http://json-schema.org/schema#",
  "type": "object",
  "title": "Shot Detection Schema",
  "required": [
    "media",
    "shots"
  ],
  "properties": {
    "media": {
      "type": "object",
      "title": "Media",
      "description": "Wrapper for metadata about the source media file.",
      "required": [
        "filename",
        "duration"
      ],
      "properties": {
        "filename": {
          "type": "string",
          "title": "Filename",
          "description": "Filename of the source file.",
          "default": "",
          "examples": [
            "myfile.wav"
          ]
        },
        "duration": {
          "type": "string",
          "title": "Duration",
          "description": "Duration of the source file.",
          "default": "",
          "examples": [
            "25.888"
          ]
        }
      }
    },
    "shots": {
      "type": "array",
      "title": "Shots",
      "description": "The shots and/or scenes in a video.",
      "items": {
        "type": "object",
        "required": [
          "type",
          "start",
          "end"
        ],
        "additionalProperties": false,
        "properties": {
          "type": {
            "type": "string",
            "title": "Type",
            "description": "The type of shot, 'scene' or 'shot'.",
            "enum": [
              "scene",
              "shot"
            ]
          },
          "start": {
            "type": "string",
            "title": "Start",
            "description": "Start time in seconds",
            "default": 0.0,
            "examples": [
              "123.45"
            ]
          },
          "end": {
            "type": "string",
            "title": "End",
            "description": "End time in seconds",
            "default": 0.0,
            "examples": [
              "123.45"
            ]
          }
        }
      }
    }
  }
}
Sample Output
{
  "media": {
    "filename": "myvideo.mp4",
    "duration": "45.35"
  },
  "shots": [
    {
      "type": "scene",
      "start": "0.0",
      "end": "45.35"
    },
    {
      "type": "shot",
      "start": "0.0",
      "end": "10.89"
    },
    {
      "type": "shot",
      "start": "10.89",
      "end": "19.4"
    },
    {
      "type": "shot",
      "start": "19.4",
      "end": "45.35"
    }
  ]
}
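To make the field obligations in the output standard concrete, here is a minimal sketch of a structural check using only the Python standard library. It is not the AMP project's validator: the function name `check_shot_output` and the wording of the messages are hypothetical, and it covers only the required keys, the string datatypes, the "scene"/"shot" enumeration, and the ban on extra shot properties. A full JSON Schema validator would be the more complete option.

```python
import json

ALLOWED_TYPES = {"scene", "shot"}
SHOT_KEYS = {"type", "start", "end"}


def check_shot_output(doc):
    """Return a list of problems found in a shot detection document (empty if none).

    Hypothetical helper for illustration; it mirrors only part of the schema.
    """
    problems = []

    media = doc.get("media")
    if not isinstance(media, dict):
        problems.append("media: missing or not an object")
    else:
        for key in ("filename", "duration"):
            if not isinstance(media.get(key), str):
                problems.append(f"media.{key}: missing or not a string")

    shots = doc.get("shots")
    if not isinstance(shots, list):
        problems.append("shots: missing or not an array")
        return problems

    for i, shot in enumerate(shots):
        if not isinstance(shot, dict):
            problems.append(f"shots[{i}]: not an object")
            continue
        if shot.get("type") not in ALLOWED_TYPES:
            problems.append(f"shots[{i}].type: must be 'scene' or 'shot'")
        for key in ("start", "end"):
            if not isinstance(shot.get(key), str):
                problems.append(f"shots[{i}].{key}: missing or not a string")
        # The schema sets additionalProperties to false for each shot.
        extra = set(shot) - SHOT_KEYS
        if extra:
            problems.append(f"shots[{i}]: unexpected keys {sorted(extra)}")

    return problems


# A shortened version of the sample output, parsed from JSON text.
sample = json.loads("""
{
  "media": {"filename": "myvideo.mp4", "duration": "45.35"},
  "shots": [
    {"type": "scene", "start": "0.0", "end": "45.35"},
    {"type": "shot", "start": "0.0", "end": "10.89"}
  ]
}
""")
print(check_shot_output(sample))  # prints []
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass.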
Official documentation: https://pyscenedetect.readthedocs.io/
Language: Python
Description: PySceneDetect is an open-source command-line application and Python library for detecting scene changes in videos.
Cost: Free (open source)
Social impact:
Notes: See also this Colab notebook.
Can be installed via pip: `pip install scenedetect`
Input formats: mp4; other formats are likely supported but have not been confirmed.
It can be used as a Python library (we do not have an example of this) or run from the command line, with the output then reshaped to our JSON format:
pyscenedetect Command Line Example
scenedetect --input "video.mp4" --output output/path detect-content
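For the Python library route mentioned in the notes, the following is a minimal sketch rather than the evaluated workflow. It assumes PySceneDetect 0.6 or later, where `detect` and `ContentDetector` can be imported from `scenedetect` and each detected scene is a (start, end) pair of timecode objects with a `get_seconds()` method. The helper names `spans_to_shot_json` and `detect_shots` are hypothetical, as are two mapping choices: every PySceneDetect scene is written out as a "shot", and the media duration is taken from the end of the last shot.

```python
import os


def spans_to_shot_json(filename, spans):
    """Build the shot detection structure from (start_seconds, end_seconds) pairs."""
    shots = [
        {
            "type": "shot",  # assumption: PySceneDetect "scenes" map to our "shot" type
            "start": str(round(float(start), 3)),
            "end": str(round(float(end), 3)),
        }
        for start, end in spans
    ]
    # Assumption: the last shot ends where the media ends. If no shots are
    # found this falls back to "0.0", which is not the real duration.
    duration = shots[-1]["end"] if shots else "0.0"
    return {"media": {"filename": filename, "duration": duration}, "shots": shots}


def detect_shots(video_path):
    """Run PySceneDetect's content detector and reshape the result."""
    # Imported lazily so spans_to_shot_json works without PySceneDetect installed.
    from scenedetect import detect, ContentDetector

    scene_list = detect(video_path, ContentDetector())
    spans = [(start.get_seconds(), end.get_seconds()) for start, end in scene_list]
    return spans_to_shot_json(os.path.basename(video_path), spans)


# Example call (requires PySceneDetect and a real video file):
#   import json
#   print(json.dumps(detect_shots("video.mp4"), indent=2))
```

Keeping the reshaping step separate from the detection call means the JSON mapping can be checked without a video file.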
Example Output
The default output is a CSV file. Its first row lists the scene-ending timecodes, and the second row begins a table of timecode information for each scene (see below). The PySceneDetect Colab notebook includes code for reshaping this output into our standard JSON.
Timecode List: | 0:00:11 | 0:00:19 | 0:00:26 | 0:00:28 | 0:00:30 | 0:00:35 | 0:00:36 | 0:00:37 |
Scene Number | Start Frame | Start Timecode | Start Time (seconds) | End Frame | End Timecode | End Time (seconds) | Length (frames) | Length (timecode) | Length (seconds) |
1 | 0 | 0:00:00 | 0 | 332 | 0:00:11 | 11.067 | 332 | 0:00:11 | 11.067 |
2 | 332 | 0:00:11 | 11.067 | 577 | 0:00:19 | 19.233 | 245 | 0:00:08 | 8.167 |
3 | 577 | 0:00:19 | 19.233 | 770 | 0:00:26 | 25.667 | 193 | 0:00:06 | 6.433 |
4 | 770 | 0:00:26 | 25.667 | 844 | 0:00:28 | 28.133 | 74 | 0:00:02 | 2.467 |
5 | 844 | 0:00:28 | 28.133 | 914 | 0:00:30 | 30.467 | 70 | 0:00:02 | 2.333 |
6 | 914 | 0:00:30 | 30.467 | 1045 | 0:00:35 | 34.833 | 131 | 0:00:04 | 4.367 |
7 | 1045 | 0:00:35 | 34.833 | 1084 | 0:00:36 | 36.133 | 39 | 0:00:01 | 1.3 |
8 | 1084 | 0:00:36 | 36.133 | 1114 | 0:00:37 | 37.133 | 30 | 0:00:01 | 1 |
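The reshaping step can be sketched as follows. This is not the code from the Colab notebook. It assumes the CSV layout shown above: a first row beginning with "Timecode List:", followed by a header row containing "Start Time (seconds)" and "End Time (seconds)". The function name `csv_to_shot_json` is hypothetical, each CSV row is written out as a "shot", and the media duration is assumed to equal the end time of the last row.

```python
import csv


def csv_to_shot_json(csv_text, filename):
    """Reshape PySceneDetect scene-list CSV text into the shot detection structure."""
    lines = csv_text.splitlines()
    # Skip the "Timecode List:" summary row so the header row comes first.
    if lines and lines[0].startswith("Timecode List:"):
        lines = lines[1:]

    shots = []
    for row in csv.DictReader(lines):
        shots.append({
            "type": "shot",  # assumption: each CSV row is one shot
            "start": str(float(row["Start Time (seconds)"])),
            "end": str(float(row["End Time (seconds)"])),
        })

    # Assumption: the last row ends where the media ends.
    duration = shots[-1]["end"] if shots else "0.0"
    return {"media": {"filename": filename, "duration": duration}, "shots": shots}


# First two scenes from the example output, in comma-separated form.
SAMPLE_CSV = """Timecode List:,0:00:11,0:00:19
Scene Number,Start Frame,Start Timecode,Start Time (seconds),End Frame,End Timecode,End Time (seconds),Length (frames),Length (timecode),Length (seconds)
1,0,0:00:00,0,332,0:00:11,11.067,332,0:00:11,11.067
2,332,0:00:11,11.067,577,0:00:19,19.233,245,0:00:08,8.167
"""

result = csv_to_shot_json(SAMPLE_CSV, "video.mp4")
print(result["shots"][0])  # prints {'type': 'shot', 'start': '0.0', 'end': '11.067'}
```

Passing the values through `float` before `str` normalizes whole-number values such as `0` to `0.0`, matching the string style of the sample output.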
Official documentation: Google Cloud Video Intelligence Documentation
Attachments: segmentation-workflow.png (image/png), shot detection wf diagram.png (image/png)
Document generated by Confluence on Feb 25, 2025 10:39