Usage instructions for OMAF Creator Viewport dependent mode - nokiatech/omaf GitHub Wiki

Usage instructions

Creating OMAF Viewport Dependent DASH streams

In all cases, a configuration json file is needed. A sample can be found in the repository.

Encoding tiled HEVC bitstreams

A prerequisite for all the Viewport Dependent cases is that you have pre-encoded HEVC bitstreams with tiles enabled.

There are two known publicly available HEVC/H.265 encoders that can create tiled bitstreams: HM and Kvazaar. Instructions for using Kvazaar to generate video streams compatible with the OMAF HEVC Viewport Dependent profile are provided here. Kvazaar requires some patches on top of its latest release, so taking the master source code dated 10-Oct-2018 or later and building it yourself is recommended. The relevant command line options are:
  • -i: sufficient alone if the input yuv has the resolution in its name, e.g. video_3840x1920.yuv; otherwise --input-res is needed too
  • -q or --bitrate
  • --gop 8 --no-open-gop --bipred: enables an 8-frame B-frame pyramid with IDR frames
  • -p: GOP length; must be N*8 if --gop 8 is given
  • --mv-constraint frametilemargin
  • --level
  • --tiles WxH for uniform tiles, or --tiles-width-split / --tiles-height-split with parameters, e.g. uW for horizontally uniform tile columns, and H1,H2 for three non-uniform tile rows (the first row starts at 0)
  • --set-qp-in-cu
  • --slices tiles
  • --preset
  • -o
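Putting these options together, a kvazaar invocation for a 3840x1920 ERP source might look like the following sketch. The bitrate, level, tile grid, GOP length, and preset values are illustrative only; tune them for your content.

```shell
# Illustrative kvazaar command line: 6x4 uniform tiles, 8-frame closed-GOP
# B-pyramid, IDR every 64 frames. All numeric values are example choices.
kvazaar -i video_3840x1920.yuv \
        --bitrate 8000000 \
        --gop 8 --no-open-gop --bipred \
        -p 64 \
        --mv-constraint frametilemargin \
        --level 5.1 \
        --tiles 6x4 \
        --set-qp-in-cu \
        --slices tiles \
        --preset veryslow \
        -o video_3840x1920.265
```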

Kvazaar outputs an HEVC/H.265 bytestream. An ffmpeg build with Kvazaar support exists too, but it is based on an older Kvazaar library. The easiest way to get H.265 streams into mp4, which is the required input format for omafvd, is to use mp4box: simply call mp4box -add video.265 -new video.mp4. As mp4box assumes 25 fps by default, the -fps parameter may be needed.
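For example, for a 30 fps source (the frame rate and filenames here are illustrative):

```shell
# Wrap the raw HEVC bytestream into the mp4 container omafvd expects;
# -fps overrides mp4box's default assumption of 25 fps
mp4box -fps 30 -add video_3840x1920.265 -new video_3840x1920.mp4
```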

Equal resolution subpicture bitstreams on multiple qualities (OMAF Annex D.4.2)

The video.common section in the json should look like this:
"common" : {
"projection" : "equirectangular",
"output_mode" : "MultiQ"
},
output_mode is an enumeration indicating the mode in which the streams are generated.
Then you need to give two (or more) input mp4 files that contain HEVC tiled bitstreams, encoded with the same tiling scheme and other parameters but with different quality settings (target bitrate or QP).
The input files need to be specified in the video section. They are distinguished by a label, which can be anything except common. The label will be visible in the segment file names. For example,
"fg": {
"filename": "video_fg.mp4",
"quality": 1 // 1...255 where 1 is the best; if not given, defaults to 1
},
"bg": {
"filename": "video_bg.mp4",
"quality": 20 // 1...255 where 1 is the best; if not given, defaults to 1
}
Then the output is defined with either a "dash" or an "mp4" section.
"dash": {
"output_name_base" : "videoMultiQ", // Basename for the DASH output files
"media": {
"segment_name": {
// $Name$ expands to the value of "output_name_base" or if it is not given, to "output_mode"
// $Segment$ expands to MPD's "$Number$" or "init" depending on the segment type. Using MPD's $Number$ directly here is not allowed.
"video": "$Name$.video.$Segment$.mp4",
"audio": "$Name$.audio.$Segment$.mp4",
"extractor": "$Name$.extractor.$Segment$.mp4"
}
}
}
// "mp4" is alternative to "dash". Creates a single mp4 with all tracks (one per tile) and the extractor track, or just a single track if selected so below
"mp4": {
"filename": "videoMultiQ.mp4"
}
The default values for segment_name should be fine for most cases, but make sure the "output_name_base" or "filename" is what you want.
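Putting the fragments above together, a complete MultiQ configuration could look like the following sketch. The exact top-level layout should be checked against the sample json in the repository; the filenames are illustrative.

```json
{
    "video": {
        "common": {
            "projection": "equirectangular",
            "output_mode": "MultiQ"
        },
        "fg": { "filename": "video_fg.mp4", "quality": 1 },
        "bg": { "filename": "video_bg.mp4", "quality": 20 }
    },
    "dash": {
        "output_name_base": "videoMultiQ",
        "media": {
            "segment_name": {
                "video": "$Name$.video.$Segment$.mp4",
                "audio": "$Name$.audio.$Segment$.mp4",
                "extractor": "$Name$.extractor.$Segment$.mp4"
            }
        }
    }
}
```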

Sub-picture bitstreams with several resolutions for achieving 5K effective ERP resolution (OMAF Annex D.6.2/example 2)

The video.common section in the json should look like this:
"common" : {
"projection" : "equirectangular",
"output_mode" : "5K"
},
output_mode is an enumeration indicating the mode in which the streams are generated.
Then you need to give two input mp4 files that contain HEVC tiled bitstreams, encoded with the same tiling scheme and other parameters but with different resolutions. The higher-quality stream needs to have a 5120x2560 resolution, and the other one 2560x1280. The input files need to be specified in the video section, in the same way as in the equal resolution case.
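The video section then follows the same pattern as in the MultiQ case, e.g. (labels and filenames are illustrative):

```json
"fg": {
    "filename": "video_5120x2560.mp4",
    "quality": 1 // 1...255 where 1 is the best; if not given, defaults to 1
},
"bg": {
    "filename": "video_2560x1280.mp4",
    "quality": 20 // 1...255 where 1 is the best; if not given, defaults to 1
}
```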

Sub-picture bitstreams with several resolutions for achieving 6K effective ERP resolution (OMAF Annex D.6.3)

The video.common section in the json should look like this:
"common" : {
"projection" : "equirectangular",
"output_mode" : "6K"
},
Then you need to give four input mp4 files that contain HEVC tiled bitstreams, encoded with two different tiling schemes and different resolutions, as explained in the OMAF spec Annex D.6.3. However, cropping should be left for the OMAF processing phase. So the following input files are needed:

  • 6144x3072, encoded with non-uniform 8x3 tiles (as illustrated in Figure D.8)
  • 3072x1536, encoded with non-uniform 8x3 tiles (as illustrated in Figure D.8)
  • 3072x1536, encoded with non-uniform 4x3 tiles (as illustrated in Figure D.8)
  • 1536x768, encoded with non-uniform 4x3 tiles (as illustrated in Figure D.8)

The input files are then specified in the following way (labels are only illustrative):
"fg": {
"filename": "video6k_fg.mp4",
"quality": 1 // 1...255 where 1 is the best; if not given, defaults to 1
},
"bg": {
"filename": "video3k_bg.mp4",
"quality": 20 // 1...255 where 1 is the best; if not given, defaults to 1
},
"fg_polar": {
"filename": "video3k_fg_polar.mp4",
"quality": 10 // 1...255 where 1 is the best; if not given, defaults to 1
},
"bg_polar": {
"filename": "video1_5k_bg_polar.mp4",
"quality": 100 // 1...255 where 1 is the best; if not given, defaults to 1
}

The cropping explained in Annex D.6.3 is preconfigured in omafvd and applied to the videos automatically. The video resolutions and tile counts are used as keys to identify the streams; the labels are used only for output filenames.
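Using the --tiles-width-split / --tiles-height-split syntax described in the encoding section above, a non-uniform grid for the 6144x3072 input could be expressed along these lines. The row split positions below are purely illustrative placeholders; the actual grid must follow Figure D.8 of the OMAF specification.

```shell
# Sketch: 8 uniform tile columns (u8) and 3 non-uniform tile rows for a
# 6144x3072 input. The split heights 512 and 2560 are illustrative values,
# not taken from the spec; other encoding options as in the earlier example.
kvazaar -i video_6144x3072.yuv \
        --tiles-width-split u8 \
        --tiles-height-split 512,2560 \
        --mv-constraint frametilemargin --set-qp-in-cu --slices tiles \
        -o video_6144x3072.265
```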