Replicating results from paper - IDLabMedia/open-dibr GitHub Wiki

A paper on this software tool is currently being written/evaluated. The paper summarizes the results of multiple experiments that measure computational performance and visual quality. This page explains how to reproduce those results. For the paper, the results were generated on a Windows desktop with an Intel i9-9900K 8-core CPU and an NVIDIA RTX 2080 Ti GPU.

The paper contains two tests:

  1. Qualitative comparison to TMIV and NeRF
  2. Computational performance measurements

Qualitative comparison to TMIV and NeRF

For six datasets (Fan, Frog, Kitchen, Painter, Barbershop_c and Zen_Garden_c), OpenDIBR, NeRF and TMIV were tested in order to compare their visual quality. In practice, the cameras of each dataset were divided into two groups: the 'input cameras' that are used as input for each system, and the 'output cameras' for which each system generated an image. This way, ground truth images were available to calculate objective quality metrics, e.g. PSNR.

The image below shows which cameras were used as input (blue) or output (orange). As many cameras as possible were assigned as inputs, since the quality of NeRF depends heavily on the inputs provided.

PaperCamSetups.png

Download and instructions

Important: the OpenDIBR implementation has changed slightly to support not just the OMAF axial system, but also those of COLMAP and OpenGL. So if you download a .json file from the links on this page, be sure to set "Axial_system": "OMAF" in that file.
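For example, the field can be added as a top-level key of the downloaded .json file. The snippet below is only a sketch of where the key goes; all other contents of the file stay unchanged:

```json
{
    "Axial_system": "OMAF"
}
```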

The light field datasets, JSON files, command line files and instructions can be downloaded from here, under the folder visual_quality. Additionally, the images generated by OpenDIBR, NeRF and TMIV that were used to create Tables 2 and 3 in the paper can be downloaded there.

To run NeRF (Instant Neural Graphics Primitives): follow their build instructions to create an executable and check if it runs as expected. This NeRF implementation does not accept videos, only images. For our paper, we extracted one frame of each input video as a PNG file and fed these images to NeRF.

To run TMIV: follow their build instructions to create an executable and check if it runs as expected. This implementation accepts raw YCbCr files instead of compressed videos.

Computational performance measurements

OpenDIBR was run on seven different datasets (Classroom, Fan, Kitchen, Frog, Painter, Zen_Garden_s and Barbershop_s) to create Tables 4 and 5 in the paper. Each time, the number of input views was 1, 2, 4 or 8.

Download and instructions

The light field datasets, JSON files, command line files and instructions can be downloaded from here under folder computational_performance. See the note above about adding "Axial_system": "OMAF" to the .json files.

Difficulties measuring the framerate of OpenDIBR

OpenDIBR measures the time each frame takes to render. When adding --fps_csv your_file.csv to the command line, these times will be written to the provided CSV file in milliseconds. For example:

RealtimeDIBR -i ../../examples/Fan -j ../../examples/Fan/example_opengl.json --fps_csv milliseconds_per_frame.csv
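The per-frame times in the CSV can then be summarized with a short script. This is only a sketch: the helper name and the example values are hypothetical, and it assumes the CSV holds one frame time in milliseconds per line, read into a list beforehand.

```python
import statistics

def summarize_frame_times(times_ms):
    """Return (average frame time in ms, average framerate in fps)
    for a list of per-frame render times in milliseconds."""
    avg_ms = statistics.mean(times_ms)
    return avg_ms, 1000.0 / avg_ms

# Hypothetical values, as they might appear in milliseconds_per_frame.csv
times = [11.2, 11.0, 11.4, 11.1]
avg_ms, fps = summarize_frame_times(times)
print(f"average frame time: {avg_ms:.2f} ms, framerate: {fps:.1f} fps")
```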

Determining the computational performance of an application like this is difficult. Please read the rest of this page with care.

Consider the following four cases.

Case 1. Image (PNG) dataset or --static is used, without --vr

This is the simplest case. Each frame is rendered as fast as possible; there are no synchronization limits.

Case 2. Image (PNG) dataset or --static is used, with --vr

SteamVR imposes Vsync at a certain refresh rate, which is often fixed, e.g. 90Hz for the HTC Vive (Pro). This means that even if frames could in theory be rendered faster than 90Hz, OpenVR will run a waiting loop to limit the frame rate to 90Hz. If the theoretical framerate is lower than 90Hz, this waiting loop limits the actual frame rate to an integer fraction of 90Hz: 45Hz, or in really slow cases 30Hz, 22.5Hz, etc.

Note that if SteamVR sets the Vsync to a refresh rate of 90Hz, any framerate lower than 90Hz basically means that the application should not be used at all, because the head position will already be out of date by the time the frame is shown on the display.
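The stepwise drop to 45Hz, 30Hz, 22.5Hz can be expressed as follows: under Vsync, each frame occupies a whole number of refresh intervals, so the displayed rate is the refresh rate divided by that number. A small sketch (the function name is ours; 90Hz is the HTC Vive example from above):

```python
import math

def vsync_limited_fps(theoretical_fps, refresh_hz=90.0):
    """Frame rate after Vsync: each frame occupies a whole number of
    refresh intervals, so the displayed rate is refresh_hz / n."""
    if theoretical_fps >= refresh_hz:
        return refresh_hz
    n = math.ceil(refresh_hz / theoretical_fps)
    return refresh_hz / n

for fps in [120, 80, 40, 25]:
    print(fps, "->", vsync_limited_fps(fps))
```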

Case 3. Video (MP4) dataset without --static and without --vr

The videos have a certain framerate, which is assumed to be 30 frames per second (fps). This is separate from the framerate of OpenDIBR itself, which can generate and display frames much faster. However, every 33.333 milliseconds, OpenDIBR does need to switch to the next frames of the input videos for rendering the next few output frames.

This means that OpenDIBR should run at a multiple of 30 fps, since each output frame should take roughly the same amount of time. The frame durations reported in the CSV will therefore also be close to an integer divisor of 33.333 ms (e.g. 16.67 ms at 60 fps, 11.11 ms at 90 fps).
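Under this constraint, the achievable render rates are multiples of the 30 fps video rate, and the corresponding frame durations divide the 33.333 ms interval between input video frames. A sketch of the values one would expect in the CSV (names are ours, not from OpenDIBR):

```python
VIDEO_FPS = 30.0
FRAME_INTERVAL_MS = 1000.0 / VIDEO_FPS  # 33.333 ms between input video frames

def expected_frame_duration_ms(multiple):
    """Frame duration when OpenDIBR renders `multiple` output frames
    per input video frame, i.e. runs at multiple * 30 fps."""
    return FRAME_INTERVAL_MS / multiple

for m in [1, 2, 3]:
    print(f"{m * VIDEO_FPS:.0f} fps -> {expected_frame_duration_ms(m):.2f} ms per frame")
```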

Case 4. Video (MP4) dataset without --static and with --vr

This case combines the previous two situations. In short, OpenDIBR either manages to sustain the 90Hz frame rate imposed by SteamVR, or it does not.