Lider Depth Estimation Reference Software - LambLabs/Lider-DERS GitHub Wiki

Over the years, MPEG has developed the Depth Estimation Reference Software (DERS), which has been continually improved in order to provide state-of-the-art depth estimation results.

Currently, DERS can estimate depth maps from arbitrarily arranged input views. Depending on the configuration settings, 2 or 3 input views can be used (e.g. left, center, right). The depth is estimated with a graph cuts algorithm which finds optimal correspondences between the views on a pixel-by-pixel basis. Therefore, the resolution of the generated depth map is the same as the resolution of the input images. The depth maps are written to a 4:2:0 YUV or 4:0:0 YUV (no chrominance) file in which the Y component contains the normalized disparity of the center view. Additionally, DERS outputs two scaling values, zNear and zFar, which were used to normalize the disparity. More details can be found in the software manual [m34302].
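
As an illustration of the output format described above, the following minimal sketch computes the size of one raw frame and maps a stored Y sample back to a distance. It assumes the convention commonly used for MPEG depth maps, in which the largest sample value corresponds to zNear and 0 corresponds to zFar; the function names are hypothetical and are not part of DERS, so check the software manual for the exact mapping.

```python
def frame_size_bytes(width, height, chroma="420", bytes_per_sample=1):
    """Size in bytes of one raw planar YUV frame (hypothetical helper)."""
    luma = width * height
    if chroma == "420":
        # Two chroma planes, each subsampled by 2 in both directions
        # (rounding up for odd dimensions is an assumption).
        chroma_samples = 2 * ((width + 1) // 2) * ((height + 1) // 2)
    elif chroma == "400":
        chroma_samples = 0  # luma only, no chrominance
    else:
        raise ValueError("unsupported chroma format: " + chroma)
    return (luma + chroma_samples) * bytes_per_sample


def depth_sample_to_z(v, z_near, z_far, bit_depth=8):
    """Map a normalized-disparity sample v to a distance Z.

    Assumed convention: v == max value means Z == z_near, v == 0 means Z == z_far,
    and v is linear in 1/Z between those two limits.
    """
    v_max = (1 << bit_depth) - 1
    inv_z = (v / v_max) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return 1.0 / inv_z


if __name__ == "__main__":
    print(frame_size_bytes(1920, 1080, "420"))
    print(depth_sample_to_z(128, z_near=2.0, z_far=10.0))
```

Because the samples are linear in 1/Z rather than in Z, the depth resolution is finer close to the camera than far from it.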

List of the implemented tools:

  • Number of input views for depth estimation
    • 2 – stereo case [m31518]
    • 3 – three view case
    • N – any number of views [m46126]
  • Number of depth maps estimated simultaneously
    • 1 – one depth map
    • M – any number of depth maps not exceeding the number of input views (M <= N) [m46126]
  • Search Direction:
    • Horizontal disparity only search – requires a rectified set of input views
    • Homography based search
    • Epipolar line search – supports arbitrary view arrangement [m31518]
  • Search depth range specification
    • Disparity (pixel) based – with the use of minimal and maximal disparity value [m15377]
    • Z-distance based [m32249] – with the use of Z-near and Z-far values
  • Search precision [m15836]:
    • Pixel (Pel) – disparity is estimated with a precision of one pixel, i.e. the distance between neighboring pixels
    • Half pixel (Hpel) – disparity is estimated with a precision of half the distance between neighboring pixels
    • Quarter pixel (Qpel) – disparity is estimated with a precision of a quarter of the distance between neighboring pixels
  • Vertical up-sampling [m31518]
  • Pixel/Block similarity metrics
    • Pixel luminance matching
    • 3x3 fast block matching [m15837], [m16390], [m16092]
    • Soft segmentation block matching [m17049], [m16923]
  • Supported depth bit depths
    • 8 bits per depth sample
    • 16 bits per depth sample [m31518]
  • Segmentation enhancement [m16092], [m16390]
    • Mean-shift algorithm
    • Pyramid segmentation
    • K-means clustering
  • Time-consistency enhancement [m16070], [m16048], [m15594]
  • Semi-automatic depth estimation [m16923], [m16605], [m16411], [m16391] – manual hints for improved depth estimation
    • Edge map – marks edges in the input views where the generated depth does not need to be continuous
    • Manual disparity map – allows manual specification of the output in designated regions
    • Static area map – marks regions which do not change in time, so that depth need not be estimated for every frame.
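
To make the search range and search precision options in the list above more concrete, here is a small sketch of how the set of disparity candidates grows with precision, and how a Z-near/Z-far range relates to a disparity range. It assumes a rectified, horizontal-only setup where disparity = focal length × baseline / Z; the names `disparity_candidates`, `disparity_range_from_z`, `focal_px` and `baseline` are hypothetical and do not come from DERS.

```python
from fractions import Fraction

# Sub-pixel steps per pixel for each precision mode named in the list.
STEPS_PER_PIXEL = {"pel": 1, "hpel": 2, "qpel": 4}


def disparity_candidates(min_disp, max_disp, precision="pel"):
    """List every disparity tested between two integer bounds (inclusive)."""
    n = STEPS_PER_PIXEL[precision]
    # Exact fractions avoid floating-point drift when stepping by 1/2 or 1/4.
    return [Fraction(k, n) for k in range(min_disp * n, max_disp * n + 1)]


def disparity_range_from_z(focal_px, baseline, z_near, z_far):
    """Convert a Z range to (min, max) disparity in pixels.

    Assumes rectified views; the farthest plane gives the smallest disparity.
    """
    return focal_px * baseline / z_far, focal_px * baseline / z_near


if __name__ == "__main__":
    for mode in ("pel", "hpel", "qpel"):
        print(mode, len(disparity_candidates(0, 4, mode)))
    print(disparity_range_from_z(1000.0, 0.1, z_near=2.0, z_far=10.0))
```

For the same disparity range, half-pixel and quarter-pixel precision roughly double and quadruple the number of candidates, and with it the number of labels the optimization has to consider.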