Depth Estimation with Stereo Pair
Conceptual Overview
This stub is about retrieving depth information from a pair of images taken from different points of view. In an aligned stereogram the only difference between the two images is the displacement caused by parallax, which is inversely proportional to distance. In general, however, two photographs may also be offset in other ways that require correction.
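The inverse relation between parallax displacement and distance can be sketched as follows. This is a minimal illustration that assumes an idealised pair of parallel, rectified pinhole cameras; the function name, the focal length in pixels and the baseline are hypothetical values chosen for the example and are not part of G'MIC.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline):
    """Depth of a point seen by two parallel, rectified pinhole cameras.

    disparity_px    : horizontal shift of the point between the two views, in pixels
    focal_length_px : focal length expressed in pixels (assumed equal for both cameras)
    baseline        : distance between the camera centres; depth is returned in the same unit
    """
    if disparity_px <= 0:
        # No measurable parallax: treat the point as infinitely far away.
        return float("inf")
    return focal_length_px * baseline / disparity_px


# Hypothetical setup: 800 px focal length, 0.065 m baseline.
for disparity in (40.0, 20.0, 10.0):
    print(disparity, depth_from_disparity(disparity, 800.0, 0.065))
```

Halving the disparity doubles the estimated depth, which is why distant objects, with very small displacements, are the hardest to place accurately.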
This task is related to the "optical-flow" techniques used for in-betweening frames of a motion sequence. Here we assume that only the camera, and not the subject, has moved.
The inverse is also possible: to create a clean stereo pair from a single image, you would apply warps inversely proportional to a depth map. The pair could then be processed to produce a single red-green anaglyph, for instance; see Tom Keil's filters.
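A rough sketch of that inverse operation, for illustration only: each pixel is shifted horizontally by an amount inversely proportional to its depth, in opposite directions for the two views. The helper names, the `strength` parameter and the sign convention are assumptions made for this example; they do not describe how any existing G'MIC filter works.

```python
def shift_row(row, depth_row, strength, fill=0):
    """Forward-warp one scanline: each pixel moves by round(strength / depth) pixels."""
    out = [fill] * len(row)
    # Paint far pixels first so that nearer pixels overwrite them where they overlap.
    order = sorted(range(len(row)), key=lambda x: -depth_row[x])
    for x in order:
        target = x + int(round(strength / depth_row[x]))
        if 0 <= target < len(row):
            out[target] = row[x]
    return out


def stereo_pair(image, depth, strength):
    """Build two views from one image (list of rows) and a depth map of the same shape."""
    left = [shift_row(r, d, +strength / 2.0) for r, d in zip(image, depth)]
    right = [shift_row(r, d, -strength / 2.0) for r, d in zip(image, depth)]
    return left, right


# One scanline with a near object (depth 1) in front of a far background (depth 10).
image = [[1, 2, 3, 4, 5, 6]]
depth = [[10, 10, 1, 1, 10, 10]]
left, right = stereo_pair(image, depth, strength=2.0)
print(left)
print(right)
```

Pixels uncovered by the moving foreground are left at the `fill` value here; a practical filter would need to inpaint those holes.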
Human vision combines many cues to place objects in 3D space:
- parallax between the eyes
- knowledge of size / relative scale
- contrast / colour in far distance
- perspective and position on a plane
- surface angle implied by global lighting
- focus / depth of field
The discussion referenced here focused on mid-field distance estimation using parallax.
Related G'MIC Commands
-displacement[dest_image] [source_image], smoothness 0.1, precision 5, scales auto, max iteration 10000, is_backward true
Optimises a 2D warp field U to minimise an "energy" E(U) that measures the mismatch between one input image and a warped version of the other.
$$E(U) = \int_\Omega \left[ \left( I_1(X) - I_2(X + U) \right)^2 + \alpha \, |\nabla U|^2 \right] dX$$
Related: the Horn & Schunck optical-flow method.
The formula has two terms: the first is the total 'fitting error' between pixel values, and the second is a 'smoothness constraint' on the warp, weighted by α.
The algorithm solves the PDE derived from E(U) via the Euler-Lagrange equations. It refines the warp incrementally, starting from a low-resolution, reduced-size version of the images and working up to full scale.
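To make the two terms of E(U) concrete, here is a small 1D discretisation that only evaluates the energy for a given candidate warp. It is a sketch under simplifying assumptions (1D signals, linear interpolation, clamped borders, finite differences for the gradient); it is not the solver used by G'MIC, and the function names are hypothetical.

```python
def sample(signal, x):
    """Linearly interpolate a 1D signal at a fractional position, clamping at the borders."""
    x = min(max(x, 0.0), len(signal) - 1.0)
    i = int(x)
    j = min(i + 1, len(signal) - 1)
    t = x - i
    return (1.0 - t) * signal[i] + t * signal[j]


def energy(i1, i2, u, alpha):
    """Discrete 1D analogue of E(U): fitting error plus alpha times the smoothness penalty."""
    fit = sum((i1[x] - sample(i2, x + u[x])) ** 2 for x in range(len(i1)))
    smooth = sum((u[x + 1] - u[x]) ** 2 for x in range(len(u) - 1))
    return fit + alpha * smooth


# Toy signals: i2 is i1 moved two samples to the right, so i1[x] == i2[x + 2].
i1 = [0, 0, 1, 4, 9, 4, 1, 0, 0, 0, 0, 0]
i2 = [0, 0, 0, 0, 1, 4, 9, 4, 1, 0, 0, 0]
n = len(i1)

print(energy(i1, i2, [0.0] * n, alpha=0.1))  # no warp: large fitting error
print(energy(i1, i2, [2.0] * n, alpha=0.1))  # the true constant warp: both terms vanish
```

A larger `alpha` favours smooth warps over exact intensity matches, which is the trade-off the smoothness parameter of the command controls.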
advantages | disadvantages |
---|---|
smooth estimated displacement using texture | low sensitivity to cues from shading |
anisotropic smoothness | point-based, not line-based |
sub-pixel resolution | confused by specular highlights |
works well with smaller displacements | requires the two images to have similar intensities |
 | doesn't allow for discontinuities at edges |
-phase_correlation[dest_image,source_image]
Estimates a single translation vector (x, y) for the whole image by detecting the dominant frequency and direction of the phase difference between the Fourier transforms of the two images.
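The general technique can be illustrated with a small pure-Python sketch. It uses a naive discrete Fourier transform, so it is only practical for tiny images, and it recovers integer circular shifts only. It illustrates the principle and is not G'MIC's implementation; all function names are hypothetical.

```python
import cmath


def dft2(img, inverse=False):
    """Naive 2D discrete Fourier transform of a list-of-rows image (tiny images only)."""
    h, w = len(img), len(img[0])
    sign = 1 if inverse else -1

    def dft1(seq):
        n = len(seq)
        return [sum(seq[k] * cmath.exp(sign * 2j * cmath.pi * f * k / n) for k in range(n))
                for f in range(n)]

    rows = [dft1(r) for r in img]                                    # transform along x
    cols = [dft1([rows[y][x] for y in range(h)]) for x in range(w)]  # then along y
    scale = 1.0 / (w * h) if inverse else 1.0
    return [[cols[x][y] * scale for x in range(w)] for y in range(h)]


def phase_correlation(a, b):
    """Return the integer circular shift (dx, dy) that maps image a onto image b."""
    fa, fb = dft2(a), dft2(b)
    h, w = len(a), len(a[0])
    cross = [[0j] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            p = fb[y][x] * fa[y][x].conjugate()
            mag = abs(p)
            cross[y][x] = p / mag if mag > 1e-12 else 0j  # keep the phase, drop the magnitude
    corr = dft2(cross, inverse=True)
    # The correlation surface peaks at the translation between the two images.
    _, dx, dy = max((corr[y][x].real, x, y) for y in range(h) for x in range(w))
    # Peak positions past the halfway point correspond to negative shifts.
    if dx > w // 2:
        dx -= w
    if dy > h // 2:
        dy -= h
    return dx, dy


def circular_shift(img, dx, dy):
    """Shift an image by (dx, dy) with wrap-around, used here to build test data."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)] for y in range(h)]


a = [[0] * 8 for _ in range(8)]
a[2][1], a[2][2], a[3][1] = 5, 3, 1
b = circular_shift(a, 3, -2)
print(phase_correlation(a, b))
```

Because the magnitude of the cross-spectrum is discarded, a uniform change in brightness or contrast between the two images does not move the peak. Sub-pixel accuracy would require interpolating around the peak, which this sketch omits.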
advantages | disadvantages |
---|---|
sensitive to edges | one vector for whole image |
sub-pixel resolution | need to break images into discrete patches |
robust to intensity differences | assumes no rotational or trapezoidal distortion |