Stereo vision background - julian-steiner/Waveshare-Stereo-Camera GitHub Wiki

Stereo Vision

Stereo vision in nature

Many animals whose two eyes have overlapping fields of view use stereo vision to estimate depth, and humans are a good example. Our brain computes depth information from nothing more than the two overlapping images delivered by our eyes. The principle is the same as in this code and in other camera-based implementations of stereo vision.

Stereo vision in computer science

The same principle is used in computer vision. The two images first have to be rectified so that they are aligned horizontally. Then we can compute the disparity, from which the depth at each pixel can be estimated.
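To make the disparity step more concrete, here is a minimal sketch of block matching along one image row. It is only an illustration and not necessarily the method used in this repository: the function name `match_along_row` and the parameters `half_window` and `max_disparity` are made up for this example. It assumes the two rows come from rectified images, which is why the search only has to run horizontally.

```python
def match_along_row(left_row, right_row, x, half_window=2, max_disparity=16):
    """Return the disparity d (in pixels) that minimises the sum of absolute
    differences between a window around left_row[x] and a window around
    right_row[x - d]. Both rows must come from rectified images."""
    if x - half_window < 0 or x + half_window >= len(left_row):
        raise ValueError("window around x does not fit inside the row")

    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        xr = x - d  # candidate position of the same point in the right row
        if xr - half_window < 0:
            break  # the window would leave the right image
        cost = sum(
            abs(left_row[x + k] - right_row[xr + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Synthetic example: the bright spot sits 3 pixels further left in the right row.
left_row = [0, 0, 0, 0, 0, 0, 10, 50, 90, 50, 10, 0, 0, 0, 0, 0]
right_row = left_row[3:] + [0, 0, 0]
print(match_along_row(left_row, right_row, x=8))  # prints 3
```

Real implementations repeat this for every pixel and add smoothing and consistency checks, but the core idea of sliding a small window along the same row stays the same.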

Image rectification

In order to compute the depth you have to align the two images horizontally. This is done in a process called image rectification.

Rectification Image from Wikipedia

Both images are rotated and warped so that corresponding points end up on the same horizontal line. This process depends heavily on the camera calibration, because the images should have as little distortion as possible. There are three main algorithms, which I won't go into in detail because they are very complicated: planar rectification, cylindrical rectification (no paper available for free) and polar rectification.

Triangulation and displacement

To compute depth, the displacement (disparity) of an object between the two rectified images is used. How the displacement and the distance relate is shown in this image.

Triangulation Image
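The relation can also be written down with similar triangles. This is a sketch for an idealised, rectified pair of pinhole cameras. The symbols are introduced here for illustration and do not come from the original text: $f$ is the focal length in pixels, $B$ the baseline (distance between the two cameras), $X$ and $Z$ the horizontal position and depth of the point in the left camera's frame, and $x_l$, $x_r$ the horizontal image coordinates of the point in the left and right image.

```latex
x_l = \frac{f\,X}{Z}, \qquad
x_r = \frac{f\,(X - B)}{Z}
\quad\Longrightarrow\quad
d = x_l - x_r = \frac{f\,B}{Z}
\quad\Longrightarrow\quad
Z = \frac{f\,B}{d}
```

So the depth $Z$ is inversely proportional to the displacement $d$: halving the displacement doubles the estimated distance.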

The light travels from the object through both cameras and is registered on the image sensors. Because the rays arrive at different angles, the object's position on each sensor is different. The farther away an object is from the cameras, the smaller the displacement, because the rays become almost parallel. The closer an object is to the cameras, the larger the displacement. The resulting depth map then has to be processed and filtered. Visualisation is also a crucial step, because the image would be far too dark if you displayed the raw data directly.
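The two ideas above, that a smaller displacement means a larger distance and that raw values are too dark to look at, can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the code from this repository: `disparity_to_depth` and `normalize_for_display` are hypothetical helper names, the depth formula assumes an ideal rectified pinhole pair, and the focal length and baseline in the usage lines are made-up numbers.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth in metres for one pixel, using depth = focal * baseline / disparity.
    Returns None where no valid (positive) disparity was found."""
    if disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px


def normalize_for_display(disparity_rows):
    """Stretch raw disparities linearly to the 0..255 range of an 8-bit image,
    so that small raw values do not show up as an almost black picture."""
    values = [v for row in disparity_rows for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return [[0 for _ in row] for row in disparity_rows]
    scale = 255.0 / (hi - lo)
    return [[round((v - lo) * scale) for v in row] for row in disparity_rows]


# Made-up camera: 800 px focal length, 6 cm baseline.
near = disparity_to_depth(40, focal_px=800, baseline_m=0.06)  # about 1.2 m
far = disparity_to_depth(10, focal_px=800, baseline_m=0.06)   # about 4.8 m
print(near, far)

# Raw disparities of 2..10 would look nearly black as 8-bit grey values.
print(normalize_for_display([[2, 4], [6, 10]]))
```

The smaller disparity gives the larger depth, which matches the almost parallel rays described above. The linear stretch is only one simple choice for visualisation; a colour map or histogram equalisation would work as well.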

Sources

Wikipedia : Stereopsis

Wikipedia : Image rectification

Wikipedia : Triangulation