Computer Vision - RoBorregos/robocup-home GitHub Wiki

Computer Vision

The computer vision module, excluding object detection, relies mainly on the Intel RealSense camera and on algorithms applied to its depth, point cloud, and color data.

Camera RealSense

The camera used is the D435i. Its results vary a lot but can be very good; in general, when it works, the accuracy is on the order of millimeters.

Features

It offers several resolutions and fps configurations.
You can individually access the depth, color, infra 1 and infra 2 frames.
It already provides several useful filters that improve the depth frames.
Sometimes it gives horrible depth results, even for a scene where it previously gave good ones.
It supports aligning color and infra frames to/from depth frames.
The library includes several utilities, such as deprojecting pixels to real-world measurements, generating point clouds, etc.
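
To make the deprojection utility mentioned above more concrete, here is a minimal sketch of the underlying pinhole math in plain Python. It assumes no lens distortion, and the intrinsics values are hypothetical; the field names `fx`, `fy`, `ppx`, `ppy` mirror the library's `rs2_intrinsics` structure, but this is not the library's own implementation (the library exposes this as `rs2_deproject_pixel_to_point`, which also handles distortion models).

```python
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Pinhole parameters in pixels (names mirror the rs2_intrinsics fields)."""
    fx: float   # focal length along x
    fy: float   # focal length along y
    ppx: float  # principal point, x coordinate
    ppy: float  # principal point, y coordinate


def deproject_pixel(intr, u, v, depth_m):
    """Map pixel (u, v) with a depth in meters to a 3D point (x, y, z) in meters.

    Lens distortion is ignored in this sketch.
    """
    x = (u - intr.ppx) / intr.fx  # normalized image coordinate along x
    y = (v - intr.ppy) / intr.fy  # normalized image coordinate along y
    return (x * depth_m, y * depth_m, depth_m)


# Hypothetical intrinsics for a 640x480 stream; read the real ones from the camera.
example = Intrinsics(fx=600.0, fy=600.0, ppx=320.0, ppy=240.0)
center = deproject_pixel(example, 320, 240, 1.5)  # pixel on the optical axis
```

A pixel at the principal point maps to a point straight ahead of the camera, so `center` has zero x and y and keeps the measured depth as z.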

RealSense ROS node

There is a "wrapper" for ROS that launches a node to handle the camera: it checks which camera is connected, configures it, and sets up a bunch of useful topics with the camera params, color, depth, point-cloud, aligned-to, etc. It is actually very handy, as you can set most of the configuration you would want, such as filters, resolution, fps, align-to(s), pointclouds, etc.

The restriction that holding too many frames can leave the camera without memory seems to still apply even when the frames are consumed through a topic.
Some features of the node seem to be incomplete/unimplemented... even after years.
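
As a rough illustration of configuring the node, the sketch below builds a `roslaunch` command line with every setting spelled out. It assumes the ROS 1 `realsense2_camera` package and its `rs_camera.launch` file; the argument names are taken from that launch file, but check them against the wrapper version you have installed. The helper `build_roslaunch_command` and the chosen values are hypothetical.

```python
def build_roslaunch_command(settings):
    """Return the roslaunch command as a list of arguments.

    Assumes the ROS 1 realsense2_camera package and rs_camera.launch.
    """
    cmd = ["roslaunch", "realsense2_camera", "rs_camera.launch"]
    for name, value in settings.items():
        if isinstance(value, bool):
            # roslaunch expects lowercase booleans
            value = "true" if value else "false"
        cmd.append(f"{name}:={value}")
    return cmd


# Hypothetical values; the point is to state each one instead of relying on defaults.
EXPLICIT_SETTINGS = {
    "depth_width": 640,
    "depth_height": 480,
    "depth_fps": 30,
    "color_width": 640,
    "color_height": 480,
    "color_fps": 30,
    "align_depth": True,
    "filters": "spatial,temporal,pointcloud",
}

command = build_roslaunch_command(EXPLICIT_SETTINGS)
```

Joining `command` with spaces gives a line that can be pasted into a terminal, or the list can be passed to `subprocess.run`.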

Notes and Tips

The camera seems to keep old configuration values, so try to always set all important configuration values explicitly.
The realsense node uses image_transport for publishing the images. Anyway, you can access the "raw" image at the topic root: /camera/color|depth|etc/image_raw.
The docs say that you shouldn't hold too many frames because the camera will run out of memory. The reason seems unknown, but it easily happens.
There are some depth filters that give good results, but others, like hole-filling, can end up inventing almost half of the frame. Also, check the different density modes.
The option to align color to/from depth is key and powerful, but check the results of doing it in each direction.
Using the "pipe" to access the camera gives good options for choosing the camera (by model, by serial number, etc.); it is even supposed to handle disconnections/changes automatically.
It seems that (all of?) the processing is done on the device; just be careful, as the device can get hot.
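
One way to follow the tip about not holding too many frames is to keep only copies of the data for a small, fixed number of recent frames. The sketch below is a hedged illustration in plain Python: the `RecentFrames` class is hypothetical, and it assumes that the frame data can be copied out as a bytes-like buffer. With the Python bindings, the copy would more likely be made through a NumPy array.

```python
from collections import deque


class RecentFrames:
    """Keep copies of the last few frames instead of the frame objects themselves."""

    def __init__(self, max_frames=5):
        # deque with maxlen silently drops the oldest entry when full
        self._frames = deque(maxlen=max_frames)

    def push(self, frame_data):
        # bytes() makes an independent copy, so the original buffer can be released
        self._frames.append(bytes(frame_data))

    def latest(self):
        """Return the most recent copy, or None if nothing was stored yet."""
        return self._frames[-1] if self._frames else None

    def __len__(self):
        return len(self._frames)


# Stand-in buffers play the role of camera frames here.
recent = RecentFrames(max_frames=2)
source = bytearray(b"abc")
recent.push(source)
source[0] = ord("z")  # changing the source does not affect the stored copy
```

Because the buffer is bounded, the number of stored frames never grows past `max_frames`, no matter how long the capture loop runs.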

Algorithm Overview

Coming soon...

Working on