Object detection - FontysAtWork/ESA-PROJ GitHub Wiki

Object detection

The robot needs to recognize several different objects, such as metal bars and screws. We use a camera with a depth sensor to achieve this.

Camera used

The camera used is an Intel RealSense SR300 (BlasterX Senz3D). We used this camera to test our implementation. The sensor contains an RGB camera, an infrared sensor and a laser projector.

Specifications

RGB resolution: 1920x1080, 30 FPS
IR depth resolution: 640x480, 60 FPS
Range: 0.2 m to 1.5 m
Projector type: Class 1 IR Laser Projector

Previous implementation

The previous group already had an object detection implementation using the RealSense SR300. It used the ROS ORK (Object Recognition Kitchen) tabletop packages:

realsense_camera
object_recognition_core
object_recognition_tabletop
object_recognition_reconstruction
Other supporting packages

Most of these packages are available for both ROS Indigo and ROS Kinetic, but object_recognition_tabletop is not available for ROS Kinetic or newer releases. Because this package is essential for object detection, we had to find an alternative that works on ROS Kinetic.

Alternatives

We explored several alternatives to replace ORK:

OctoMap

We first tried OctoMap. Once we had a usable tf from the robot we managed to build a world model, but we found no way to derive an object model from it, nor an API option to separate objects from the surfaces they rest on. After further research we concluded that OctoMap is not suitable for tabletop object recognition, as it is designed for area mapping and collision avoidance, so we dropped the idea.

Octomap object detection

Point Cloud Library

Another alternative is PCL (Point Cloud Library), a library for point cloud processing. It contains useful functions for working with point clouds and should be suitable for recognizing objects.

Besides PCL, we would keep using the realsense_camera ROS package. This package is responsible for reading the camera input, which it publishes over multiple topics, including the point cloud as sensor_msgs/PointCloud2 messages.

PCL Example

PCL turned out not to offer a ready-made object recognition implementation. It is technically possible to build object recognition on top of PCL, but this would take more time than we had, so we decided to look for another alternative.
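To give an impression of the work involved, the sketch below shows one typical first step of a tabletop pipeline: fitting the dominant plane with RANSAC and treating the remaining points as object candidates. PCL provides this kind of plane segmentation itself; the pure-Python version here is only an illustration and not code we ran on the robot. The function names, the 1 cm threshold and the synthetic cloud are all assumptions.

```python
import random


def fit_plane(p1, p2, p3):
    """Plane (a, b, c, d) with unit normal through three points, or None if collinear."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # The normal is the cross product of the two edge vectors.
    a = uy * vz - uz * vy
    b = uz * vx - ux * vz
    c = ux * vy - uy * vx
    norm = (a * a + b * b + c * c) ** 0.5
    if norm < 1e-9:
        return None
    a, b, c = a / norm, b / norm, c / norm
    d = -(a * p1[0] + b * p1[1] + c * p1[2])
    return a, b, c, d


def segment_plane(points, threshold=0.01, iterations=200, seed=0):
    """RANSAC: return the plane with the most inliers and the inlier indices."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [i for i, (x, y, z) in enumerate(points)
                   if abs(a * x + b * y + c * z + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


def split_table_and_objects(points, **kwargs):
    """Split a cloud into the dominant plane (table) and everything else."""
    plane, inliers = segment_plane(points, **kwargs)
    inlier_set = set(inliers)
    table = [p for i, p in enumerate(points) if i in inlier_set]
    objects = [p for i, p in enumerate(points) if i not in inlier_set]
    return plane, table, objects


# Synthetic cloud (assumed values): a flat table 1.0 m from the camera
# and a small object whose top surface is 5 cm closer.
table_pts = [(x * 0.025, y * 0.025, 1.0) for x in range(20) for y in range(20)]
object_pts = [(0.2 + i * 0.01, 0.2 + j * 0.01, 0.95)
              for i in range(5) for j in range(5)]
plane, table, objects = split_table_and_objects(table_pts + object_pts)
print(len(table), "table points,", len(objects), "object candidates")
```

The points left over after removing the plane would still need to be clustered and matched against object models, which is the part that would have cost us the most time.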

Intel RealSense SDK

Intel also provides an SDK for RealSense object recognition. We had trouble getting the drivers to work properly, and much of the documentation and many of the usage examples targeted an old version. Although we found some samples for ROS and some for object recognition, we couldn't build them because parts of the library were missing, both in the version installed through apt and when compiling the library from source.

Manual recognition using OpenCV

When we asked for advice on other ways to recognize objects, a teacher suggested using OpenCV to recognize objects visually. The coordinates found in the color image could then be laid over the depth map, producing a single-perspective model. Multiple models like this could then describe an object.
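The overlay step can be sketched as follows, assuming the depth map is already registered to the color image and a standard pinhole camera model applies. The bounding box stands in for whatever an OpenCV detector would return; the function names and the intrinsics (fx, fy, cx, cy) are hypothetical and not taken from the SR300 calibration.

```python
from statistics import median


def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with a depth in metres to a 3D camera-frame point."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def box_to_point(depth_map, box, fx, fy, cx, cy):
    """Estimate the 3D position of a detection given as (x0, y0, width, height).

    depth_map is a row-major list of rows with depth in metres; 0 means no reading.
    """
    x0, y0, w, h = box
    samples = [depth_map[v][u]
               for v in range(y0, y0 + h)
               for u in range(x0, x0 + w)
               if depth_map[v][u] > 0]
    if not samples:
        return None  # no valid depth inside the box
    z = median(samples)  # the median is robust against stray background pixels
    u_c = x0 + (w - 1) / 2.0
    v_c = y0 + (h - 1) / 2.0
    return deproject(u_c, v_c, z, fx, fy, cx, cy)


# Synthetic 640x480 depth map (assumed values): background at 1.0 m,
# a 20x20 pixel object at 0.5 m in the image centre.
depth = [[1.0] * 640 for _ in range(480)]
for v in range(230, 250):
    for u in range(310, 330):
        depth[v][u] = 0.5

point = box_to_point(depth, (310, 230, 20, 20),
                     fx=600.0, fy=600.0, cx=319.5, cy=239.5)
print(point)
```

Repeating this for several detections of the same object from different viewpoints would give the set of single-perspective models mentioned above.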

Unfortunately, we noticed a shift between the depth image and the color image, which adds an extra step to map the coordinates properly. Without it, a point in the color image would be paired with the wrong depth value. The existing implementations of this mapping that we found only work with older, already deprecated versions of the SDK.
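In general, such a mapping back-projects each depth pixel to a 3D point, moves it into the color camera frame using the extrinsics between the two sensors, and projects it into the color image. The sketch below illustrates this under a pinhole model. The intrinsics, the identity rotation and the 25 mm baseline are made-up values for illustration, not the SR300 calibration, and the function name is hypothetical.

```python
def depth_pixel_to_color_pixel(u, v, depth_m, depth_intr, color_intr,
                               rotation, translation):
    """Map a depth pixel to color image coordinates.

    depth_intr and color_intr are (fx, fy, cx, cy) tuples, rotation is a 3x3
    row-major matrix and translation a 3-vector in metres (depth to color).
    """
    fx, fy, cx, cy = depth_intr
    # 1. Back-project the depth pixel to a 3D point in the depth camera frame.
    p = ((u - cx) * depth_m / fx, (v - cy) * depth_m / fy, depth_m)
    # 2. Transform the point into the color camera frame.
    q = [sum(rotation[r][k] * p[k] for k in range(3)) + translation[r]
         for r in range(3)]
    if q[2] <= 0:
        return None  # behind the color camera
    # 3. Project the point into the color image.
    fx_c, fy_c, cx_c, cy_c = color_intr
    return (fx_c * q[0] / q[2] + cx_c, fy_c * q[1] / q[2] + cy_c)


# Assumed calibration values, for illustration only.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
BASELINE = (0.025, 0.0, 0.0)            # 25 mm sideways offset
DEPTH_INTR = (475.0, 475.0, 320.0, 240.0)
COLOR_INTR = (600.0, 600.0, 320.0, 240.0)

near = depth_pixel_to_color_pixel(320, 240, 0.5, DEPTH_INTR, COLOR_INTR,
                                  IDENTITY, BASELINE)
far = depth_pixel_to_color_pixel(320, 240, 1.0, DEPTH_INTR, COLOR_INTR,
                                 IDENTITY, BASELINE)
print(near, far)
```

With these assumed values the same depth pixel lands about 30 pixels off centre at 0.5 m and about 15 pixels off centre at 1.0 m, which shows why a single fixed offset cannot correct the shift: it depends on the measured depth.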

Still, this seems to be a viable solution if the images are mapped correctly, as it doesn't depend on external libraries that are poorly compatible with newer distributions or with other libraries.

Future recommendations

Because the fully integrated ORK solution is no longer available in ROS Kinetic, we had to find another way. We explored several options and decided that the OpenCV matching method was the best one to continue with. The other methods either weren't suitable for recognizing objects or would require implementing complex functionality. Time constraints and the lack of working pre-existing solutions prevented us from finishing this implementation in time. For further work, we recommend continuing with the OpenCV recognition and then building a recognition model that includes depth information.

Pending issues

Some issues that still persist are related to the Intel RealSense packages and Ubuntu itself:

Depending on which kernel is used, some of the required patches may or may not work.
Depending on ROS and Ubuntu updates, some (ORK-related) packages might require dependencies that are no longer available.