all_seaing_perception


Nodes

color_segmentation.py

Detects buoys by performing color segmentation on camera images and publishes the detections as bounding boxes. It is only used in simulation (where it works well); on the real vehicle we use YOLO for obstacle detection.

Subscribed topics:

  • image (raw image from camera, can/will be remapped): a sensor_msgs/Image message containing the camera image to perform color segmentation on.

Published topics:

  • /bounding_boxes: an all_seaing_interfaces/LabeledBoundingBox2DArray message containing the bounding boxes of the detected color masks.
  • image/segmented: a sensor_msgs/Image message containing the camera image with the bounding boxes overlaid. This is useful for debugging.

Parameters:

  • color_ranges_file: a string parameter, defaults to ''. The file path to the config file that contains HSV min/max values for each color.
  • color_label_mappings_file: a string parameter, defaults to ''. The file path to the config file that contains the mapping between color and buoy object label.
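
As a rough illustration of these two files, the sketch below shows what loading them might look like; the color names, HSV bounds, and label numbers are made-up placeholders, not the actual contents of the configs in all_seaing_bringup:

```python
# Illustrative sketch only: the color names, HSV bounds, and label numbers
# below are made-up placeholders, not the real contents of the bringup configs.
import yaml

color_ranges_example = """
red:   {h_min: 0,  s_min: 120, v_min: 80, h_max: 10, s_max: 255, v_max: 255}
green: {h_min: 45, s_min: 100, v_min: 60, h_max: 85, s_max: 255, v_max: 255}
"""

color_label_mappings_example = """
red: 1
green: 2
"""

# The node presumably builds dictionaries like these from the files referenced
# by the two parameters and could then threshold the HSV image per color
# (e.g. with cv2.inRange) to produce the labeled bounding boxes.
color_ranges = yaml.safe_load(color_ranges_example)
color_labels = yaml.safe_load(color_label_mappings_example)
print(color_ranges["red"], color_labels["red"])
```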

yolov8_node.py

Runs a YOLOv8 model (either the one trained on buoys or the one trained on banner shapes) on the camera image it subscribes to, and publishes the bounding boxes and labels of the detected objects.

Subscribed topics:

  • image (should be remapped to the actual image topic): a sensor_msgs/Image message containing the camera image to perform object detection on.

Published topics:

  • /bounding_boxes: an all_seaing_interfaces/LabeledBoundingBox2DArray message containing the bounding boxes of each of the objects detected, with their respective labels.
  • annotated_image (should be remapped to a unique topic if running multiple YOLO instances): a sensor_msgs/Image message containing the camera image overlaid with the detection bounding boxes, colored appropriately, and their respective labels. This is useful for debugging.

Parameters:

  • model: the name of the trained YOLOv8 model to use (currently "roboboat_2025" or "roboboat_shape_2025")
  • label_config: the name of the .yaml configuration file (in all_seaing_bringup/config/perception) to use for the color-label mappings
  • conf: the confidence threshold a detection must meet to be published
  • use_color_names: a flag that, when enabled, publishes just the respective color as each detection's label instead of the whole YOLO label
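
For reference, the core inference step of such a node follows the standard ultralytics API; the sketch below shows its general shape, where the weights file and image path are placeholders (the real node loads the model named by the model parameter, gets images from the subscribed topic, and fills ROS messages rather than printing):

```python
# Minimal sketch of the inference step, using the standard ultralytics API.
# "buoys.pt" and "frame.jpg" are placeholders; the real node loads the model
# named by the "model" parameter and gets images from the subscribed topic.
from ultralytics import YOLO
import cv2

model = YOLO("buoys.pt")
image = cv2.imread("frame.jpg")

results = model(image, conf=0.6)[0]  # conf mirrors the node's "conf" parameter
for box, cls, score in zip(results.boxes.xyxy, results.boxes.cls, results.boxes.conf):
    x1, y1, x2, y2 = (int(v) for v in box)
    label = results.names[int(cls)]
    # The node would fill a LabeledBoundingBox2DArray message here instead.
    print(label, float(score), (x1, y1, x2, y2))
```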

Color-Label Mapping YAML Files

  • buoy_label_mappings: each possible detected buoy YOLO label (from the buoy-trained YOLO model), mapped to the respective number
  • shape_label_mappings: each possible detected shape YOLO label (from the shape-trained YOLO model), mapped to the respective number
  • color_label_mappings: used by the color segmentation nodes, mapping each color to a number (not consistent with the YOLO model)
  • color_buoy_label_mappings: used by bbox_project_pcloud in simulation to be consistent with the color segmentation nodes, same as color_label_mappings without orange and white but with a "misc" label
  • inc_color_buoy_label_mappings: used by bbox_project_pcloud in real life, mapping each possible buoy label to a single color, for color segmentation inside the bounding boxes

obstacle_detector

Clusters the input point cloud into obstacles using the Euclidean clustering algorithm and publishes the computed clusters.

Subscribed topics:

  • point_cloud (remapped to the actual point cloud topic): an input sensor_msgs/PointCloud2 message to cluster. Typically published by the pointcloud_filter node, but the raw LiDAR cloud can also be used directly.

Published topics:

  • obstacle_map/raw: the output all_seaing_interfaces/ObstacleMap message of the untracked obstacles. In other words, these are the raw clusters produced by Euclidean clustering, before any tracking.

Parameters:

  • global_frame_id: the id of the global frame used, defaults to map
  • base_link_frame: the id of the robot frame used, defaults to base_link
  • obstacle_size_min: an integer parameter, defaults to 20. The minimum number of points required for a cluster.
  • obstacle_size_max: an integer parameter, defaults to 100000. The maximum number of points in a cluster.
  • clustering_distance: a double parameter, defaults to 0.75. Points closer to each other than this value (in meters) are considered part of the same obstacle.
  • obstacle_filter_pts_max: similar to obstacle_size_max, but instead of being applied during clustering, the resulting clusters with more points than this threshold are ignored.
  • obstacle_filter_area_max: clusters with a footprint area larger than this value are ignored.
  • obstacle_filter_length_max: clusters whose minimum oriented bounding box (computed via PCA, taking the min/max of the points projected onto the resulting axes) has a side longer than this are ignored.
  • range_max: obstacles whose centroid is further than this distance from the robot are ignored.
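
The clustering itself is plain single-linkage ("Euclidean") clustering with the size thresholds above; a minimal Python sketch of the idea, reusing the node's parameter names (this is not the node's actual implementation):

```python
# Sketch of single-linkage ("Euclidean") clustering with the same thresholds
# the node exposes; parameter names mirror the node's, but this is not its
# actual implementation.
import numpy as np
from scipy.spatial import cKDTree

def euclidean_cluster(points, clustering_distance=0.75,
                      obstacle_size_min=20, obstacle_size_max=100000):
    tree = cKDTree(points)
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        frontier, cluster = [seed], [seed]
        while frontier:
            idx = frontier.pop()
            # Grow the cluster with every unvisited point within clustering_distance.
            for n in tree.query_ball_point(points[idx], clustering_distance):
                if n in unvisited:
                    unvisited.remove(n)
                    frontier.append(n)
                    cluster.append(n)
        if obstacle_size_min <= len(cluster) <= obstacle_size_max:
            clusters.append(points[cluster])
    return clusters

print(len(euclidean_cluster(np.random.rand(500, 3) * 5.0)), "clusters")
```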

pointcloud_filter

Filters and downsamples a point cloud based on range and intensity. A filtered and/or downsampled point cloud is usually preferable for clustering.

Subscribed topics:

  • point_cloud (remapped to the actual point cloud topic): an input sensor_msgs/PointCloud2 message to filter and/or downsample.

Published topics:

Parameters:

  • global_frame_id: the id of the global frame used, defaults to map
  • range_radius: an array of doubles, defaults to [0.0, 100000.0]. All points with range outside these values are filtered out.
  • range_intensity: an array of doubles, defaults to [0.0, 100000.0]. All points with intensity outside these values are filtered out.
  • range_x: the acceptable x range of points (filter outside that range)
  • range_y: the acceptable y range of points (filter outside that range)
  • range_z: the acceptable z range of points (filter outside that range)
  • leaf_size: a double parameter, defaults to 0.0. The size (in meters) of the 3D voxel grids when downsampling. If set to 0 (default), this node doesn't downsample.
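
A rough numpy sketch of the kind of range/intensity filtering and voxel-grid downsampling these parameters control; the (N, 4) array layout and the keep-one-point-per-voxel strategy are simplifying assumptions, not the node's actual implementation:

```python
# Numpy sketch of the filtering/downsampling these parameters control, on an
# (N, 4) array of x, y, z, intensity. Keeping one point per voxel is a
# simplification; this is not the node's actual implementation.
import numpy as np

def filter_cloud(cloud, range_radius=(0.0, 1e5), range_intensity=(0.0, 1e5),
                 leaf_size=0.0):
    dist = np.linalg.norm(cloud[:, :3], axis=1)
    keep = ((dist >= range_radius[0]) & (dist <= range_radius[1])
            & (cloud[:, 3] >= range_intensity[0])
            & (cloud[:, 3] <= range_intensity[1]))
    cloud = cloud[keep]
    if leaf_size <= 0.0:  # leaf_size == 0 disables downsampling
        return cloud
    # Voxel-grid downsampling: keep one point per occupied leaf_size^3 cell.
    voxels = np.floor(cloud[:, :3] / leaf_size).astype(np.int64)
    _, first = np.unique(voxels, axis=0, return_index=True)
    return cloud[first]

cloud = np.random.rand(1000, 4) * 10.0
print(filter_cloud(cloud, range_radius=(1.0, 8.0), leaf_size=0.5).shape)
```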

bbox_project_pcloud

Projects the detected bounding boxes onto the point cloud (more precisely, it does the opposite: it projects the point cloud points onto the image and checks whether they fall inside a bounding box) and selects the points inside each object's bounding box, creating a point cloud for each object detected by the camera. It then clusters the points in each object's point cloud (based on color difference and Euclidean distance) and performs color segmentation on the image using the color detected by YOLOv8 / the color segmentation algorithm, producing multiple candidate clusters and contours. Finally, it picks the best cluster-contour pair according to an optimality metric that accounts for the similarity to the color detected by YOLOv8, the distance between the cluster points (projected onto the image) and the color segment points, and the sizes of the cluster and the color segments, thus selecting the cluster and color segment that best correspond to the detected object.
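
The projection step is a standard pinhole-camera projection followed by a bounding box membership check with the bbox_object_margin; a minimal sketch, where the intrinsics, points, and box are placeholder values (the node gets them from the camera info, the point cloud transformed into the camera frame, and the detections):

```python
# Minimal sketch of projecting 3D points into the image and checking bounding
# box membership with a margin. The intrinsics, points, and box below are
# placeholders; the node gets them from the camera info, point cloud, and
# detections, after transforming the points into the camera frame.
import numpy as np

fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0              # placeholder intrinsics
points = np.array([[0.5, -0.2, 4.0], [2.0, 1.0, 6.0]])   # x right, y down, z forward
box = (250, 180, 400, 320)                                # x_min, y_min, x_max, y_max
bbox_object_margin = 10.0

in_front = points[:, 2] > 0.0
u = fx * points[:, 0] / points[:, 2] + cx
v = fy * points[:, 1] / points[:, 2] + cy
inside = (in_front
          & (u >= box[0] - bbox_object_margin) & (u <= box[2] + bbox_object_margin)
          & (v >= box[1] - bbox_object_margin) & (v <= box[3] + bbox_object_margin))
print(points[inside])  # the points assigned to this detection's point cloud
```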

Subscribed topics:

Published topics:

Parameters:

  • camera_name: "front", "back_left", or "back_right" (or left empty), depending on the camera (image, bounding box, and camera info topics) the node subscribes to, defaults to ""
  • bbox_object_margin: the margin to add to the detected object's bounding box when checking if a projected 3D point is in it, defaults to 0.0
  • color_label_mappings_file: the path to the YAML file containing the mappings from colors to labels, defaults to all_seaing_bringup/config/perception/color_label_mappings.yaml
  • matching_weights_file: the path to the YAML file containing the weights of aspects of the optimality metric used to select the best cluster-image segment pair, defaults to all_seaing_bringup/config/perception/matching_weights.yaml
  • contour_matching_color_ranges_file: the path to the YAML file containing the mappings from colors to HSV value ranges for color segmentation and cluster-contour optimization/matching, defaults to all_seaing_bringup/config/perception/contour_matching_color_ranges_file.yaml
  • obstacle_size_min: the minimum number of points a detected cluster should have, defaults to 20
  • obstacle_size_max: the maximum number of points a detected cluster should have, defaults to 100000
  • clustering_distance: the maximum distance a point may be from at least one point already in a cluster for it to be added to that cluster, defaults to 0.75
  • is_sim: whether the node is running in simulation, defaults to False
  • label_list: whether the YAML file contains a list of labels for each color name or a one-to-one mapping, defaults to True
  • only_project: whether to skip the color segmentation & contour matching and just publish the point cloud points that fall inside the respective detection bounding boxes; suitable for detecting the points of dock banners or other objects without a fixed color & shape, defaults to False
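
Conceptually, the cluster-contour matching described above reduces to picking the candidate pair with the best weighted score; the sketch below is only a schematic of that idea, and the weight names and individual terms are assumptions rather than the node's actual metric or the contents of matching_weights.yaml:

```python
# Schematic of a weighted matching score between a candidate cluster and a
# candidate color-segment contour. The weight names and the individual terms
# are illustrative assumptions, not the node's actual metric or the contents
# of matching_weights.yaml.
def matching_score(color_dist, projected_px_dist, size_ratio,
                   w_color=1.0, w_dist=1.0, w_size=1.0):
    # Lower is better: penalize color mismatch, pixel distance between the
    # projected cluster and the contour, and mismatched sizes.
    return (w_color * color_dist
            + w_dist * projected_px_dist
            + w_size * abs(1.0 - size_ratio))

candidates = [
    {"color_dist": 0.2, "projected_px_dist": 12.0, "size_ratio": 0.9},
    {"color_dist": 0.6, "projected_px_dist": 4.0, "size_ratio": 1.3},
]
print(min(candidates, key=lambda c: matching_score(**c)))
```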

multicam_detection_merge.py

Merges the detections from the three cameras (already transformed to the base_link frame) into a single local map.

Subscribed topics:

  • detections/front, detections/back_left, detections/back_right: all_seaing_interfaces/ObstacleMap messages with the detections of the three cameras (from the individual bbox_project_pcloud instances)

Published topics:

Parameters:

  • enable_front: whether to subscribe to the detections of the front camera, defaults to True
  • enable_back_left: whether to subscribe to the detections of the back left camera, defaults to True
  • enable_back_right: whether to subscribe to the detections of the back right camera, defaults to True
  • individual: if False, synchronize the 3 topics and merge all the detections into a single ObstacleMap message; if True, republish the detections from each camera individually into a single topic as they arrive (much faster rate, but each message only contains one camera's detections). Defaults to False.
  • approximate: whether to use an approximate time synchronizer, which matches messages whose timestamps differ by at most the configured delay, instead of a synchronizer that only matches messages with the same timestamp, defaults to False
  • delay: the maximum timestamp difference accepted when using the approximate time synchronizer
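
For reference, the exact vs. approximate synchronization these parameters toggle maps onto the standard ROS 2 message_filters API; a minimal sketch, with arbitrary queue sizes and the merging callback left out (the node's real setup may differ):

```python
# Sketch of the exact vs. approximate synchronization these parameters toggle,
# using the standard ROS 2 message_filters API. Queue sizes are arbitrary and
# the merging callback is left out; the node's real setup may differ.
from rclpy.node import Node
import message_filters
from all_seaing_interfaces.msg import ObstacleMap

class MergeSketch(Node):
    def __init__(self, approximate=False, delay=0.1):
        super().__init__("multicam_detection_merge_sketch")
        subs = [message_filters.Subscriber(self, ObstacleMap, topic)
                for topic in ("detections/front",
                              "detections/back_left",
                              "detections/back_right")]
        if approximate:
            # Matches messages whose timestamps differ by at most `delay` seconds.
            sync = message_filters.ApproximateTimeSynchronizer(subs, 10, delay)
        else:
            # Only matches messages with identical timestamps.
            sync = message_filters.TimeSynchronizer(subs, 10)
        sync.registerCallback(self.merge)

    def merge(self, front, back_left, back_right):
        # Concatenate the obstacles from the three maps and publish the merged map.
        ...
```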

object_tracking_map

Implements Extended Kalman Filter-based Simultaneous Localization And Mapping (EKF SLAM): it updates a global map of the buoys and the robot's predicted pose based on the detections from bbox_project_pcloud and on GPS and IMU (including compass) measurements. It also publishes the map->base_link transformation to obtain a more accurate global map (or map->odom, depending on the desired TF tree structure; map->base_link still ends up being the robot's estimated pose). It can also be toggled to only predict the buoys' locations in a global map given the robot's pose, if we assume that pose is accurate enough, using an individual EKF for each buoy and without subscribing to or publishing the respective odometry messages and TF frames.

Subscribed topics:

Required transformations:

  • odom->base_link: provides the GPS (or fused) position measurement and the compass orientation, and can also be used as the child of slam_map in the TF tree

Published topics:

  • obstacle_map/global, obstacle_map/refined_tracked: all_seaing_interfaces/ObstacleMap messages with the tracked (as described above) and labeled obstacles, derived from the detections of the bbox_project_pcloud node
  • obstacle_map/map_cov_viz: a visualization_msgs/MarkerArray message containing the visualization of the object & robot pose means and covariances as computed by SLAM, as well as the predicted trajectory of the robot.
  • odometry/tracked: a nav_msgs/Odometry message containing the robot's predicted position with respect to the slam_map frame

SLAM Parameters

The ones that are most likely to be changed are:

  • track_robot: whether to use SLAM to keep track of the robot's pose as well as the covariances between the buoy pose predictions, defaults to True. Setting it to False uses an individual EKF for each obstacle and just computes a map, without estimating the robot's pose or publishing a TF.
  • global_frame_id: the id of the global frame being used when SLAM is not running (providing the GPS/fused position measurement and used as the child of the SLAM global frame)
  • slam_frame_id: the id of the global frame published by the nodes based on the SLAM predictions of the robot pose (the one referred to here as slam_map)
  • gps_update: whether to use pose and orientation given from the TF (usually GPS pose + compass orientation) to update those predictions, along with the sensor data, defaults to True
  • include_odom_theta: whether to include the orientation given from the TF as a sensor measurement, because it is computed using the Pixhawk compass and is generally accurate, defaults to True
  • include_only_odom_theta: whether to only include the compass orientation as a measurement and not the GPS/fused position, when the GPS is really inaccurate or non-existent, defaults to False
  • include_unlabeled: whether to include unlabeled obstacle detections from the LiDAR clusters (obstacle_map/unlabeled) to update the positions of the tracked obstacles (which were first detected as labeled obstacles; unlabeled detections don't create new obstacles) and the robot's estimated position.
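
To make the predict/update terminology above concrete, here is a generic EKF step in textbook form; it is only a sketch of the general technique, with F and H standing for the Jacobians of the motion and measurement models, and it does not reflect the node's actual state layout, models, or noise values:

```python
# Generic EKF predict/update step in textbook form, shown only to make the
# terminology concrete. F and H are the Jacobians of the motion model f and
# measurement model h at the current estimate; the node's actual state layout,
# models, and noise values are not reproduced here.
import numpy as np

def ekf_step(x, P, u, z, f, F, h, H, Q, R):
    # Predict: propagate the state mean and covariance through the motion model.
    x_pred = f(x, u)
    P_pred = F @ P @ F.T + Q
    # Update: correct the prediction with a measurement (e.g. GPS/compass or a
    # labeled buoy detection), weighted by the Kalman gain.
    y = z - h(x_pred)                      # innovation
    S = H @ P_pred @ H.T + R               # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```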

For more detailed information and configuration parameters (e.g. uncertainties & thresholds), check out Simultaneous Localization And Mapping (SLAM).