# Simultaneous Localization And Mapping (SLAM)
## Overview
While the foundation of our localization system is the GPS and the Pixhawk, there are various cases where they are not enough to achieve robustness and consistent performance. That may be due to environmental factors causing GPS drift, or to inherent limitations of the sensors themselves: the GPS updates at no more than 5 Hz, and the velocities computed from the fused GPS and Pixhawk data are not very accurate due to IMU drift, which is amplified even by small vibrations of the boat or abrupt acceleration/deceleration.
These factors make it really difficult to create a consistent, clean map of the course and to localize accurately within it. We solve both problems with a technique called Simultaneous Localization And Mapping (SLAM), which, as the name suggests, combines localization and mapping into a single estimation problem. It uses the data from sensors like the LiDAR and the cameras, together with the odometry provided as GPS positions and fused IMU velocities, to simultaneously build a map of the environment and localize the robot within that map. That way, it accounts for the uncertainty of both the odometry data and the object detections, filling the gaps of each individual process. It can also be used in GPS-denied environments, and thus facilitates indoor testing, given a reasonable odometry source (usually LiDAR-Inertial Odometry, which you can learn more about in the Localization wiki page).
The core probabilistic tool behind SLAM is the Extended Kalman Filter (EKF), which can fuse measurements of different types into an estimate of a dynamic state, provided it is given the correct probabilistic models for the sensor measurements given the state (e.g. object range and bearing given a map and the robot's location) and for the transition between states given a control input (in our case the odometry, which helps estimate the robot's new pose given the previous one). You can read more about SLAM in the book "Probabilistic Robotics" by Sebastian Thrun, Wolfram Burgard, and Dieter Fox.
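For reference, these are the standard EKF prediction and update equations, in the notation of Probabilistic Robotics (not the variable names used in our code): g is the motion model driven by the control u_t (here the odometry), h is the measurement model (e.g. expected range and bearing to a landmark), G_t and H_t are their Jacobians, and R_t, Q_t are the motion and measurement noise covariances:

```math
\begin{aligned}
\text{Predict:}\quad & \bar{\mu}_t = g(u_t, \mu_{t-1}), \qquad \bar{\Sigma}_t = G_t\,\Sigma_{t-1}\,G_t^\top + R_t \\
\text{Update:}\quad & K_t = \bar{\Sigma}_t H_t^\top \bigl(H_t \bar{\Sigma}_t H_t^\top + Q_t\bigr)^{-1}, \\
& \mu_t = \bar{\mu}_t + K_t\bigl(z_t - h(\bar{\mu}_t)\bigr), \qquad \Sigma_t = (I - K_t H_t)\,\bar{\Sigma}_t
\end{aligned}
```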
## State
We are estimating a multivariate probability distribution that consists of the robot's pose in the 2D plane (x, y position, and orientation, the angle theta clockwise from the x axis), as well as the x, y coordinates of each observed landmark. The set of landmarks represents the computed map of the environment, which also helps the robot localize itself.
We assume that the estimated distribution is Gaussian, and we preserve this assumption by applying only linear (or linearized) prediction and update transitions, using the odometry and the sensor measurements, respectively. The state is represented by an N-dimensional vector, the mean of the distribution, together with an N×N covariance matrix that captures the correlations between the variables and their respective uncertainties (so that measuring a piece of information that concerns only a subset of the estimated variables also provides information about the rest of them and decreases their uncertainty).
The above distribution will be updated each time new data arrives, as we are computing a dynamic state.
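As a rough illustration of this layout (a hypothetical sketch, not the node's actual internal representation), the state for a robot pose plus K landmarks might look like this:

```python
import numpy as np

# Hypothetical state layout: robot pose (x, y, theta) followed by the
# (x, y) coordinates of K landmarks, so N = 3 + 2K.
K = 3                        # number of tracked landmarks (example value)
n = 3 + 2 * K                # state dimension N

mu = np.zeros(n)             # mean: [x, y, theta, m1x, m1y, ..., mKx, mKy]
sigma = np.eye(n) * 1e6      # N x N covariance; large values = high uncertainty

# The robot's initial pose is usually well known, so its block gets small
# variances; landmark blocks stay large until they are actually measured.
sigma[0:3, 0:3] = np.diag([0.01, 0.01, 0.005])
```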
## Sensor Measurements
In our case, the sensor measurements fed to the Kalman Filter are the object detections, expressed as range (distance from the robot) and bearing (angle with respect to the front of the robot). These are provided by merging the detections of each `bbox_project_pcloud` node instance (one per camera), which combines the YOLOv8 bounding boxes and the respective camera images with the LiDAR point cloud to estimate the detected obstacles' positions in the `base_link` frame (the robot's local reference frame). The x, y coordinates in the local frame are easily converted to range and bearing (a Cartesian -> polar coordinate transformation).
When applying these measurements, we assume uncorrelated Gaussian noise in both the range and the bearing components, with the mean of the measurement being the range and bearing computed from the map and the robot's position.
We also use another type of sensor measurement: the GPS position combined with the compass orientation. We again assume uncorrelated Gaussian noise, with the mean being the predicted robot position and orientation (from the computed state).
In terms of odometry, we get a measurement of the robot's linear velocity along the x and y axes of the robot's frame, plus an angular velocity measurement. These are provided by the Pixhawk's internal EKF, which fuses GPS data with IMU measurements, and are used as control inputs, which increase the uncertainty of the computed state (contrary to the sensor measurements and the GPS position + compass orientation measurement described above, which decrease it).
We assume that the robot moves along a trajectory with constant linear and angular velocity, as dictated by the x and y velocity components, for the time during which those measurements are applied. We also assume Gaussian, uncorrelated noise in the robot's updated x, y coordinates and orientation.
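A minimal sketch of this prediction step, assuming a counterclockwise angle convention and a simple Euler integration step (flip the rotation signs for a clockwise convention; this is illustrative, not the node's code):

```python
import math

def predict_pose(x, y, theta, vx, vy, omega, dt):
    """One Euler step of the constant-velocity motion model.

    (vx, vy) are the fused linear velocities in the robot's frame and omega
    is the angular velocity, as reported by the Pixhawk's EKF. For small dt,
    the constant-velocity arc is well approximated by this straight step.
    """
    # Rotate the body-frame velocity into the global frame, then integrate.
    x_new = x + (vx * math.cos(theta) - vy * math.sin(theta)) * dt
    y_new = y + (vx * math.sin(theta) + vy * math.cos(theta)) * dt
    theta_new = theta + omega * dt
    return x_new, y_new, theta_new
```

Since this is a control input rather than a measurement, applying it also inflates the state covariance by the motion noise (the `motion_imu_*_noise` parameters below).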
## Node, topics, and parameters
The node that subscribes to the required sensor data & odometry and performs the SLAM computations is `object_tracking_map`. The subscribed and published topics are detailed in the all_seaing_perception wiki page.
The full list of parameters for SLAM is given below:
- `track_robot`: whether to use SLAM to keep track of the robot's position as well as the covariances between the buoy pose predictions; defaults to True. Setting it to False will use an individual EKF for each obstacle and just compute a map, without estimating the robot's pose or publishing a TF.
- `global_frame_id`: the id of the global frame when not running SLAM (providing the GPS/fused position measurement and used as the child of the SLAM global frame)
- `slam_frame_id`: the id of the global frame published by the node based on the SLAM predictions of the robot pose
- `duplicate_thresh`: the distance below which two tracked obstacles are considered duplicates (if so, the one that was last seen the earliest is deleted, for more robust data association)
- `obstacle_drop_thresh`: the time threshold (in seconds) after which a tracked obstacle is removed if it has not been detected for that long
- `range_drop_thresh`: the distance from the robot within which we delete old obstacles (so that we don't clean up far parts of the map where we don't currently detect obstacles)
- `normalize_drop_thresh`: whether the obstacle drop time threshold should be proportional to the obstacle's distance; defaults to False
- `normalize_drop_dist`: (if normalizing) the distance at which the obstacle drop time threshold equals the one set; defaults to 1.0
- `odom_refresh_rate`: the rate (Hz) at which to look up the `map` -> `base_link` transformation giving the GPS (and possibly fused IMU) predicted robot location, used for the SLAM prediction if accurate and non-laggy enough
- `range_uncertainty`: the uncertainty in the detected distances to the buoys
- `bearing_uncertainty`: the uncertainty in the detected angles of the buoys relative to the camera
- `new_object_slam_threshold`: the maximum value of a detection's squared Mahalanobis distance to a tracked obstacle (approximately the square of the distance to its predicted pose divided by the uncertainty of that prediction) for the detection to be assigned to that obstacle (a simplified gating sketch is given after this list)
- `init_new_cov`: the uncertainty that a new obstacle prediction should have when initialized, before updating it with the measurement itself (usually quite big; it will get overwritten by the measurement uncertainties)
- `include_odom_theta`: whether to include the odometry message orientation in the prediction, since it is computed using the Pixhawk compass and is generally accurate; defaults to True
- `include_only_odom_theta`: whether to include only the compass orientation as a measurement and not the GPS/fused position, for when the GPS is inaccurate or non-existent; defaults to False
- `rotate_odom`: for when the Pixhawk is rotated with respect to `base_link`, whether to rotate the odom message velocities accordingly; defaults to False -> deprecated, use `odometry_publisher` and an appropriate configuration instead
- `init_xy_noise`: the noise of the initial robot x, y coordinates
- `init_theta_noise`: the noise of the initial robot orientation
- `init_new_cov`: the initial x, y variance of a newly detected buoy's predicted position
- `motion_imu_xy_noise`: the noise of the robot's predicted pose using the fused velocities from the odometry message (prediction step)
- `motion_imu_theta_noise`: the noise of the robot's predicted orientation using the given angular velocity from the odometry message (prediction step)
- `data_association`: the data association algorithm being used; one of: greedy_exclusive (default), greedy_exclusive_indiv_var, greedy_exclusive_measurement_var, greedy_individual, linear_sum_assignment (suffix `_sqrt` for optimizing the sum of Mahalanobis distances rather than their squares) -> check out `object_tracking_shared.cpp` for the algorithms
- `trace_time`: (for visualization purposes) the time window within which to record the robot's pose and display the trace in RViz
- `include_unlabeled`: whether to include unlabeled obstacle detections from the LiDAR clusters (`obstacle_map/unlabeled`) to update the tracked obstacles' positions and the robot's estimated position (only labeled detections create new obstacles; unlabeled detections only update existing ones)
- `update_gps_xy_uncertainty`: the uncertainty used when updating the position prediction based on the GPS measurements (update step)
- `update_odom_theta_uncertainty`: the uncertainty used when updating the orientation prediction based on the IMU compass measurements (update step)
- `direct_tf`: whether to publish the `slam_map` -> `base_link` transform directly, making it independent of whether (and when) the `map` -> `base_link` transform is being published (thus making it faster regardless of the GPS speed), or to instead publish the correction `slam_map` -> `map`; defaults to True
- `imu_predict`: whether to use the IMU data to predict the robot's position over time (if False, GPS predictions are used instead); defaults to True -> should almost always be set to True
- `gps_update`: if predicting using IMU, whether to use the GPS measurements to update those predictions, along with the sensor data; defaults to True -> should almost always be set to True (even if provided by an odometry source like LIO; in that case set `include_only_odom_theta` to True)
- `motion_gps_xy_noise`: the noise of the GPS location (from the TF) when used in the prediction step instead of the IMU -> mostly useless, since using GPS as a control input (instead of as a sensor measurement) doesn't really make sense
- `motion_gps_theta_noise`: the noise of the given orientation (from the TF, most probably the compass heading) when used in the prediction step -> mostly useless, for the same reason as the above parameter
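To make the gating behind `new_object_slam_threshold` concrete, here is a simplified sketch of squared-Mahalanobis-distance gating (the actual association algorithms live in `object_tracking_shared.cpp`; this is only illustrative):

```python
import numpy as np

def mahalanobis_gate(z, z_pred, S, gate):
    """Decide whether a detection may be associated with a tracked obstacle.

    z      : measured (range, bearing) of the detection, as a 1-D array
    z_pred : predicted (range, bearing) of the tracked obstacle
    S      : 2x2 innovation covariance (prediction + measurement uncertainty)
    gate   : the new_object_slam_threshold parameter

    Returns (d2, accept): the squared Mahalanobis distance and whether the
    detection passes the gate; detections that pass no gate seed new obstacles.
    """
    innovation = np.asarray(z) - np.asarray(z_pred)
    d2 = float(innovation @ np.linalg.solve(S, innovation))
    return d2, d2 < gate
```

A greedy variant such as greedy_exclusive can then repeatedly pick the smallest gated distance and remove the matched detection and obstacle from consideration, while linear_sum_assignment solves the assignment globally.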