Segnet ROS
In this tutorial, we are going to create a ROS package that uses the SegNet deep learning model for semantic segmentation.
- JetBot environment set up on your Jetson Nano; if not, check out Jetson Nano and Linux Setup
- ROS environment set up on your Jetson; if not, check out HowTo ROS
- Basic knowledge of ROS concepts, such as nodes, topics, and messages; if not, check out ROS 101
- Basic knowledge of Publisher and Subscriber in Python for ROS; if not, check out ROS PubSubPy
- Basic knowledge of creating new packages in ROS
- JetBot camera publishing package
- Basic knowledge of the Python programming language
SegNet is a deep learning architecture for pixel-wise semantic segmentation. It classifies each pixel in an image into a pre-defined category, making it particularly useful for tasks such as scene understanding, obstacle detection, and robot navigation.
- `pub_frequency`: Desired publishing frequency, default is 30 Hz.
- Model loading: Use `segNet("fcn-resnet18-sun")` to load the appropriate model for segmentation.
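The class IDs used later in the node depend on the model you load. As a minimal sketch for finding the ID of the class you want to track (this assumes the `jetson_inference` Python bindings expose `GetNumClasses()` and `GetClassDesc()`, which recent releases do):

```python
#!/usr/bin/env python
# Sketch: print the class table of the loaded model so you can pick
# the class ID to segment (assumes GetNumClasses()/GetClassDesc()
# are available in your jetson_inference build).
from jetson_inference import segNet

net = segNet("fcn-resnet18-sun")
for class_id in range(net.GetNumClasses()):
    print(class_id, net.GetClassDesc(class_id))
```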
- Subscribed Topics:
  - `/camera/image_raw` (`sensor_msgs/Image`): The raw camera image.
- Published Topics:
  - `/segnet_estimations/overlay` (`sensor_msgs/Image`): The overlayed image.
  - `/segnet_estimations/positions` (`sensor_msgs/PointCloud`): Centroid positions of the segmented areas.
  - `/image_size` (`std_msgs/Int32MultiArray`): The size of the input image.
Once the node receives an image from the `/camera/image_raw` topic, it processes the image using the SegNet model. The resulting segmentation is then published as an overlay on the original image and as centroid positions.
Safety is paramount when working with robots. The SegNet ROS node includes logic to halt the robot if the centroid of the detected area of interest gets too close.
For steering:
- The robot aims to keep the centroid of the detected area at the center of its field of view.
- The node calculates the error between the centroid's position and the image center.
- Based on this error, a steering angle is determined and the robot turns to minimize it (see the sketch after this list).
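The steering logic itself is not part of the node below; a separate controller would subscribe to the node's `/segnet_estimations/positions` and `/image_size` topics. Here is a minimal sketch of the idea; the topic names come from this tutorial, but the gain `KP`, the stop threshold, the cruise speed, and the `/cmd_vel` output are illustrative assumptions:

```python
#!/usr/bin/env python
# Sketch of the steering/safety logic described above: keep the detected
# centroid at the image center with a proportional controller, and halt
# when the centroid is too close (low in the frame). KP, the stop
# threshold, the speed, and /cmd_vel are assumptions, not part of the node.
import rospy
from sensor_msgs.msg import PointCloud
from std_msgs.msg import Int32MultiArray
from geometry_msgs.msg import Twist

KP = 0.005             # proportional gain (assumed; tune on the robot)
STOP_Y_FRACTION = 0.8  # halt if the centroid is in the lowest 20% of the image

image_size = None      # [width, height], filled in by the /image_size callback

def size_callback(msg):
    global image_size
    image_size = msg.data

def positions_callback(msg):
    if image_size is None or not msg.points:
        return
    width, height = image_size
    centroid = msg.points[0]
    twist = Twist()
    if centroid.y > STOP_Y_FRACTION * height:
        # Area of interest is too close: publish a zero Twist to halt.
        pass
    else:
        error = centroid.x - width / 2.0  # pixels off-center
        twist.linear.x = 0.1              # assumed cruise speed
        twist.angular.z = -KP * error     # turn to reduce the error
    cmd_pub.publish(twist)

rospy.init_node('segnet_steering_sketch')
cmd_pub = rospy.Publisher('/cmd_vel', Twist, queue_size=10)
rospy.Subscriber('/image_size', Int32MultiArray, size_callback)
rospy.Subscriber('/segnet_estimations/positions', PointCloud, positions_callback)
rospy.spin()
```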
- Create a new package named `segnet` inside your workspace directory (`~/MorBot/morbot_ros/morbot_ws/src`):

```bash
cd ~/MorBot/morbot_ros/morbot_ws/src
catkin_create_pkg segnet rospy sensor_msgs vision_msgs std_msgs geometry_msgs
```
- Create a new directory named `scripts` inside your package directory (`~/MorBot/morbot_ros/morbot_ws/src/segnet`):

```bash
cd ~/MorBot/morbot_ros/morbot_ws/src/segnet
mkdir scripts
```
- Create a new source file inside the `scripts` directory. For example, create a Python file named `segnet_ros_node.py`:

```bash
cd scripts
touch segnet_ros_node.py
```

Make the script executable:

```bash
cd ~/MorBot/morbot_ros/morbot_ws/src/segnet/scripts
chmod +x segnet_ros_node.py
```
- The `segnet_ros_node.py` file should contain the following code:
```python
#!/usr/bin/env python
import sys
import rospy
import cv2
import numpy as np
import jetson.utils
from jetson_inference import segNet
from cv_bridge import CvBridge, CvBridgeError
from sensor_msgs.msg import Image, PointCloud
from geometry_msgs.msg import Point32
from std_msgs.msg import Int32MultiArray

bridge = CvBridge()

rospy.init_node('segnet_estimation_node', anonymous=True)

# Fetch desired publishing frequency from ROS parameter server (defaulting to 30 Hz if not set)
pub_frequency = rospy.get_param('~pub_frequency', 30)
rate = rospy.Rate(pub_frequency)  # Set the rate for the loop

# Publishers
pub_overlay = rospy.Publisher("/segnet_estimations/overlay", Image, queue_size=10)
pub_positions = rospy.Publisher("/segnet_estimations/positions", PointCloud, queue_size=10)
size_pub = rospy.Publisher('/image_size', Int32MultiArray, queue_size=10)

# Loading the segNet model
net = segNet("fcn-resnet18-sun")


def image_callback(msg):
    global net
    try:
        cv2_img = bridge.imgmsg_to_cv2(msg, "bgr8")
        height, width, channels = cv2_img.shape
    except CvBridgeError as e:
        print(e)
        return

    # Convert BGR (OpenCV's order) to RGB, which the CUDA image expects
    cv2_img = cv2.cvtColor(cv2_img, cv2.COLOR_BGR2RGB)
    cuda_img = jetson.utils.cudaFromNumpy(cv2_img)
    net.Process(cuda_img)

    # Get class mask (this will contain class IDs for each pixel)
    class_mask_cuda = jetson.utils.cudaAllocMapped(width=cuda_img.width, height=cuda_img.height, format='gray8')
    net.Mask(class_mask_cuda, class_mask_cuda.width, class_mask_cuda.height)
    class_mask_np = jetson.utils.cudaToNumpy(class_mask_cuda)

    # Here we filter out everything but the desired class.
    # Let's say the class ID of the desired class is 2 (see below)
    desired_class_id = 2
    class_mask_np[class_mask_np != desired_class_id] = 0
    class_mask_np[class_mask_np == desired_class_id] = 255  # This sets the desired class pixels to white

    # Overlay this mask with the original image. This step might need further adjustments based on your need.
    overlayed_img = cv2.bitwise_and(cv2_img, cv2_img, mask=class_mask_np)
    overlay_img_msg = bridge.cv2_to_imgmsg(overlayed_img, encoding="rgb8")
    pub_overlay.publish(overlay_img_msg)

    # Centroid of the white (desired-class) pixels
    ys, xs = np.where(class_mask_np[:, :, 0] == 255)
    centroid_x = xs.mean() if len(xs) else None
    centroid_y = ys.mean() if len(ys) else None

    # Populate the PointCloud message only if there's a valid centroid
    if centroid_x is not None and centroid_y is not None:
        msg = PointCloud()
        msg.header.stamp = rospy.Time.now()
        msg.header.frame_id = "camera_frame"  # or whatever your frame is
        msg.points = [Point32(centroid_x, centroid_y, 0)]

        # Publish the message
        pub_positions.publish(msg)

    # Publish input img size
    size_msg = Int32MultiArray()
    size_msg.data = [width, height]
    size_pub.publish(size_msg)

    rate.sleep()  # This makes sure the loop runs at the desired frequency


def main(args):
    rospy.Subscriber("/camera/image_raw", Image, image_callback, queue_size=10)
    try:
        rospy.spin()
    except KeyboardInterrupt:
        print("Shutting down ROS segNet estimation module")


if __name__ == '__main__':
    main(sys.argv)
```
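The class ID is hard-coded above. If you prefer, it can be read from the parameter server the same way `pub_frequency` is; `~desired_class_id` below is a hypothetical parameter name, not part of the original node:

```python
# Hypothetical variation: read the class ID from the parameter server
# instead of hard-coding it (defaults to 2, the value used above).
desired_class_id = rospy.get_param('~desired_class_id', 2)
```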
- Create a new directory named `launch` inside your package directory (`~/MorBot/morbot_ros/morbot_ws/src/segnet`):

```bash
cd ~/MorBot/morbot_ros/morbot_ws/src/segnet
mkdir launch
```
- Create a new launch file inside the `launch` directory. For example, create a launch file named `segnet_ros.launch`:

```bash
cd launch
touch segnet_ros.launch
```
- The `segnet_ros.launch` file should contain the following:

```xml
<launch>

    <!-- Start the camera publisher -->
    <include file="$(find camera_publish)/launch/camera_publisher.launch" />

    <!-- Start the segnet node -->
    <node name="segnet_node" pkg="segnet" type="segnet_ros_node.py" output="screen" />

</launch>
```
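If the workspace has not been built since the `segnet` package was added, build and source it first (standard catkin workflow; the path follows this tutorial's workspace layout):

```bash
cd ~/MorBot/morbot_ros/morbot_ws
catkin_make
source devel/setup.bash
```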
- Run the launch file:

```bash
roslaunch segnet segnet_ros.launch
```
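Once the launch file is running, you can confirm that the node is publishing at the configured rate with the standard `rostopic` tool:

```bash
rostopic list | grep segnet
rostopic hz /segnet_estimations/overlay
```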
- Open a web browser and navigate to `http://<jetbot_ip_address>:8080/stream?topic=/segnet_estimations/overlay` to view the `segnet_estimations` overlay stream.
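Viewing the stream in a browser assumes a web streaming node is serving on port 8080. If your camera publisher launch does not already start one, the standard `web_video_server` package can be used (8080 is its default port):

```bash
rosrun web_video_server web_video_server
```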
- Check the published info:

```bash
rostopic echo /segnet_estimations/positions
```
That's it! You have created a ROS package for semantic segmentation using segNet.
The SUN RGB-D dataset provides segmentation ground-truth for many indoor objects and scenes commonly found in office spaces and homes.
The dataset contains 10,335 RGB-D images, where each image is annotated with: