6. Vision Processing: NVIDIA Jetson TX1

About the Processor and Our Plan

The Jetson TX1 is a development board created by NVIDIA for computationally intensive processing. We will use it as a secondary processor that communicates with the roboRIO, running OpenCV on our camera feed to detect vision targets. That information will then be passed to the Rio through a NetworkTable and combined with the NavX gyroscope and ultrasonic sensor data to home in on our cargo target.

Using a Coprocessor for Vision

This information is from WPILib's page on vision processing:

Strategy

"Generally the idea is to set up the coprocessor with the required software that generally includes:

OpenCV - the open source computer vision library Network tables - to commute the results of the image processing to the roboRIO program Camera server library - to handle the camera connections and publish streams that can be viewed on a dashboard The language library for whatever computer language is used for the vision program The actual vision program that does the object detection The coprocessor is connected to the roboRIO network by plugging it into the extra ethernet port on the network router or, for more connections, adding a small network switch to the robot. The cameras are plugged into the coprocessor, it acquires the images, processes them, and publishes the results, usually target location information, to network tables so it is can be consumed by the robot program for steering and aiming."

We will connect the NVIDIA Jetson to the radio using Ethernet.

Python

The current plan is to use Python on the Jetson to recognize the tape and then publish that information to a NetworkTable (used to send information to the SmartDashboard). There are Python libraries for NetworkTables (pynetworktables) that we will use to push our target information and video feed out.
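As a concrete illustration, below is a minimal sketch of what the Jetson-side script could look like: it grabs frames with OpenCV, finds the largest bright contour, and publishes the target's x position to a "vision" NetworkTable with pynetworktables. The table and key names, the thresholding approach, and the roboRIO address (10.58.66.2 for team 5866) are assumptions for illustration, not our final pipeline.

```python
import cv2
from networktables import NetworkTables

NetworkTables.initialize(server='10.58.66.2')  # roboRIO address on the robot network (assumed)
table = NetworkTables.getTable('vision')       # table name is an assumption

cap = cv2.VideoCapture(0)                      # USB camera plugged into the Jetson

while True:
    ok, frame = cap.read()
    if not ok:
        continue

    # Very rough tape isolation: threshold the green channel.
    # A real pipeline would convert to HSV and tune the range for the retroreflective tape.
    _, mask = cv2.threshold(frame[:, :, 1], 200, 255, cv2.THRESH_BINARY)

    # [-2] grabs the contour list under both the OpenCV 3 and OpenCV 4 return signatures.
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]

    found = False
    if contours:
        largest = max(contours, key=cv2.contourArea)
        m = cv2.moments(largest)
        if m['m00'] > 0:
            cx = m['m10'] / m['m00']           # x centroid of the target in pixels
            table.putNumber('targetX', cx)
            found = True
    table.putBoolean('targetFound', found)
```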


Reading NetworkTables information on the Rio

Here's a link that shows how to read the vision information we will be publishing to the NetworkTable.
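For reference, here is a minimal sketch of what reading those values on the Rio could look like in Java with the WPILib NetworkTables API; the table name "vision" and the keys "targetX" and "targetFound" are assumptions matching the Jetson sketch above, not names from our repository.

```java
import edu.wpi.first.networktables.NetworkTable;
import edu.wpi.first.networktables.NetworkTableInstance;

public class VisionReader {
    private final NetworkTable visionTable =
            NetworkTableInstance.getDefault().getTable("vision");

    /** Returns the target's x position in pixels, or -1.0 if nothing has been published. */
    public double getTargetX() {
        return visionTable.getEntry("targetX").getDouble(-1.0);
    }

    /** Returns true only if the Jetson reports that it currently sees a target. */
    public boolean targetFound() {
        return visionTable.getEntry("targetFound").getBoolean(false);
    }
}
```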

Converting NetworkTable Positions into a Heading

The math to convert the position of our target onscreen into a heading goes as follows:

  1. Find both the width resolution of the camera (we use 640 pixels) and look up the viewing angle of the camera (for the Microsoft Lifecam it is a horizontal viewing angle of 34.3 degrees)
  2. Divide the resolution by the viewing angle (640 / 34.3 = 18.66 pixels/degree)
  3. Divide the target's onscreen x position by this ratio to get its angle from the left edge of the frame, then subtract half of the viewing angle (34.3 / 2 = 17.15 degrees) so the result is a positive or negative angle by which to turn the robot.
  4. So if the vision target was at x = 50, then 50 / 18.66 = 2.68 degrees, and 2.68 - 17.15 = -14.47 degrees, so the robot needs to turn roughly 14.47 degrees (to the left, since the result is negative) to be aligned with the target. A short code sketch of this calculation follows the VisionManager fields below.

This process is what the VisionManager class in our code performs. The class's variables are all static right now, so only one camera can be processed at a time (we cannot have more than one instance of any of the variables), but by changing the variables to match the specs of whichever camera is being used, the rest of the methods still work.

```java
private static double camPixWidth;   //default [640]
private static double camPixHeight;  //default [480]
private static double viewingAngleH; //default [34.3] degrees
private static double viewingAngleV; //default [60] degrees
public static double headConRatio;   //heading conversion ratio from the position (in pixels) of a point to a heading angle
```
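To make the conversion concrete, here is a small, self-contained sketch of the math from the steps above using these fields' default values. The class name HeadingMath and the method pixelToHeading are illustrative only, not our actual VisionManager code.

```java
public class HeadingMath {
    private static double camPixWidth = 640;     // horizontal resolution in pixels
    private static double viewingAngleH = 34.3;  // horizontal viewing angle in degrees
    // heading conversion ratio: 640 / 34.3 ≈ 18.66 pixels per degree
    public static double headConRatio = camPixWidth / viewingAngleH;

    /** Converts an onscreen x position (pixels) into a signed heading in degrees. */
    public static double pixelToHeading(double targetX) {
        double angleFromLeftEdge = targetX / headConRatio;  // e.g. 50 / 18.66 ≈ 2.68 degrees
        return angleFromLeftEdge - (viewingAngleH / 2.0);   // 2.68 - 17.15 ≈ -14.47 degrees
    }

    public static void main(String[] args) {
        System.out.println(pixelToHeading(50)); // prints roughly -14.47
    }
}
```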



Reference Information and Documentation

Here are the websites used to gather research and learn how to accomplish vision processing using the Jetson:

  1. https://developer.nvidia.com/FIRST
    1. This website contains general documentation and tutorials for the Jetson
  2. https://robotpy.readthedocs.io/projects/pynetworktables/en/stable/
    1. The documentation for pynetworktables
  3. https://docs.opencv.org/3.4/dd/d49/tutorial_py_contour_features.html
    1. Information about OpenCV contours