AprilTags and PhotonVision - frc-7603/VespaRobotics2024-25 GitHub Wiki

AprilTags

"AprilTags are a system of visual tags developed by researchers at the University of Michigan to provide low overhead, high accuracy localization for many different applications."

AprilTags are useful for helping the robot know where it is on the field, so it can align itself to the desired position.

"AprilTags are similar to QR Codes, in that they are a type of two-dimensional bar code. However, they are designed to encode far smaller data payloads (between 4 and 12 bits), allowing them to be detected more robustly and from longer ranges. Further, they are designed for high localization accuracy— you can compute the precise 3D position of the AprilTag with respect to the camera."

At Vespa Robotics, we use 3D tracking instead of 2D, per the AprilTags lesson we had. Using 2D tracking on the PhotonVision dashboard will only show the AprilTag's ID number.

AprilTags have been in development since 2011, and have been refined over the years to increase the robustness and speed of detection.

Additional information about the tag system can be found on their website.

A summary of AprilTags and PhotonVision can be found at the bottom of the page.

3D Alignment

"Each image is searched for AprilTags using the algorithm described on this page. Using assumptions about how the camera’s lense distorts the 3d world onto the 2d array of pixels in the camera, an estimate of the camera’s position relative to the tag is calculated. A good camera calibration is required for the assumptions about its lens behavior to be accurate."

The tag’s ID is also decoded from the image. Given each tag’s ID, the position of the tag on the field can be looked up.

Knowing the position of the tag on the field, and the position of the camera relative to the tag, the 3D geometry classes can be used to estimate the position of the camera on the field.

If the camera’s position on the robot is known, the robot’s position on the field can also be estimated.
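As a rough illustration of that chain (field-to-tag, tag-to-camera, camera-to-robot), here is a simplified 2D sketch with hypothetical numbers; real robot code would use WPILib's Pose3d and Transform3d classes instead.

```java
// Simplified 2D sketch of the transform chain above. Real robot code would
// use WPILib's Pose3d/Transform3d; all numbers here are hypothetical.
// A pose/transform is {x meters, y meters, heading radians}.
public class PoseChain {
    /** Applies a transform in the pose's own frame (pose "plus" transform). */
    public static double[] compose(double[] pose, double[] t) {
        double cos = Math.cos(pose[2]), sin = Math.sin(pose[2]);
        return new double[] {
            pose[0] + cos * t[0] - sin * t[1],
            pose[1] + sin * t[0] + cos * t[1],
            pose[2] + t[2]
        };
    }

    /** Inverts a transform, e.g. turns camera->tag into tag->camera. */
    public static double[] invert(double[] t) {
        double cos = Math.cos(t[2]), sin = Math.sin(t[2]);
        return new double[] {
            -(cos * t[0] + sin * t[1]),
            -(-sin * t[0] + cos * t[1]),
            -t[2]
        };
    }

    public static void main(String[] args) {
        double[] tagOnField = {5.0, 3.0, Math.PI};   // looked up from the tag's ID
        double[] cameraToTag = {2.0, 0.0, Math.PI};  // estimated by the vision solver
        double[] robotToCamera = {0.3, 0.0, 0.0};    // measured camera mount position
        // field->camera = (field->tag) composed with (tag->camera)
        double[] cameraOnField = compose(tagOnField, invert(cameraToTag));
        // field->robot = (field->camera) composed with (camera->robot)
        double[] robotOnField = compose(cameraOnField, invert(robotToCamera));
        System.out.printf("robot at (%.2f, %.2f)%n", robotOnField[0], robotOnField[1]);
    }
}
```

With these numbers the camera works out to be 2 m in front of the tag, and the robot 0.3 m behind the camera.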

These estimates can be incorporated into the WPILib pose estimation classes. AprilTags enable 3D pose estimation, which includes both the position and orientation of the tag in 3D space.

2D to 3D Ambiguity

"The process of translating the four known corners of the target in the image (two-dimensional) into a real-world position relative to the camera (three-dimensional) is inherently ambiguous. That is to say, there are multiple real-world positions that result in the target corners ending up in the same spot in the camera image."

Humans often use lighting or background objects to determine an object's location and orientation. Computers, however, can be tricked because different orientations can look identical: for example, a flat tag tilted 20 degrees up looks the same as one tilted 20 degrees down.

We can determine the correct position in several ways:

  • Use the odometry history to pick the position closest to the last known position.
  • "Reject poses which are very unlikely (ex: outside the field perimeter, or up in the air)".
  • "Ignore pose estimates which are very close together (and hard to differentiate)".
  • Use multiple cameras to estimate position.
  • Look at multiple targets to estimate location.
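A minimal sketch of the first three strategies above, assuming the solver hands back two candidate 2D positions (field dimensions and the "too close" threshold are hypothetical):

```java
// Sketch of the disambiguation strategies above: reject impossible poses,
// ignore candidates that are too close together, otherwise prefer the one
// nearest the last odometry pose. All constants are hypothetical.
public class PoseDisambiguator {
    static final double FIELD_LENGTH_M = 16.5, FIELD_WIDTH_M = 8.2;
    static final double MIN_SEPARATION_M = 0.05;

    static boolean onField(double[] pose) {  // pose = {x, y} in meters
        return pose[0] >= 0 && pose[0] <= FIELD_LENGTH_M
            && pose[1] >= 0 && pose[1] <= FIELD_WIDTH_M;
    }

    static double dist(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /** Picks the more plausible of two candidate poses, or null if undecidable. */
    public static double[] pick(double[] candA, double[] candB, double[] lastOdometry) {
        boolean aOk = onField(candA), bOk = onField(candB);
        if (!aOk && !bOk) return null;               // reject both: impossible poses
        if (aOk != bOk) return aOk ? candA : candB;  // only one pose is on the field
        if (dist(candA, candB) < MIN_SEPARATION_M) {
            return null;                             // too close together to differentiate
        }
        double dA = dist(candA, lastOdometry), dB = dist(candB, lastOdometry);
        return dA < dB ? candA : candB;              // prefer the pose nearest odometry
    }
}
```

On a real robot the same idea would be applied to the two poses the AprilTag solver reports, before feeding the winner to the pose estimator.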

How It Works: (Short Summary)

  1. The system captures images using a camera.
  2. The detection algorithm searches for the square shape of AprilTags, decodes the binary ID, and computes the 3D pose based on the known tag size and camera parameters.

PhotonVision

PhotonVision is a free, fast, easy-to-use vision processing solution for the FIRST Robotics Competition. It is designed to get vision working on your robot quickly, and it has multi-camera support.

Installing PhotonLib

Click on the WPI icon in your VS Code window (WPILib's VS Code) or hit Ctrl+Shift+P (Cmd+Shift+P on macOS) to bring up the command palette. Type “Manage Vendor Libraries” and select the “WPILib: Manage Vendor Libraries” option. Then, select the “Install new library (online)” option.

Paste the following URL into the box that pops up:

https://maven.photonvision.org/repository/internal/org/photonvision/photonlib-json/1.0/photonlib-json-1.0.json

Hardware

Here at Vespa Robotics, the hardware we use to run PhotonVision is a Raspberry Pi 5 and an HD webcam. Wire the Pi into an aux port on the FRC radio. Do not use PoE; just plug in the USB-C power, since it is easier. (It is also possible to power the Pi through the GPIO header pins.)

What is a PhotonCamera?

PhotonCamera is a class in PhotonLib that lets users interact with one camera connected to the coprocessor running PhotonVision. Through this class, users can retrieve yaw, pitch, roll, robot-relative pose, latency, and other information. more info

Instantiating

To create a PhotonCamera instance, instantiate it like any other class:

camera = new PhotonCamera("photonvision");
camera = new PhotonCamera("FHD_Camera");

This is the current instantiation code for the HD camera.

// Required imports:
// import org.photonvision.PhotonCamera;
// import edu.wpi.first.cameraserver.CameraServer;
// import edu.wpi.first.cscore.UsbCamera;

@Override
public void robotInit() {
    // Initialize PhotonCamera
    camera = new PhotonCamera("FHD_Camera");

    // Start camera streaming (optional, for visualization)
    UsbCamera usbCamera = CameraServer.startAutomaticCapture();
    usbCamera.setResolution(352, 288);
}

About Pipelines

What is a pipeline?

A vision pipeline is a series of steps used to acquire an image, process it, and analyze it to find a target. In most FRC games, this means processing an image in order to detect an AprilTag.

Vision Pipelines

A pipeline is a sequence of steps to process the image from the camera, including:

  1. Image acquisition.
  2. Image processing (e.g., detecting AprilTags).
  3. Target analysis.
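As a toy illustration of these three stages (not PhotonVision's actual code), they can be modeled as composable steps, with the detection stage hard-coded to "find" one hypothetical tag:

```java
// Toy model of the three pipeline stages above (not PhotonVision's real code).
import java.util.List;

public class PipelineSketch {
    record Frame(int width, int height) {}                 // result of acquisition
    record Target(int fiducialId, double yawDegrees) {}    // result of analysis

    static Frame acquire() {
        return new Frame(640, 480);                        // 1. pretend camera capture
    }

    static List<Target> detect(Frame frame) {
        return List.of(new Target(7, 3.2));                // 2. pretend AprilTag detection
    }

    static Target analyze(List<Target> targets) {
        return targets.get(0);                             // 3. pick the "best" target
    }

    public static Target bestTarget() {
        return analyze(detect(acquire()));                 // acquisition -> processing -> analysis
    }
}
```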

Getting the Pipeline Result

A pipeline result is an object that contains information about all AprilTag targets the camera detects.

To get the latest result (what targets the camera detects at the time of the method being executed), simply use the .getLatestResult() method on the camera.

PhotonPipelineResult result = camera.getLatestResult();

Getting targets

To get a List (not an ArrayList nor an array) of targets, call getTargets() on the pipeline result:

List<PhotonTrackedTarget> allTargets = result.getTargets();

Alternatively, get the best target the camera detects.

PhotonTrackedTarget bestTarget = result.getBestTarget();

Getting info

Getting Target Data

Call any of these methods on the target object (copied directly from the "getting target data" documentation):

  • double getYaw()/GetYaw(): The yaw of the target in degrees (positive right).

  • double getPitch()/GetPitch(): The pitch of the target in degrees (positive up).

  • double getArea()/GetArea(): The area (how much of the camera feed the bounding box takes up) as a percent (0-100).

  • double getSkew()/GetSkew(): The skew of the target in degrees (counter-clockwise positive).

  • double[] getCorners()/GetCorners(): The 4 corners of the minimum bounding box rectangle.

  • Transform2d getCameraToTarget()/GetCameraToTarget(): The camera to target transform.

  • int getFiducialId(): Get the ID of the AprilTag.
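As one hedged illustration of using these values, a simple proportional controller can turn the robot toward a tag using getYaw(). The gain and clamp below are hypothetical tuning numbers; real code would feed the result to the drivetrain:

```java
// Proportional turn command from target yaw (hypothetical gains, sketch only).
public class AimHelper {
    static final double KP = 0.02;        // rotation output per degree of yaw error
    static final double MAX_OUTPUT = 0.4; // clamp so the robot doesn't spin too fast

    /** Converts target yaw in degrees (positive right) into a rotation command. */
    public static double turnCommand(double targetYawDegrees) {
        // Negative sign: a target to the right (positive yaw) needs a
        // clockwise (negative) rotation to bring it to the crosshair.
        double output = -KP * targetYawDegrees;
        return Math.max(-MAX_OUTPUT, Math.min(MAX_OUTPUT, output));
    }
}
```

On the robot, this value would go to the drivetrain's rotation input, e.g. something like drive.arcadeDrive(0, AimHelper.turnCommand(bestTarget.getYaw())).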

Raspberry Pi Installation

To run and use PhotonVision on a Raspberry Pi, follow these steps.

Prerequisites

Hardware

  • Raspberry Pi (3 or later; a Pi 5 is recommended for better performance)
  • Access to a monitor, keyboard, and mouse (For testing)

Software

  • PhotonVision installation image or jar file (the image is on the PhotonVision GitHub; make sure you download the image that ends in ‘-RaspberryPi.xz’)
  • Raspberry Pi Imager tool or other software to flash the OS

Access PhotonVision Web Dashboard

If PhotonVision is already running on the Pi:

Connect your computer to the same network as the Raspberry Pi. Open a web browser and go to:

http://photonvision.local:5800

Link to PhotonVision doc.

Link to Photon Vision GitHub

Other OS

In theory, it is possible to run PhotonVision on a conventional Linux computer as well. Currently, it runs on a normal laptop, but the built-in camera does not work; it may be possible to use a USB camera instead.

Camera Tuning / Input

PhotonVision’s “Input” tab contains settings that affect the image captured by the currently selected camera. This includes camera exposure and brightness, as well as resolution.

Camera Calibration

When a camera being used for PhotonVision hasn't been used in a while, it needs to be re-calibrated. This re-calibration can be done using a black-and-white checkerboard pattern.

Calibration tips for a more accurate result:

  • Ensure the images you take have the target in different positions and angles, with as large a difference between angles as possible. It is important to make sure the target overlay still lines up with the board while doing this. Tilt no more than 45 degrees.

  • Use as large a calibration target as the printer can print.

  • Ensure that your printed pattern has enough white border around it.

  • Ensure your camera doesn't move during the duration of the calibration.

  • Make sure you get all 12 images from varying distances and angles.

  • Take at least one image that covers the total image area, and generally ensure that you get even coverage of the lens with your image set.

  • Have good lighting; a diffusely lit target is best (even light on the target, without glare or shadows).

  • Ensure the calibration target is completely flat and does not bend or fold in any way. It should be mounted/taped down to something flat.

  • Avoid having targets that are parallel to the lens of the camera / straight on towards the camera as much as possible. You want angles and variations within your calibration images.

PhotonVision Colour & Shape Recognition

PhotonVision is capable of recognizing both shapes and colours.

Setting up PhotonVision with shapes and colors

Contours tab

Target Orientation: Landscape (Others may work; not tested)

Target Sort (0-4000): Changes the order in which targets are sorted

Area (0-100): The Minimum/Maximum area of the given shape

Fullness (0-66): The required fullness of the shape (the threshold of how full the shape needs to be to be considered a shape)

Speckle Rejection (4): How many speckles should be rejected

Target Shape (Circle): The shape that needs to be detected

Circle match distance (5): How close the centroid of a contour must be to the center of the circle in order for them to be matched

Max Canny Threshold (90): Sets the amount of change between pixels needed to be considered an edge

Shape Accuracy (10): How accurate a target must be in order to be detected as a shape

Radius (100): Percentage of the frame that the radius of the circle represents

Threshold Tab

Hue (58-91, Green to Light Blue): The range the hue must be in order to be detected

Saturation (133-255): The range the saturation must be in order to be detected

Value (32-207): The darkness value range of the color needs to be in order to be detected

Invert Hue (False): Inverts hue
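These three ranges act as a combined filter: a pixel is kept only when its hue, saturation, and value all fall inside the configured bands, and Invert Hue flips the hue test. A sketch of that check using the numbers above (a simplification of what PhotonVision does internally):

```java
// HSV threshold check mirroring the Threshold tab settings quoted above.
public class HsvThreshold {
    static final int HUE_MIN = 58, HUE_MAX = 91;   // green to light blue
    static final int SAT_MIN = 133, SAT_MAX = 255;
    static final int VAL_MIN = 32, VAL_MAX = 207;

    /** Returns true if the pixel survives all three range filters. */
    public static boolean passes(int hue, int sat, int val, boolean invertHue) {
        boolean hueIn = hue >= HUE_MIN && hue <= HUE_MAX;
        if (invertHue) {
            hueIn = !hueIn;  // Invert Hue keeps pixels *outside* the hue band instead
        }
        return hueIn && sat >= SAT_MIN && sat <= SAT_MAX
                     && val >= VAL_MIN && val <= VAL_MAX;
    }
}
```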

Targets

API

Warning:

NetworkTables is not a supported/viable option when using PhotonVision, as only one target is sent at a time. This is problematic when using AprilTags, which return data from multiple tags at once. We recommend using PhotonLib.

Getting Target Information

| Key | Type | Description |
| --- | --- | --- |
| rawBytes | byte[] | A byte-packed string that contains target info from the same timestamp. |
| latencyMillis | double | The latency of the pipeline in milliseconds. |
| hasTarget | boolean | Whether the pipeline is detecting targets or not. |
| targetPitch | double | The pitch of the target in degrees (positive up). |
| targetYaw | double | The yaw of the target in degrees (positive right). |
| targetArea | double | The area (percent of bounding box in screen) as a percent (0-100). |
| targetSkew | double | The skew of the target in degrees (counter-clockwise positive). |
| targetPose | double[] | The pose of the target relative to the robot (x, y, z, qw, qx, qy, qz). |
| targetPixelsX | double | The target crosshair location horizontally, in pixels (origin top-right). |
| targetPixelsY | double | The target crosshair location vertically, in pixels (origin top-right). |

Changing Settings

| Key | Type | Description |
| --- | --- | --- |
| pipelineIndex | int | Changes the pipeline index. |
| driverMode | boolean | Toggles driver mode. |

Saving Images

PhotonVision can save images to file on command. The image is saved when PhotonVision detects the command went from false to true.

PhotonVision will automatically set these back to false after 500ms.

Be careful saving images rapidly - it will slow vision processing performance and take up disk space very quickly.

Images are returned as part of the .zip package from the "Export" operation in the Settings tab.

| Key | Type | Description |
| --- | --- | --- |
| inputSaveImgCmd | boolean | Triggers saving the current input image to file. |
| outputSaveImgCmd | boolean | Triggers saving the current output image to file. |

Warning:

If you manage to make calls to these commands faster than 500ms (between calls), additional photos will not be captured.
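The 500 ms limit above amounts to a simple debounce. A minimal robot-side sketch of such a rate limiter (this class is hypothetical, not part of the PhotonVision API):

```java
// Robot-side debounce matching the 500 ms limit described above.
// Hypothetical helper class, not part of the PhotonVision API.
public class SaveRateLimiter {
    private static final long MIN_INTERVAL_MS = 500;
    private long lastSaveMillis = Long.MIN_VALUE / 2;  // "never saved yet"

    /** Returns true (and records the time) only if 500 ms have passed since the last save. */
    public boolean trySave(long nowMillis) {
        if (nowMillis - lastSaveMillis < MIN_INTERVAL_MS) {
            return false;  // too soon; this save request would be dropped anyway
        }
        lastSaveMillis = nowMillis;
        return true;
    }
}
```

Gating your inputSaveImgCmd/outputSaveImgCmd writes through something like this avoids issuing save commands that PhotonVision will ignore.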

Global Entries

These entries are global, meaning that they should be called on the main PhotonVision table.

| Key | Type | Description |
| --- | --- | --- |
| ledMode | int | Sets the LED Mode (-1: default, 0: off, 1: on, 2: blink). |

Warning:

Setting the LED mode to -1 (default) when multiple cameras are connected may result in unexpected behavior. This is a known limitation of PhotonVision. Single camera operation should work without issue.

Pipeline

Setting the Pipeline Index

You can use the setPipelineIndex()/SetPipelineIndex() (Java and C++ respectively) to dynamically change the vision pipeline from your robot program.

// Change pipeline to 2
camera.setPipelineIndex(2);

Getting the Pipeline Latency

You can also get the pipeline latency from a pipeline result using the getLatencyMillis()/GetLatency() (Java and C++ respectively) methods on a PhotonPipelineResult.

// Get the pipeline latency.
double latencySeconds = result.getLatencyMillis() / 1000.0;
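Latency matters because the image was captured in the past: to use a measurement with a pose estimator, you subtract the latency from the current time to get the capture timestamp. A minimal sketch of that arithmetic, with hypothetical names:

```java
// Converting pipeline latency into a capture timestamp, as described above.
// Names are hypothetical; WPILib pose estimators accept such a timestamp
// via addVisionMeasurement().
public class LatencyHelper {
    /** Capture time = current time minus pipeline latency (result in seconds). */
    public static double captureTimestamp(double nowSeconds, double latencyMillis) {
        return nowSeconds - latencyMillis / 1000.0;
    }
}
```

For example, a result read at t = 10.0 s with 250 ms of latency was captured at t = 9.75 s.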

Summary of AprilTags and PhotonVision

AprilTags Overview

  • AprilTags are visual tags similar to QR codes, designed for high-accuracy 3D localization with low data payloads (4–12 bits).
  • Useful for robots to determine their position and orientation on the field.
  • Developed since 2011 and rely on robust camera calibration for accurate 3D pose estimation.

Key Features

  • 3D Alignment: Calculates the camera's 3D position relative to a tag using camera parameters and field positions.
  • Ambiguity Handling: Resolves 2D-to-3D ambiguities using odometry, pose filtering, multiple cameras, or tag analysis.

PhotonVision

  • A free vision processing tool designed for FRC robots.
  • Supports multiple cameras and offers tools for detecting AprilTags.

Getting Started with PhotonVision

  1. Install PhotonLib via VS Code (WPILib).
  2. Use a Raspberry Pi 5 as the vision coprocessor.
  3. Access the PhotonVision web dashboard via http://photonvision.local:5800.

Pipeline Features

  • Processes camera images to detect tags, analyze targets, and compute pose.
  • Retrieve results and target data through methods like .getLatestResult() or .getTargets().

Camera Calibration Tips

  • Use a black-and-white checkerboard pattern, ensuring it’s flat and well-lit.
  • Take 12+ images at varying angles and distances for accurate calibration.

More info

Link to April Laboratory.

Link to FRC

Link to PhotonVision

Link to Photon Vision GitHub