# Vision
Vision is exactly what it sounds like: seeing the world. We use what the robot sees to let it understand the environment around it. This can be done with all sorts of targets: reflective tape, AprilTags, and even normal everyday objects (object recognition).
Vision is extremely helpful in letting the robot figure out where it is. This has powerful implications. Normally, human drivers are the ones assessing the robot's surroundings and controlling the robot. But with vision, this process is automated! This means we can potentially have more accurate auton, on-the-fly auto path generation during teleop, and much more.
There are two main components: hardware and software.
- Raspberry Pi: a mini computer
- Global shutter camera(s): unlike rolling shutter cameras, a global shutter camera captures the entire frame at once, so there is no distortion while moving
- PhotonVision: vision software for FRC. Does most of the complicated vision math for us (like SolvePnP, which maps 2D image points to 3D poses)
- Odometry + vision pose estimator
Hardware setup:

- Flash the PhotonVision software onto the Pi (onto its micro SD card)
  - Use balenaEtcher or Raspberry Pi Imager with an SD card reader
- Plug the cameras into the Pi
- Give the Pi power
- Connect the Pi to the roboRIO via an Ethernet cable
- Ensure PhotonVision is connected by connecting to the robot network and accessing PhotonVision at "photonvision.local:5800"
- Set the Pi to a static IP, 10.6.70.x
- Calibrate the cameras using a checkerboard
  - This lets PhotonVision estimate the shape of the lens, so vision calculations are more accurate
Software setup is made easy with `VisionSubsystemBase.Config`:
- `AprilTagFieldLayout kFieldLayout` - coordinates of each AprilTag on the field
- `String[] kCameraIDs` - camera names
- `Transform3d[] kCameraOffsets` - 3D offsets of the cameras from the robot's center, using the robot coordinate system
- filters:
  - `double kPoseAmbiguityCutOff` - amount of ambiguity to filter out
  - `int kMaxFrameFIDs` - max number of AprilTags in a frame to filter
  - `List<Set<Integer>> kPossibleFIDCombinations` - possible AprilTag combos (figure these out with your brain)
- `Map<Integer, TagCountDeviation> kVisionStdFromTagsSeen` - vision "trust values" (probability curve standard deviations) based on the number of AprilTags seen. `TagCountDeviation` is an object that contains three `UnitDeviationParams`: `x`, `y`, and `theta`.
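As a rough illustration, here is a minimal sketch of filling in these fields. The exact shape and constructor of `VisionSubsystemBase.Config` may differ, and the camera names, offsets, and filter values below are made-up placeholders (the field layout shown is the WPILib 2023 layout, used only as an example), so check MustangLib's source for the real signatures.

```java
import java.util.List;
import java.util.Set;

import edu.wpi.first.apriltag.AprilTagFieldLayout;
import edu.wpi.first.apriltag.AprilTagFields;
import edu.wpi.first.math.geometry.Rotation3d;
import edu.wpi.first.math.geometry.Transform3d;
import edu.wpi.first.math.geometry.Translation3d;
import edu.wpi.first.math.util.Units;

public class VisionConstants {
    // Official field layout shipped with WPILib (2023 game shown as an example)
    public static final AprilTagFieldLayout kFieldLayout =
            AprilTagFields.k2023ChargedUp.loadAprilTagLayoutField();

    // Names must match the camera names configured in the PhotonVision UI
    public static final String[] kCameraIDs = {"FrontCam", "BackCam"};

    // Camera positions relative to the robot center (meters, robot coordinate system)
    public static final Transform3d[] kCameraOffsets = {
            new Transform3d(
                    new Translation3d(Units.inchesToMeters(10), 0, Units.inchesToMeters(8)),
                    new Rotation3d(0, 0, 0)),
            new Transform3d(
                    new Translation3d(Units.inchesToMeters(-10), 0, Units.inchesToMeters(8)),
                    new Rotation3d(0, 0, Math.PI))
    };

    // Filters: reject ambiguous solves and frames with implausible tag counts
    public static final double kPoseAmbiguityCutOff = 0.05;
    public static final int kMaxFrameFIDs = 4;

    // Tag combinations that can actually appear together on the field (placeholder IDs)
    public static final List<Set<Integer>> kPossibleFIDCombinations =
            List.of(Set.of(1, 2, 3), Set.of(6, 7, 8));

    // kVisionStdFromTagsSeen is covered at the end of this page; its constructors
    // are assumptions, so it is omitted here.
}
```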
- Pose estimation solely through vision will have error
- If we plot a large sample of estimated poses at fixed distances to AprilTags, we get Gaussian probability distributions
- The standard deviation of these probability distributions scales exponentially with distance from the AprilTags (we can model it with an exponential function!)
- To solve this, we add encoders to the mix through a Kalman filter
- With multiple AprilTags in view, the standard deviation exponential functions vary
- More AprilTags means less error; you can expect the function to be less steep when more AprilTags are in view
- Pose estimator: a combination of vision and odometry (physical encoder values) to estimate the robot's pose (coordinate and angle)
- How the vision + odometry pose combination works (`SwerveDrivePoseEstimator::addVisionMeasurement`):
  - Step 0: If the vision measurement is old enough to be outside the pose buffer's timespan, skip.
  - Step 1: Get the pose odometry measured at the moment the vision measurement was made.
  - Step 2: Measure the twist (the difference between two poses) between the odometry pose and the vision pose.
  - Step 3: We should not trust the twist entirely, so we scale it by a Kalman gain matrix representing how much we trust vision measurements compared to our current pose estimate. (The Kalman filter is a sensor fusion / optimal estimation algorithm.)
  - Step 4: Convert back to Twist2d.
  - Step 5: Reset odometry to the state at the sample time, with the vision adjustment applied.
  - Step 6: Record the current pose to allow multiple measurements from the same timestamp.
  - Step 7: Replay odometry inputs between the sample time and the latest recorded sample to update the pose buffer and correct odometry.
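For context, here is a hedged sketch of how a drivebase might feed vision poses into WPILib's `SwerveDrivePoseEstimator`. The standard deviations and the surrounding class are placeholders for illustration, not MustangLib's actual code; only the `SwerveDrivePoseEstimator` calls themselves are real WPILib API.

```java
import edu.wpi.first.math.VecBuilder;
import edu.wpi.first.math.estimator.SwerveDrivePoseEstimator;
import edu.wpi.first.math.geometry.Pose2d;
import edu.wpi.first.math.geometry.Rotation2d;
import edu.wpi.first.math.kinematics.SwerveDriveKinematics;
import edu.wpi.first.math.kinematics.SwerveModulePosition;

public class PoseEstimationExample {
    private final SwerveDrivePoseEstimator poseEstimator;

    public PoseEstimationExample(SwerveDriveKinematics kinematics,
                                 Rotation2d initialGyroAngle,
                                 SwerveModulePosition[] initialModulePositions) {
        // State std devs (trust in odometry) and vision std devs (trust in vision)
        // for x (m), y (m), and heading (rad). These numbers are placeholders to tune,
        // or to replace with the distance-based deviations described on this page.
        poseEstimator = new SwerveDrivePoseEstimator(
                kinematics,
                initialGyroAngle,
                initialModulePositions,
                new Pose2d(),
                VecBuilder.fill(0.05, 0.05, 0.01),
                VecBuilder.fill(0.5, 0.5, 0.5));
    }

    /** Call every loop with fresh gyro + module encoder readings. */
    public void updateOdometry(Rotation2d gyroAngle, SwerveModulePosition[] modulePositions) {
        poseEstimator.update(gyroAngle, modulePositions);
    }

    /** Call whenever a camera produces a robot pose estimate (with its capture timestamp). */
    public void addVisionPose(Pose2d visionPose, double timestampSeconds) {
        // Steps 0-7 above happen inside this WPILib call.
        poseEstimator.addVisionMeasurement(visionPose, timestampSeconds);
    }

    public Pose2d getEstimatedPose() {
        return poseEstimator.getEstimatedPosition();
    }
}
```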
To ensure that our pose estimation values are accurate, we use a Kalman filter to take sensor values from vision and encoders and refine them into a better estimation of the robot's pose.
- Noisy measurements have a probability distribution in which the true state lies
- To get an optimal estimate, multiply the probability distributions together, which results in a distribution with less variance
- The optimal state estimate is the mean of the multiplied probability distributions
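As a quick illustration of why multiplying distributions helps (this is the standard 1D Gaussian-fusion identity, not anything MustangLib-specific): fusing two Gaussian measurements with means $\mu_1, \mu_2$ and variances $\sigma_1^2, \sigma_2^2$ gives

$$
\mu_{\text{fused}} = \frac{\sigma_2^2\,\mu_1 + \sigma_1^2\,\mu_2}{\sigma_1^2 + \sigma_2^2},
\qquad
\sigma_{\text{fused}}^2 = \frac{\sigma_1^2\,\sigma_2^2}{\sigma_1^2 + \sigma_2^2} \le \min(\sigma_1^2, \sigma_2^2)
$$

so the fused estimate is a variance-weighted average, and its variance is always smaller than either measurement's alone.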
- Here's a good video to watch
  - Note: the Kalman filter math is calculated in `SwerveDrivePoseEstimator::addVisionMeasurement`; if you want to see how it works, watch part 4 of the linked video
Figuring out the vision probability distribution standard deviation exponential functions for each number of possible AprilTags in view

In MustangLib, this is quantified in `VisionSubsystemBase` as `UnitDeviationParams(distanceMultiplier, eulerMultiplier, minimum)`:

- $A$ - eulerMultiplier
- $B$ - distanceMultiplier
- $\text{minimum}$ - deviation cutoff (if the computed deviation is too small, default to the minimum)
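Putting those parameters together, the standard deviation as a function of the average distance $d$ to the tags is presumably modeled as an exponential with a floor. This reconstruction of the formula is an assumption based on the parameter names above; check `UnitDeviationParams` in MustangLib for the exact computation.

$$
\sigma(d) = \max\left(A \cdot e^{B\,d},\ \text{minimum}\right)
$$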
1. Plot multiple estimated-pose points with n AprilTag(s) seen at the same average distance to the tag(s), and try to include multiple angles. Plot separate graphs for x, y, and theta.
2. Use a graphing calculator to find the standard deviation of these points for x, y, and theta (the probability curve standard deviation).
3. Repeat steps 1 and 2 at multiple average distances to the n seen AprilTag(s).
4. Estimate the standard deviation functions (x, y, and theta) through exponential regression (see the note after this list).
5. Record the functions and the number of AprilTags each corresponds to.
6. Repeat this process for each tag count (with 1 AprilTag, 2 AprilTags, 3 AprilTags, ... n AprilTags seen).
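One way to do the exponential regression step (a generic technique, not something prescribed by MustangLib): fit $\sigma(d) = A e^{B d}$ by taking logarithms so it becomes a linear fit,

$$
\ln \sigma = \ln A + B\,d,
$$

and an ordinary least-squares line through the points $(d_i, \ln \sigma_i)$ then gives $B$ as the slope and $A = e^{\text{intercept}}$. Most graphing calculators and spreadsheets also have a built-in exponential regression that does this directly.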
In `VisionSubsystemBase`, these deviation functions are stored as a `Map`, with the key being the number of tags seen and the value being a `TagCountDeviation`.
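For illustration, the map might be filled in roughly like this. The `TagCountDeviation` and `UnitDeviationParams` constructors and all numeric values below are assumptions based on the parameter order described above, not real calibration data, so check MustangLib's actual classes before copying anything.

```java
import java.util.Map;

// Assumed constructors: UnitDeviationParams(distanceMultiplier, eulerMultiplier, minimum)
// and TagCountDeviation(x, y, theta). Numbers are placeholders from a hypothetical
// regression, not measured values.
Map<Integer, TagCountDeviation> kVisionStdFromTagsSeen = Map.of(
        // One tag in view: noisier, steeper exponential
        1, new TagCountDeviation(
                new UnitDeviationParams(0.25, 0.40, 0.10),  // x
                new UnitDeviationParams(0.25, 0.40, 0.10),  // y
                new UnitDeviationParams(0.30, 0.50, 0.15)), // theta
        // Two or more tags: flatter curve, lower minimum
        2, new TagCountDeviation(
                new UnitDeviationParams(0.15, 0.20, 0.05),
                new UnitDeviationParams(0.15, 0.20, 0.05),
                new UnitDeviationParams(0.20, 0.30, 0.08)));
```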