# Vision
Vision is exactly what it sounds like: seeing the world. We use what the robot sees to let it understand the environment around it. This can be done with all sorts of targets: reflective tape, AprilTags, and even normal everyday objects (object recognition).
Vision is extremely helpful in letting the robot figure out where it is. This has powerful implications. Normally, human drivers are the ones assessing the robot's surroundings and controlling the robot. But with vision, this process is automated! This means we can potentially have more accurate auton, on-the-fly auto path generation during teleop, and much more.
There are two main components: hardware and software.
- Raspberry Pi: a mini computer
- Global shutter camera(s): unlike rolling shutter cameras, a global shutter camera captures the entire frame at once, so there is no distortion while moving
- PhotonVision: vision software for FRC. Does most of the complicated vision math for us (like SolvePnP, which maps 2D image points to 3D poses)
- Odometry + vision pose estimator
Hardware setup:

- Flash the PhotonVision software onto the Pi (onto its micro SD card)
  - Use balenaEtcher or Raspberry Pi Imager with an SD card reader
- Plug the cameras into the Pi
- Give the Pi power
- Connect the Pi to the roboRIO via an Ethernet cable
- Ensure PhotonVision is connected by connecting to the robot network and accessing PhotonVision at "photonvision.local:5800"
- Set the Pi to a static IP, 10.6.70.x
- Calibrate the cameras using a checkerboard
  - This lets PhotonVision estimate the shape of the lens, so vision calculations are more accurate
Software setup is made easy with `VisionSubsystemBase.Config`:
- `AprilTagFieldLayout kFieldLayout` - coordinates of each AprilTag on the field
- `String[] kCameraIDs` - camera names
- `Transform3d[] kCameraOffsets` - 3D offsets of the cameras from the robot's center, using the robot coordinate system
- filters:
  - `double kPoseAmbiguityCutOff` - amount of ambiguity to filter out
  - `int kMaxFrameFIDs` - max number of AprilTags in a frame to filter
  - `List<Set<Integer>> kPossibleFIDCombinations` - possible AprilTag combos (figure these out with your brain)
- `Map<Integer, TagCountDeviation> kVisionStdFromTagsSeen` - vision "trust values" (probability curve standard deviations) based on the number of AprilTags seen. `TagCountDeviation` is an object that contains three `UnitDeviationParams`: `x`, `y`, and `theta`.
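As a rough illustration, here is a minimal sketch of filling in these fields. The exact shape and constructor of `VisionSubsystemBase.Config` may differ, and the camera names, offsets, and filter values below are made-up placeholders (the field layout shown is the WPILib 2023 layout, used only as an example), so check MustangLib's source for the real signatures.

```java
import java.util.List;
import java.util.Set;

import edu.wpi.first.apriltag.AprilTagFieldLayout;
import edu.wpi.first.apriltag.AprilTagFields;
import edu.wpi.first.math.geometry.Rotation3d;
import edu.wpi.first.math.geometry.Transform3d;
import edu.wpi.first.math.geometry.Translation3d;
import edu.wpi.first.math.util.Units;

public class VisionConstants {
    // Official field layout shipped with WPILib (2023 game shown as an example)
    public static final AprilTagFieldLayout kFieldLayout =
            AprilTagFields.k2023ChargedUp.loadAprilTagLayoutField();

    // Names must match the camera names configured in the PhotonVision UI
    public static final String[] kCameraIDs = {"FrontCam", "BackCam"};

    // Camera positions relative to the robot center (meters, robot coordinate system)
    public static final Transform3d[] kCameraOffsets = {
            new Transform3d(
                    new Translation3d(Units.inchesToMeters(10), 0, Units.inchesToMeters(8)),
                    new Rotation3d(0, 0, 0)),
            new Transform3d(
                    new Translation3d(Units.inchesToMeters(-10), 0, Units.inchesToMeters(8)),
                    new Rotation3d(0, 0, Math.PI))
    };

    // Filters: reject ambiguous solves and frames with implausible tag counts
    public static final double kPoseAmbiguityCutOff = 0.05;
    public static final int kMaxFrameFIDs = 4;

    // Tag combinations that can actually appear together on the field (placeholder IDs)
    public static final List<Set<Integer>> kPossibleFIDCombinations =
            List.of(Set.of(1, 2, 3), Set.of(6, 7, 8));

    // kVisionStdFromTagsSeen is covered at the end of this page; its constructors
    // are assumptions, so it is omitted here.
}
```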
- Pose estimation solely through vision will have error
- If we plot a large sample of estimated poses at fixed distances to AprilTags, we get Gaussian probability distributions
- The standard deviation of these probability distributions scales exponentially with distance from the AprilTags (we can model it with an exponential function!)
- To solve this, we add encoders to the mix through a Kalman filter
- With multiple AprilTags in view, the standard deviation exponential functions vary
- More AprilTags means less error; you can expect the function to be less steep when more AprilTags are in view
- Pose estimator: a combination of vision and odometry (physical encoder values) to estimate the robot's pose (coordinate and angle)
- How the vision + odometry pose combination works (`SwerveDrivePoseEstimator::addVisionMeasurement`):
  - Step 0: If the vision measurement is old enough to be outside the pose buffer's timespan, skip.
  - Step 1: Get the pose odometry measured at the moment the vision measurement was made.
  - Step 2: Measure the twist (the difference between two poses) between the odometry pose and the vision pose.
  - Step 3: We should not trust the twist entirely, so we scale it by a Kalman gain matrix representing how much we trust vision measurements compared to our current pose estimate. (The Kalman filter is a sensor fusion / optimal estimation algorithm.)
  - Step 4: Convert back to Twist2d.
  - Step 5: Reset odometry to the state at the sample time, with the vision adjustment applied.
  - Step 6: Record the current pose to allow multiple measurements from the same timestamp.
  - Step 7: Replay odometry inputs between the sample time and the latest recorded sample to update the pose buffer and correct odometry.
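For context, here is a hedged sketch of how a drivebase might feed vision poses into WPILib's `SwerveDrivePoseEstimator`. The standard deviations and the surrounding class are placeholders for illustration, not MustangLib's actual code; only the `SwerveDrivePoseEstimator` calls themselves are real WPILib API.

```java
import edu.wpi.first.math.VecBuilder;
import edu.wpi.first.math.estimator.SwerveDrivePoseEstimator;
import edu.wpi.first.math.geometry.Pose2d;
import edu.wpi.first.math.geometry.Rotation2d;
import edu.wpi.first.math.kinematics.SwerveDriveKinematics;
import edu.wpi.first.math.kinematics.SwerveModulePosition;

public class PoseEstimationExample {
    private final SwerveDrivePoseEstimator poseEstimator;

    public PoseEstimationExample(SwerveDriveKinematics kinematics,
                                 Rotation2d initialGyroAngle,
                                 SwerveModulePosition[] initialModulePositions) {
        // State std devs (trust in odometry) and vision std devs (trust in vision)
        // for x (m), y (m), and heading (rad). These numbers are placeholders to tune,
        // or to replace with the distance-based deviations described on this page.
        poseEstimator = new SwerveDrivePoseEstimator(
                kinematics,
                initialGyroAngle,
                initialModulePositions,
                new Pose2d(),
                VecBuilder.fill(0.05, 0.05, 0.01),
                VecBuilder.fill(0.5, 0.5, 0.5));
    }

    /** Call every loop with fresh gyro + module encoder readings. */
    public void updateOdometry(Rotation2d gyroAngle, SwerveModulePosition[] modulePositions) {
        poseEstimator.update(gyroAngle, modulePositions);
    }

    /** Call whenever a camera produces a robot pose estimate (with its capture timestamp). */
    public void addVisionPose(Pose2d visionPose, double timestampSeconds) {
        // Steps 0-7 above happen inside this WPILib call.
        poseEstimator.addVisionMeasurement(visionPose, timestampSeconds);
    }

    public Pose2d getEstimatedPose() {
        return poseEstimator.getEstimatedPosition();
    }
}
```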
To ensure that our pose estimation values are accurate, we use a Kalman filter to take sensor values from vision and encoders and refine them into a better estimation of the robot's pose.
- Noisy measurements have a probability distribution in which the true state lies
- To get an optimal estimate, multiply the probability distributions together, which results in a distribution with less variance
- The optimal state estimate is the mean of the multiplied probability distributions
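As a quick illustration of why multiplying distributions helps (this is the standard 1D Gaussian-fusion identity, not anything MustangLib-specific): fusing two Gaussian measurements with means $\mu_1, \mu_2$ and variances $\sigma_1^2, \sigma_2^2$ gives

$$
\mu_{\text{fused}} = \frac{\sigma_2^2\,\mu_1 + \sigma_1^2\,\mu_2}{\sigma_1^2 + \sigma_2^2},
\qquad
\sigma_{\text{fused}}^2 = \frac{\sigma_1^2\,\sigma_2^2}{\sigma_1^2 + \sigma_2^2} \le \min(\sigma_1^2, \sigma_2^2)
$$

so the fused estimate is a variance-weighted average, and its variance is always smaller than either measurement's alone.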
- Here's a good video to watch
  - Note: the Kalman filter math is calculated in `SwerveDrivePoseEstimator::addVisionMeasurement`; if you want to see how it works, watch part 4 of the linked video
Figuring out the vision probability distribution standard deviation exponential functions for each number of possible AprilTags in view

In MustangLib, this is quantified in `VisionSubsystemBase` as `UnitDeviationParams(distanceMultiplier, eulerMultiplier, minimum)`:

- $A$ - eulerMultiplier
- $B$ - distanceMultiplier
- $\text{minimum}$ - deviation cutoff (if the computed deviation is too small, default to the minimum)
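Putting those parameters together, the standard deviation as a function of the average distance $d$ to the tags is presumably modeled as an exponential with a floor. This reconstruction of the formula is an assumption based on the parameter names above; check `UnitDeviationParams` in MustangLib for the exact computation.

$$
\sigma(d) = \max\left(A \cdot e^{B\,d},\ \text{minimum}\right)
$$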
1. Plot multiple estimated-pose points with n AprilTag(s) seen at the same average distance to the tag(s), and try to include multiple angles. Plot separate graphs for x, y, and theta.
2. Use a graphing calculator to find the standard deviation of these points for x, y, and theta (the probability curve standard deviation).
3. Repeat steps 1 and 2 at multiple average distances to the n seen AprilTag(s).
4. Estimate the standard deviation functions (x, y, and theta) through exponential regression (see the note after this list).
5. Record the functions and the number of AprilTags each corresponds to.
6. Repeat this process for each tag count (with 1 AprilTag, 2 AprilTags, 3 AprilTags, ... n AprilTags seen).
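One way to do the exponential regression step (a generic technique, not something prescribed by MustangLib): fit $\sigma(d) = A e^{B d}$ by taking logarithms so it becomes a linear fit,

$$
\ln \sigma = \ln A + B\,d,
$$

and an ordinary least-squares line through the points $(d_i, \ln \sigma_i)$ then gives $B$ as the slope and $A = e^{\text{intercept}}$. Most graphing calculators and spreadsheets also have a built-in exponential regression that does this directly.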
In `VisionSubsystemBase`, these deviation functions are stored as a `Map`, with the key being the number of tags seen and the value being a `TagCountDeviation`.
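For illustration, the map might be filled in roughly like this. The `TagCountDeviation` and `UnitDeviationParams` constructors and all numeric values below are assumptions based on the parameter order described above, not real calibration data, so check MustangLib's actual classes before copying anything.

```java
import java.util.Map;

// Assumed constructors: UnitDeviationParams(distanceMultiplier, eulerMultiplier, minimum)
// and TagCountDeviation(x, y, theta). Numbers are placeholders from a hypothetical
// regression, not measured values.
Map<Integer, TagCountDeviation> kVisionStdFromTagsSeen = Map.of(
        // One tag in view: noisier, steeper exponential
        1, new TagCountDeviation(
                new UnitDeviationParams(0.25, 0.40, 0.10),  // x
                new UnitDeviationParams(0.25, 0.40, 0.10),  // y
                new UnitDeviationParams(0.30, 0.50, 0.15)), // theta
        // Two or more tags: flatter curve, lower minimum
        2, new TagCountDeviation(
                new UnitDeviationParams(0.15, 0.20, 0.05),
                new UnitDeviationParams(0.15, 0.20, 0.05),
                new UnitDeviationParams(0.20, 0.30, 0.08)));
```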