Old Vision System

NOTE: This page describes the old vision system (pre-summer 2015).

The Northern Bites vision system consists of a sequence of steps. The individual steps can be quite complex and have their own pages. The main vision loop described here takes place mostly in Threshold.cpp, which can be found in src/man/vision/.

  1. Transfer the image from the camera.
  2. Perform Color Segmentation on the raw image while simultaneously scanning for edge points for the Hough Transform (see the segmentation sketch after this list).
  3. Find the Field Horizon and the Convex Hull of the field by looking for GREEN pixels (see the convex-hull sketch after this list).
  4. Perform the Hough Transform to find lines, limiting the search to the area under the convex hull (see the Hough sketch after this list).
  5. Intersect the lines to find field corners (see the intersection sketch after this list).
  6. Scan up vertically from the convex hull to look for Goal Posts.
  7. Scan down vertically from the convex hull to find The Ball, Field Crosses, and Robots (and Other Vision Stuff). Note: we plan to switch to a horizontal scan.
  8. Perform Object Recognition.
  9. Determine localization information for each object (see the distance/bearing sketch after this list):
     - Distance - the distance in cm to the vision object.
     - Bearing - the relative angle in radians to the vision object. (NEED TO KNOW SIGN)
  10. If we have not seen a ball in the top image, we then scan the bottom image to find The Ball.
  11. If we have seen a goal and it is cut off by the bottom of the screen in the top camera, we reset the goal information and instead look for it in the bottom camera (we do not do this if we see a 2nd goal). The reason is that by seeing the bottom of the goal we can get a reasonable distance estimate using pixel-estimate (pose-based) distances.
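
Step 2 relies on a precomputed color lookup table. Below is a minimal sketch of how table-based segmentation of a packed YUV422 image typically works; the enum, table quantization, and function names are illustrative, not the actual nbites code.

```cpp
#include <cstdint>
#include <vector>

// Illustrative color labels; the real system defines its own set.
enum Color : uint8_t { UNDEFINED = 0, GREEN, WHITE, ORANGE, YELLOW };

// Hypothetical lookup table: one label per (Y, U, V) cell, quantized to
// 7 bits per channel. The real table is built offline by a calibration tool.
struct ColorTable {
    std::vector<uint8_t> labels = std::vector<uint8_t>(128 * 128 * 128, UNDEFINED);
    uint8_t lookup(uint8_t y, uint8_t u, uint8_t v) const {
        return labels[((y >> 1) * 128 + (u >> 1)) * 128 + (v >> 1)];
    }
};

// Label every pixel of a packed YUV422 (YUYV) image. Assumes an even image
// width, since two horizontal neighbours share one U and one V byte.
void segmentImage(const uint8_t* yuv, int width, int height,
                  const ColorTable& table, uint8_t* labelsOut) {
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            int pairOffset = (row * width + col) / 2 * 4;   // Y0 U Y1 V
            uint8_t y = yuv[pairOffset + (col % 2 == 0 ? 0 : 2)];
            uint8_t u = yuv[pairOffset + 1];
            uint8_t v = yuv[pairOffset + 3];
            labelsOut[row * width + col] = table.lookup(y, u, v);
        }
    }
}
```

A table quantized this way is only 2 MB and makes per-pixel classification a single memory read, which is why the approach is fast despite the calibration burden noted under limitations below.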
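
Step 3's convex hull can be built with a single monotone-chain pass over the per-column field-edge points. The sketch below shows the general technique under that assumption; the column scan that finds GREEN runs is omitted, and the step size and run-length thresholds of the real code are not shown.

```cpp
#include <vector>

struct Point { int x, y; };

// Signed turn direction of the path o -> a -> b (image coordinates, y down).
static long cross(const Point& o, const Point& a, const Point& b) {
    return long(a.x - o.x) * (b.y - o.y) - long(a.y - o.y) * (b.x - o.x);
}

// fieldTop holds, left to right, the topmost GREEN point found in each
// scanned column (the per-column field edge). The upper convex hull of
// these points bounds the field from above; everything below it is treated
// as field.
std::vector<Point> fieldConvexHull(const std::vector<Point>& fieldTop) {
    std::vector<Point> hull;
    for (const Point& p : fieldTop) {
        // Pop the last hull point while it lies on or below the segment from
        // its predecessor to p; the remaining chain stays convex and no
        // field-edge point ends up above it.
        while (hull.size() >= 2 &&
               cross(hull[hull.size() - 2], hull.back(), p) <= 0) {
            hull.pop_back();
        }
        hull.push_back(p);
    }
    return hull;
}
```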
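
Step 4's Hough Transform accumulates votes in a (theta, rho) grid and reads off well-supported bins as candidate lines. The sketch below is the textbook version with illustrative bin sizes and vote threshold; the real code restricts voting to edge points under the convex hull (step 4) and is considerably more involved.

```cpp
#include <cmath>
#include <utility>
#include <vector>

struct EdgePoint { int x, y; };

// Vote every edge point into a (theta, rho) accumulator and return the bins
// whose vote count reaches minVotes as candidate lines.
std::vector<std::pair<double, double>>
houghLines(const std::vector<EdgePoint>& edges, int width, int height,
           int minVotes = 60) {
    const int thetaBins = 180;                       // 1 degree per bin
    const double maxRho = std::sqrt(double(width) * width + double(height) * height);
    const int rhoShift = int(std::ceil(maxRho));     // rho can be negative
    const int rhoBins = 2 * rhoShift + 1;
    std::vector<int> acc(thetaBins * rhoBins, 0);

    for (const EdgePoint& p : edges) {
        for (int t = 0; t < thetaBins; ++t) {
            double theta = t * M_PI / thetaBins;
            double rho = p.x * std::cos(theta) + p.y * std::sin(theta);
            int r = int(std::lround(rho)) + rhoShift;
            ++acc[t * rhoBins + r];
        }
    }

    std::vector<std::pair<double, double>> lines;    // (theta, rho) peaks
    for (int t = 0; t < thetaBins; ++t) {
        for (int r = 0; r < rhoBins; ++r) {
            if (acc[t * rhoBins + r] >= minVotes) {
                lines.emplace_back(t * M_PI / thetaBins, double(r - rhoShift));
            }
        }
    }
    return lines;
}
```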
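
Step 5 reduces to solving a 2x2 linear system for each pair of candidate lines in normal form. This is only the raw geometry; the real corner detection would also have to check that the intersection lies on or near both observed segments.

```cpp
#include <cmath>

// Intersect two lines given in normal (theta, rho) form:
//     x * cos(theta) + y * sin(theta) = rho
// Returns false if the lines are (nearly) parallel.
bool intersectLines(double theta1, double rho1, double theta2, double rho2,
                    double& x, double& y) {
    double c1 = std::cos(theta1), s1 = std::sin(theta1);
    double c2 = std::cos(theta2), s2 = std::sin(theta2);
    double det = c1 * s2 - c2 * s1;
    if (std::fabs(det) < 1e-6) {
        return false;                                // parallel or identical
    }
    x = (rho1 * s2 - rho2 * s1) / det;
    y = (c1 * rho2 - c2 * rho1) / det;
    return true;
}
```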
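
Step 9's pose-based estimates come from intersecting a pixel's viewing ray with the ground plane. The sketch below assumes an ideal pinhole camera at a known height and pitch over a flat field; the parameter names, the off-axis correction, and the bearing sign convention are arbitrary choices here, not the conventions of the real kinematics code.

```cpp
#include <cmath>

struct GroundEstimate {
    double distanceCm;   // distance along the ground to the object
    double bearingRad;   // positive to the left of the image centre (arbitrary choice)
};

// Estimate distance and bearing to an image point (px, py) by intersecting
// its viewing ray with the ground plane, given the camera height above the
// ground and its downward pitch. The real system uses the full kinematic
// chain of the robot rather than two fixed parameters.
GroundEstimate estimateFromPixel(double px, double py,
                                 double imgWidth, double imgHeight,
                                 double focalLengthPx,
                                 double cameraHeightCm,
                                 double cameraPitchRad) {
    // Angular offsets of the pixel's ray from the optical axis.
    double yawOffset   = std::atan2(imgWidth / 2.0 - px, focalLengthPx);
    double pitchOffset = std::atan2(py - imgHeight / 2.0, focalLengthPx);

    double rayPitch = cameraPitchRad + pitchOffset;  // total downward angle
    if (rayPitch <= 0.0) {
        return { -1.0, yawOffset };                  // ray never hits the ground
    }

    double forwardDistance = cameraHeightCm / std::tan(rayPitch);
    // First-order correction for points off the optical axis.
    return { forwardDistance / std::cos(yawOffset), yawOffset };
}
```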

Limitations and ideas for improvement

1. Instead of slow vertical scans, scan the field horizontally. We can also take advantage of the fact that items nearer to us are larger (and therefore our scan resolution can be coarser).
2. We only look for robots below the convex hull. Looking for white, rather than for colored bands, may work better.
3. We do not see partially occluded goal posts. In conjunction with field lines, we could overcome this limitation in many situations.
4. Color Segmentation, while useful, has lots of problems - it is imprecise, requires laborious calibration, etc. We are trying to move increasingly to edge-based approaches.
5. We are purely sensor driven, which is to say we don't use any memory or prior knowledge to help. This has many advantages, but ultimately it slows things down and makes us less precise than we could be.
6. Our distance estimates are based either on size in the image (e.g. the width of a post) or on pose information (the position and geometry of the camera). Both have their problems. Pose works better, but small errors in pose lead to increasingly large errors as distance increases. We can help by using the nearest objects to estimate distance to farther objects. (See the size-based distance sketch below.)
7. We should switch over entirely to a resolution-pyramid scheme. (See the pyramid sketch below.)
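
Item 6 contrasts size-based and pose-based estimates (a pose-based sketch appears above). A minimal size-based sketch, assuming an ideal pinhole camera whose focal length is derived from the horizontal field of view; names and parameters are illustrative.

```cpp
#include <cmath>

// Size-based distance estimate: for an object of known real-world width,
// distance = realWidth * focalLength / apparentWidth. The focal length in
// pixels follows from the horizontal field of view of an ideal pinhole
// camera; all parameters are supplied by the caller.
double distanceFromApparentWidth(double apparentWidthPx,
                                 double realWidthCm,
                                 double imageWidthPx,
                                 double horizontalFovRad) {
    double focalLengthPx = (imageWidthPx / 2.0) / std::tan(horizontalFovRad / 2.0);
    return realWidthCm * focalLengthPx / apparentWidthPx;
}
```

Size-based estimates degrade quickly when the object is partially occluded or its edges are blurred, which is part of why pose-based estimates tend to work better, as noted in item 6.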
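
For item 7, one plausible building block is an image pyramid produced by repeated downsampling; coarse levels could drive the broad field scan while fine levels refine candidate objects. The level count and the simple 2x2 averaging below are illustrative choices, not a committed design.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Build a resolution pyramid by repeated 2x2 averaging of a single-channel
// image. pyramid[0] is the full-resolution image; each further level halves
// the width and height.
std::vector<std::vector<uint8_t>> buildPyramid(const std::vector<uint8_t>& image,
                                               int width, int height,
                                               int levels = 3) {
    std::vector<std::vector<uint8_t>> pyramid = { image };
    int w = width, h = height;
    for (int level = 1; level < levels && w >= 2 && h >= 2; ++level) {
        const std::vector<uint8_t>& prev = pyramid.back();
        std::vector<uint8_t> next((w / 2) * (h / 2));
        for (int y = 0; y < h / 2; ++y) {
            for (int x = 0; x < w / 2; ++x) {
                // Average the 2x2 block of the previous level.
                int sum = prev[(2 * y) * w + 2 * x] + prev[(2 * y) * w + 2 * x + 1]
                        + prev[(2 * y + 1) * w + 2 * x] + prev[(2 * y + 1) * w + 2 * x + 1];
                next[y * (w / 2) + x] = uint8_t(sum / 4);
            }
        }
        pyramid.push_back(std::move(next));
        w /= 2;
        h /= 2;
    }
    return pyramid;
}
```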