Object Detection Methods - OSU-AIMS/tic-tac-toe GitHub Wiki

Contour Detection:

  • Works well when objects have distinct shapes
  • A common method of object detection in static images (a minimal sketch follows below).
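
A minimal sketch of contour-based detection, assuming a thresholded image of the board and using cv2.minAreaRect() to recover a center and orientation (the file name and threshold value are placeholders):

import cv2

# Hypothetical input path - replace with an actual board image
image = cv2.imread("tic_tac_toe_images/board.tiff")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Threshold so the board appears as a bright blob on a dark background
_, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# findContours() returns (contours, hierarchy) in OpenCV 4.x
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Fit a rotated rectangle to the largest contour to get its center and angle
largest = max(contours, key=cv2.contourArea)
(center_x, center_y), (w, h), angle = cv2.minAreaRect(largest)
print("center:", (center_x, center_y), "angle:", angle)

Note that cv2.minAreaRect() only reports the angle within a 90-degree range, which is the square-orientation ambiguity described later on this page.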

Pick & Place of Rectangle Box:

  • Used to detect the location of the rectangular box to obtain center and orientation
  • Eventually adapted into the Tic-Tac-Toe project

Tic-Tac-Toe:

  • Initially used contour detection for board orientation
  • Worked well overall but had trouble with orientation --> moved to using matchTemplate()

Complications with Contour Detection:

  • If the object is square, rotating it by 45 degrees makes the reported orientation ambiguous
  • Even slightly obscuring part of the contour disrupts detection

Documentation for Contour Detection:

References:

MatchTemplate():

Based on GitHub issue: RGB Image Kernel-Based Board Pose Detection

Note for use:

  • matchTemplate() works well for objects whose size, orientation, and camera focal length do not change.
  • Static Images/Environments = Good Detection
  • matchTemplate() is extremely sensitive to rotation, viewing angle, and scale.
  • Dynamic Environments = Not-so-good detection without other additions to make it more robust
  • The input image and the template must have the same type and number of channels; if the input image is RGB while the template is grayscale, matchTemplate() won't work (see the small conversion sketch below)
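
A small sketch of handling that channel mismatch, assuming placeholder file names; the input is converted to grayscale so it matches a grayscale template:

import cv2

image = cv2.imread("input.tiff")                              # 3-channel BGR
template = cv2.imread("template.tiff", cv2.IMREAD_GRAYSCALE)  # single channel

# matchTemplate() needs the same type/channel count for both inputs,
# so convert the input (or load the template in color) before matching
if image.ndim == 3 and template.ndim == 2:
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

res = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)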

Computer Environment:

  • Ubuntu 18.04
  • Python 2.7
  • OpenCV: 4.2.0

Important Documents:

I used matchTemplate() with a cropped image as the kernel; I was not able to get it to work with an array as the kernel.


From imread() docs: In the case of color images, the decoded images will have the channels stored in B G R order. By default, the number of pixels must be less than 2^30. Limit can be set using system variable OPENCV_IO_MAX_IMAGE_PIXELS

Process for using matchTemplate() with a STATIC image:

  • Read the image using cv2.imread()
  • Pass the image and the kernel (template) into cv2.matchTemplate()
  • matchTemplate() outputs a heatmap of match scores

Code:
# Imports needed by this snippet
import cv2
from os.path import dirname, abspath, join

# Read Image into script
image = cv2.imread("tic_tac_toe_images/twistCorrectedColoredSquares_Color.tiff")

# Creating Kernels from cropped images
CWD = dirname(abspath(__file__)) # directory containing this script
RESOURCES = join(CWD,'tic_tac_toe_images') # combine script location with folder name
# blue_square = 'blue_square_crop.tiff'  - Used for Static Image and other images at the same depth & focal Length
blue_square = 'in-lab_straight_blue_square_crop.tiff'
kernel_b = cv2.imread(join(RESOURCES,blue_square)) # combine folder name with picture name inside folder

# MatchTemplate()
res_B = cv2.matchTemplate(image=image,templ=kernel_b,method=5)
cv2.imwrite('res_match_template_B.tiff',res_B)
min_val_B, max_val_B, min_loc_B, max_loc_B = cv2.minMaxLoc(res_B)
print('min_loc_B')
print(min_loc_B)
print('max_loc_B')
print(max_loc_B)

# Drawing Bounding Box around detected shape
# determine the starting and ending (x, y)-coordinates of the bounding box
# From: https://www.pyimagesearch.com/2021/03/22/opencv-template-matching-cv2-matchtemplate/
(startX_B, startY_B) = max_loc_B
endX_B = startX_B + kernel_b.shape[1]
endY_B = startY_B + kernel_b.shape[0]

# draw the bounding box on the image (same process for green & red boxes)
b_box_image = cv2.rectangle(image, (startX_B, startY_B), (endX_B, endY_B), (255, 0, 0), 4) # BGR for openCV
# show the output image
# cv2.imshow("Output based on matchTemplate", b_box_image)
cv2.imwrite('res_match_template_Blue_BoundingBox.tiff', b_box_image)
'''
    cv::TemplateMatchModes 
    cv::TM_SQDIFF = 0,
    cv::TM_SQDIFF_NORMED = 1,
    cv::TM_CCORR = 2,
    cv::TM_CCORR_NORMED = 3,
    cv::TM_CCOEFF = 4,
    cv::TM_CCOEFF_NORMED = 5
'''
  • The same procedure is used to detect the red and green squares; just change the kernel path to the cropped red or green square image
  • From what I can gather, the methods have varying degrees of sensitivity; both method 4 (TM_CCOEFF) and method 5 (TM_CCOEFF_NORMED) work in the straight and angled orientations
  • minMaxLoc() returns the locations of the lowest and highest match scores; for methods 4 and 5, max_loc is the top-left corner of the best-match bounding box --> add the kernel width and height to get the bottom-right corner, then use the box centers to get the center of each square and the orientation of the entire board (a small sketch follows below)
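
A rough sketch of that last step, assuming max_loc_G/max_loc_R and kernel_g/kernel_r came from running the same matchTemplate() calls on the green and red crops, and assuming the blue and green squares share a board edge (the axis convention here is an illustration, not the project's exact one):

import math

def bbox_center(max_loc, kernel):
    # max_loc is the top-left corner of the best match; offset by half the kernel size
    x, y = max_loc
    return (x + kernel.shape[1] / 2.0, y + kernel.shape[0] / 2.0)

center_b = bbox_center(max_loc_B, kernel_b)
center_g = bbox_center(max_loc_G, kernel_g)
center_r = bbox_center(max_loc_R, kernel_r)

# Board center approximated as the mean of the three detected square centers
board_center = ((center_b[0] + center_g[0] + center_r[0]) / 3.0,
                (center_b[1] + center_g[1] + center_r[1]) / 3.0)

# Orientation (in degrees) from the vector between two squares on the same edge
angle = math.degrees(math.atan2(center_g[1] - center_b[1],
                                center_g[0] - center_b[0]))
print(board_center, angle)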

Process for using matchTemplate() with a camera feed through ROS:

  • For Tic-Tac-Toe, we created a ROS node because we needed to output the center of each square in order to draw the x-y axes and get the board orientation (a sketch of the image callback follows below)
import rospy
import tf2_ros
import cv2
from sensor_msgs.msg import Image
from geometry_msgs.msg import TransformStamped

print("Your OpenCV version is: " + cv2.__version__)

# Initialize a Node:
rospy.init_node('Colored_Square_Detect', anonymous=False)
rospy.loginfo(">> Colored Square Detect Node Successfully Created")

# Setup Publishers
pub_center = rospy.Publisher("ttt_board_origin", TransformStamped, queue_size=20)

# Setup Listeners
tfBuffer = tf2_ros.Buffer()
listener = tf2_ros.TransformListener(tfBuffer)

# Create subscriber to the ROS Image topic - pulled from kernel_color_detect
# 'runner' is the image callback (sketched below)
image_sub = rospy.Subscriber("/camera/color/image_raw", Image, runner)
  • Ran the launch file for the camera (t2_vision) along with the script for Colored Square Detect
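
A hedged sketch of what the runner image callback might look like; the real node in the repository is more involved. It assumes cv_bridge is available, reuses kernel_b and pub_center from the setup above, and publishes pixel coordinates as a stand-in for a real board pose (the frame name is an assumption):

from cv_bridge import CvBridge
import cv2
import rospy
from geometry_msgs.msg import TransformStamped

bridge = CvBridge()

def runner(msg):
    # Convert the ROS Image message to an OpenCV BGR array
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")

    # Same matchTemplate() call as in the static-image example
    res = cv2.matchTemplate(frame, kernel_b, cv2.TM_CCOEFF_NORMED)
    _, _, _, max_loc = cv2.minMaxLoc(res)

    # Publish the detected square center (pixel coordinates used here for illustration)
    tf_msg = TransformStamped()
    tf_msg.header.stamp = rospy.Time.now()
    tf_msg.header.frame_id = "camera_color_optical_frame"  # assumed frame name
    tf_msg.transform.translation.x = max_loc[0] + kernel_b.shape[1] / 2.0
    tf_msg.transform.translation.y = max_loc[1] + kernel_b.shape[0] / 2.0
    tf_msg.transform.rotation.w = 1.0
    pub_center.publish(tf_msg)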

Making Template Matching More Robust:

  • Template matching is sensitive to changes in lighting, scale, rotation, etc., which isn't ideal for object detection in dynamic environments
  • There are various methods for making template matching more robust to fluctuations in the environment

SIFT:

  • Patented - not free for commercial use
  • Scale Invariant Feature Transform - Feature Detection in Computer Vision
  • Locates key points in the image to be used as features during model training → key points aren't affected by the size/orientation of the image

Major advantages of SIFT are

  • Locality: features are local, so robust to occlusion and clutter (no prior segmentation)
  • Distinctiveness: individual features can be matched to a large database of objects
  • Quantity: many features can be generated for even small objects
  • Efficiency: close to real-time performance
  • Extensibility: can easily be extended to a wide range of different feature types, with each adding robustness

Example (from https://pysource.com/2018/03/21/feature-detection-sift-surf-obr-opencv-3-4-with-python-3-tutorial-25/):

import cv2
import numpy as np

# Load the image in grayscale
img = cv2.imread("the_book_thief.jpg", cv2.IMREAD_GRAYSCALE)

# Create the SIFT detector (xfeatures2d requires the opencv-contrib build)
sift = cv2.xfeatures2d.SIFT_create()

# Detect keypoints and compute their descriptors
keypoints_sift, descriptors = sift.detectAndCompute(img, None)

# Draw the keypoints on the image and display it
img = cv2.drawKeypoints(img, keypoints_sift, None)
cv2.imshow("Image", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

References for SIFT:

SURF:

  • Patented - not free for commercial use
  • Stands for Speeded Up Robust Features
  • 2 Steps: Feature Extraction & Feature Description:

Feature Extraction:

  • Uses a very basic Hessian matrix approximation.
  • Integral Images: a way of calculating the sum of pixel values over a given image or a rectangular subset of it (a small sketch follows after this list)
  • Uses the Hessian matrix because of its good performance in computation time and accuracy.
  • Goal: fixing a reproducible orientation based on information from a circular region around the key point
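
A small illustration of the integral-image trick mentioned above, using cv2.integral(); the image and rectangle coordinates are arbitrary placeholders:

import cv2
import numpy as np

img = (np.random.rand(100, 100) * 255).astype(np.uint8)  # placeholder image

# cv2.integral() returns a (H+1) x (W+1) summed-area table
ii = cv2.integral(img)

# Sum of pixels in the rectangle spanning rows y1..y2-1 and columns x1..x2-1
x1, y1, x2, y2 = 10, 20, 40, 60
rect_sum = ii[y2, x2] - ii[y1, x2] - ii[y2, x1] + ii[y1, x1]

# Check against a direct sum over the same rectangle
assert rect_sum == img[y1:y2, x1:x2].sum()
print(rect_sum)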

Feature Description:

  • Construct a square region aligned to the selected orientation and extract the SURF descriptor from it
  • In order to be invariant to rotation, SURF tries to identify a reproducible orientation for the interest points

Example:
First we import the libraries, load the image, and create the SURF detector:

import cv2
import numpy as np

# Load the image in grayscale
img = cv2.imread("the_book_thief.jpg", cv2.IMREAD_GRAYSCALE)

# Create the SURF detector (xfeatures2d requires the opencv-contrib build)
surf = cv2.xfeatures2d.SURF_create()

# Detect keypoints and compute their descriptors
keypoints_surf, descriptors = surf.detectAndCompute(img, None)

# Draw the keypoints on the image and display it
img = cv2.drawKeypoints(img, keypoints_surf, None)
cv2.imshow("Image", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

References for SURF:

ORB:

  • Combination of the FAST keypoint detector & the BRIEF descriptor, with modifications to improve performance
  • FAST (Features from Accelerated Segment Test): detects features in the provided image
  • Brute-Force Matcher: takes the descriptor of one feature in the first set and matches it against all features in the second set using a distance calculation; the closest one is returned
  • Basic Algorithm (a minimal sketch follows after these steps):
  • Take the query image and convert it to grayscale.
  • Now initialize the ORB detector and detect the keypoints in the query image and the scene.
  • Compute the descriptors belonging to both images.
  • Match the key points using the Brute-Force Matcher.
  • Show the matched images.
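
A minimal sketch of those steps; the file names query.jpg and scene.jpg are placeholders:

import cv2

# Steps 1-2: load the query and scene images in grayscale and initialize ORB
query = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)
scene = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)
orb = cv2.ORB_create()

# Step 3: compute keypoints and descriptors for both images
kp_q, des_q = orb.detectAndCompute(query, None)
kp_s, des_s = orb.detectAndCompute(scene, None)

# Step 4: brute-force matching with Hamming distance (ORB descriptors are binary)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des_q, des_s), key=lambda m: m.distance)

# Step 5: draw the best matches and display them
out = cv2.drawMatches(query, kp_q, scene, kp_s, matches[:20], None)
cv2.imshow("ORB matches", out)
cv2.waitKey(0)
cv2.destroyAllWindows()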

References for ORB:

Supplemental resources:

  • 2015 paper using a 1D template matching algorithm (https://www.scirp.org/pdf/jsip_2015040214202818.pdf). Results: experimental results show the computational time of the proposed approach is faster and its performance better than three basic template matching methods; the approach is also robust to illumination changes in the template and to Gaussian noise added to the source image.
  • 2021 paper using two-stage & dual-check bounded partial correlation (https://link.springer.com/article/10.1007/s10044-021-00997-7). Results show that the proposed TDBPC algorithm addresses the high computational complexity and long matching time of NCC template matching, making real-time template matching feasible in industrial vision positioning.