Target Tracking in Image Coordinate System - ZhiangChen/target_mapping GitHub Wiki
1. Kalman Filtering
I use Kalman filtering to fuse bounding box coordinates from a neural network detector and a KLT tracker. The state variable is the bounding box, written as x=(x_min, y_min, 1, x_max, y_max, 1) in homogeneous coordinates or as u=(x_min, y_min, x_max, y_max) in Euclidean coordinates. The KLT tracker serves as the motion model:
x(t) = A(t) x(t-1) + e
where A = [ [P, 0], [0, P] ] is a block-diagonal matrix, P is the 3x3 affine transformation estimated from the KLT tracking features, and e is a Gaussian noise vector. Object detection from a deep neural network serves as the observation model that updates the state variable:
z=Cx + w
where C is an identity matrix, and w is a Gaussian noise vector for the observation model.
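As an illustration of the motion model, the sketch below builds A from a 2x3 affine matrix and applies it to one bounding box. The function name `motion_matrix` and the numeric values are hypothetical; the only thing taken from the text is the block-diagonal structure A = [ [P, 0], [0, P] ].

```python
import numpy as np


def motion_matrix(M):
    """Build the 6x6 matrix A = [[P, 0], [0, P]] from a 2x3 affine matrix M."""
    P = np.vstack([M, [0.0, 0.0, 1.0]])  # 3x3 affine transform in homogeneous form
    return np.kron(np.eye(2), P)         # block diagonal: one copy of P per box corner


# Hypothetical affine estimate: scale by 1.1, then shift by (5, -2) pixels.
M = np.array([[1.1, 0.0, 5.0],
              [0.0, 1.1, -2.0]])
A = motion_matrix(M)

# State in homogeneous coordinates: (x_min, y_min, 1, x_max, y_max, 1).
x = np.array([10.0, 20.0, 1.0, 30.0, 40.0, 1.0])
x_pred = A @ x  # both corners are moved by the same transform P
```

Because the last row of P is (0, 0, 1), the two constant components of the state stay equal to 1 after the transform.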
Algorithm:
(1) Motion
x(t) = A(t) x(t-1)
Cov(t) = A(t) Cov(t-1) A(t)^T + R
e ~ N(0,R)
(2) Observation
K = Cov(t) (Cov(t) + Q)^(-1)
u(t) = u(t) + K[z(t)-u(t)]
Cov(t) = (I - K)Cov(t)
w ~ N(0, Q)
In the motion model, I use homogeneous coordinates because A is an affine transformation, which is a linear map only in 2D homogeneous coordinates. Homogeneous coordinates are converted to Euclidean coordinates by a linear operation:
T_4x6 = [ [1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 1, 0] ]
u = Tx
Cov_u = T Cov_x T^T, where Cov_x and Cov_u are the covariance matrices of x and u
The observation model uses Euclidean rather than homogeneous coordinates because the covariance matrix in homogeneous coordinates is degenerate: the two constant components have zero variance. Computing the Kalman gain K requires a matrix inverse, so the update is carried out on the 4x4 Euclidean covariance.
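The two steps of the algorithm can be sketched as follows. This is a minimal sketch, not the repository's implementation. It assumes the filter stores u and its 4x4 covariance between frames and lifts them to homogeneous coordinates with T^T for the motion step; the function names and that lifting choice are my assumptions. R is the 6x6 motion noise covariance and Q the 4x4 observation noise covariance, as in the algorithm above.

```python
import numpy as np

# Projection from homogeneous (6D) to Euclidean (4D) box coordinates.
T = np.array([
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0],
], dtype=float)


def to_homogeneous(u):
    """Lift u = (x_min, y_min, x_max, y_max) to (x_min, y_min, 1, x_max, y_max, 1)."""
    x = T.T @ u
    x[2] = x[5] = 1.0
    return x


def motion_step(u, cov_u, A, R):
    """Predict with the 6x6 motion matrix A; R is the 6x6 motion noise covariance."""
    x_pred = A @ to_homogeneous(u)
    # Assumption: the lifted covariance has zero variance on the constant components.
    cov_x = A @ (T.T @ cov_u @ T) @ A.T + R
    return T @ x_pred, T @ cov_x @ T.T


def observation_step(u, cov_u, z, Q):
    """Correct with a detection z (4D). C is the identity, so it drops out."""
    K = cov_u @ np.linalg.inv(cov_u + Q)   # Kalman gain
    u_new = u + K @ (z - u)
    cov_new = (np.eye(4) - K) @ cov_u
    return u_new, cov_new
```

A typical frame would call `motion_step` with the latest KLT transform and then, if the detector returned a matching box, `observation_step` with that box as z.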
2. Registration
(1). Bounding box registration
A new bounding box is registered if its IoU (intersection over union) with every existing bounding box is smaller than a threshold.
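The registration rule can be sketched as below. The function names and the default threshold of 0.3 are hypothetical; the text does not state the threshold value.

```python
def iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def should_register(new_box, tracked_boxes, threshold=0.3):
    """Register only if the new box overlaps no tracked box by `threshold` or more."""
    return all(iou(new_box, box) < threshold for box in tracked_boxes)
```

With no tracked boxes yet, `should_register` returns True, so the first detection is always registered.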
(2). Bounding box deregistration
A bounding box is deregistered when the H matrix (in the KLT tracker) becomes singular, which implies that the target in the bounding box is moving out of the current frame.
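One way to test for singularity numerically is to look at the condition number of H. The sketch below is an assumption about how such a check could be written, not the check used in the repository; `is_singular` and the `max_cond` limit are hypothetical.

```python
import numpy as np


def is_singular(H, max_cond=1e8):
    """Return True if H is numerically singular or contains non-finite entries."""
    H = np.asarray(H, dtype=float)
    if not np.isfinite(H).all():
        return True
    # A very large condition number means H is close to losing rank.
    return bool(np.linalg.cond(H) > max_cond)
```

A tracker loop could call `is_singular` on each new H and drop the bounding box when it returns True.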
False-positive detections are mitigated by measuring the differential entropy of the bounding box distribution.
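For a Gaussian state with k x k covariance Cov, the differential entropy is 0.5 * ln((2*pi*e)^k * det(Cov)). The sketch below computes it for the 4x4 bounding box covariance. How the entropy is thresholded is not stated in the text, so the function name and any cutoff applied to its result are assumptions.

```python
import numpy as np


def gaussian_entropy(cov):
    """Differential entropy (in nats) of a Gaussian with covariance matrix cov."""
    cov = np.asarray(cov, dtype=float)
    k = cov.shape[0]
    sign, logdet = np.linalg.slogdet(cov)  # stable log-determinant
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    return 0.5 * (k * np.log(2.0 * np.pi * np.e) + logdet)
```

A box that is repeatedly confirmed by detections has a shrinking covariance and therefore a falling entropy, while a box that is never confirmed keeps a high entropy.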