Mean Shift Tracking
Introduction
In this article we look at the application of mean shift tracking to color-based object tracking.
Mean shift
The object model used in mean shift tracking is a color probability distribution, represented here as a histogram.
Given this object model and an image, we can compute a likelihood image. Each pixel in the likelihood image represents the likelihood that the pixel belongs to the object model (histogram).
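A likelihood image of this kind is commonly obtained by histogram back-projection. The sketch below illustrates the idea on raw 8-bit hue values, using only the standard library. The `HueHistogram` type and its methods are hypothetical names chosen for illustration; they are not the article's `Histogram` class, whose internals are not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the article's Histogram class): a normalized
// 1-D histogram over 8-bit hue values with equal-width bins.
struct HueHistogram {
    std::vector<float> prob;  // prob[b] = fraction of model pixels in bin b
    int bins;

    explicit HueHistogram(int nbins) : prob(nbins, 0.0f), bins(nbins) {
        assert(nbins > 0 && nbins <= 256);
    }

    // Map a hue value in [0, 255] to a bin index in [0, bins - 1].
    int binOf(std::uint8_t hue) const { return hue * bins / 256; }

    // Build the object model from the pixels inside the selected region.
    void build(const std::vector<std::uint8_t>& roiPixels) {
        std::fill(prob.begin(), prob.end(), 0.0f);
        if (roiPixels.empty()) return;
        for (std::uint8_t h : roiPixels) prob[binOf(h)] += 1.0f;
        for (float& p : prob) p /= static_cast<float>(roiPixels.size());
    }

    // Back-projection: each output value is the model probability of the
    // bin that the corresponding input pixel falls into.
    std::vector<float> likelihood(const std::vector<std::uint8_t>& image) const {
        std::vector<float> out;
        out.reserve(image.size());
        for (std::uint8_t h : image) out.push_back(prob[binOf(h)]);
        return out;
    }
};
```

Pixels whose color is frequent in the model region receive a high value, and pixels whose color never occurs in the model receive zero.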



It is reasonable to assume that the region with the highest similarity measure, i.e. the highest density in the likelihood image, is a good estimate of the object location.
//building the object appearance model
void ocvmeanShift::buildModel(Mat image,Rect rect)
{
    //input region of interest
    region=rect;
    //center of region is the current location estimate
    p.x=region.x+region.width/2;
    p.y=region.y+region.height/2;
    //extract ROI
    Mat roi=image(rect);
    //compute the histogram; h is an object of type Histogram
    h.BuildHistogram(roi);
}
//call to compute the likelihood after computing the model
Mat sim=h.likeyhoodImage(image);
Thus, if we consider a small window and repeatedly move it towards the mean value, i.e. along the mean shift vector, we should eventually reach the region of maximum similarity.
The likelihood surface is not smooth, but we can make it smooth using kernel density estimation (KDE).
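On a regular pixel grid, KDE with a Gaussian kernel amounts to convolving the likelihood image with a Gaussian. The following is a minimal sketch of that smoothing step, written with the standard library only. The `Grid` type, the function names, the kernel radius of three standard deviations, and the clamped border handling are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-major float image; a stand-in for the likelihood image.
struct Grid {
    int rows, cols;
    std::vector<float> v;
    Grid(int r, int c) : rows(r), cols(c), v(static_cast<std::size_t>(r) * c, 0.0f) {}
    float& at(int y, int x) { return v[static_cast<std::size_t>(y) * cols + x]; }
    float at(int y, int x) const { return v[static_cast<std::size_t>(y) * cols + x]; }
};

// Normalized 1-D Gaussian taps with radius ceil(3 * sigma).
std::vector<float> gaussianTaps(float sigma) {
    assert(sigma > 0.0f);
    const int r = static_cast<int>(std::ceil(3.0f * sigma));
    std::vector<float> k(2 * r + 1);
    float sum = 0.0f;
    for (int i = -r; i <= r; ++i) {
        k[i + r] = std::exp(-0.5f * i * i / (sigma * sigma));
        sum += k[i + r];
    }
    for (float& t : k) t /= sum;
    return k;
}

// Separable Gaussian smoothing: a horizontal pass followed by a vertical
// pass. Out-of-range indices are clamped to the nearest border pixel.
Grid smooth(const Grid& in, float sigma) {
    const std::vector<float> k = gaussianTaps(sigma);
    const int r = static_cast<int>(k.size()) / 2;
    Grid tmp(in.rows, in.cols), out(in.rows, in.cols);
    for (int y = 0; y < in.rows; ++y)
        for (int x = 0; x < in.cols; ++x) {
            float s = 0.0f;
            for (int i = -r; i <= r; ++i) {
                const int xi = std::min(std::max(x + i, 0), in.cols - 1);
                s += k[i + r] * in.at(y, xi);
            }
            tmp.at(y, x) = s;
        }
    for (int y = 0; y < in.rows; ++y)
        for (int x = 0; x < in.cols; ++x) {
            float s = 0.0f;
            for (int i = -r; i <= r; ++i) {
                const int yi = std::min(std::max(y + i, 0), in.rows - 1);
                s += k[i + r] * tmp.at(yi, x);
            }
            out.at(y, x) = s;
        }
    return out;
}
```

A single bright pixel becomes a smooth bump after this step, so the surface has a well-defined slope towards each mode. The cost grows with both the image size and the kernel width.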
Now we can find the modes of the similarity surface using the standard mean shift algorithm.
Let us assume that the current estimate of the mode of the function is at $y$. We consider a small rectangular window about $y$, compute the mean shift vector, and take a small step along it.
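For reference, the mean shift vector can be written in its usual weighted-mean form. The symbols here are assumptions introduced for illustration and do not appear in the original text: $W(y)$ is the window about $y$, $w(x)$ is the likelihood value at pixel $x$, and $K$ is the kernel weighting.

```latex
m(y) = \frac{\sum_{x \in W(y)} K(x - y)\, w(x)\, x}
            {\sum_{x \in W(y)} K(x - y)\, w(x)} - y,
\qquad
y \leftarrow y + m(y)
```

With a flat kernel ($K$ constant inside the window), the first term is simply the likelihood-weighted centroid of the window, so each step moves $y$ to that centroid.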
In principle this should enable us to find the local maximum. However, the similarity surface is discontinuous, so we would need to perform KDE over the entire image, a dense grid of points, which is an expensive operation: each point must be convolved with a Gaussian of suitably large aperture.
Using Similarity for tracking
The concept of a similarity surface can be made useful in a tracking application. Consider a small region of interest (ROI) about the present location $y$. We can compute the similarity score over this region, perform KDE on this small region only, obtain a similarity surface, and compute the mean shift vector.
If the object is not present in the region, the similarity surface will be flat and the mean shift vector will be zero.
If the object is present in some part of the region, it will correspond to modes of the similarity surface, and the mean shift vector will give us the direction to move along.
Now, instead of trying to estimate the mode, say we translate the region of interest along the direction provided by the mean shift vector. This typically brings a larger portion of the object into view and exposes the part of the global similarity surface where a large maximum lies.
This is the basis of mean shift tracking: we keep translating the region of interest until we reach a local maximum of the similarity surface.
For tracking applications, where fast computation is required, we can use a rectangular window with bandwidth equal to the size of the region of interest. The present location estimate is the center of the rectangular region.
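With a rectangular (flat) kernel, the mean shift step reduces to moving the window so that its center sits on the likelihood-weighted centroid of the pixels it covers. The sketch below shows this on a plain row-major array, without OpenCV. The `Window` type and the function names are hypothetical, and the stopping rule (stop when the window no longer moves) is an assumed simplification.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Window { int x, y, w, h; };  // top-left corner and size, in pixels

// One flat-kernel mean shift step on a row-major likelihood image.
// Writes the shift of the window centre into dx, dy. Returns false when
// the window holds no likelihood mass (flat surface, nothing to follow).
bool meanShiftStep(const std::vector<float>& like, int cols,
                   const Window& win, int& dx, int& dy) {
    double m00 = 0.0, m10 = 0.0, m01 = 0.0;  // raw moments of the window
    for (int j = 0; j < win.h; ++j)
        for (int i = 0; i < win.w; ++i) {
            const double v = like[(win.y + j) * cols + (win.x + i)];
            m00 += v;
            m10 += i * v;
            m01 += j * v;
        }
    if (m00 < 1e-9) return false;
    // Centroid relative to the window, minus the window centre.
    dx = static_cast<int>(std::lround(m10 / m00)) - win.w / 2;
    dy = static_cast<int>(std::lround(m01 / m00)) - win.h / 2;
    return true;
}

// Repeat the step until the window stops moving or maxIter is reached.
// The window is kept fully inside the image.
Window track(const std::vector<float>& like, int rows, int cols,
             Window win, int maxIter) {
    assert(win.w <= cols && win.h <= rows);
    for (int it = 0; it < maxIter; ++it) {
        int dx = 0, dy = 0;
        if (!meanShiftStep(like, cols, win, dx, dy)) break;
        const int nx = std::min(std::max(win.x + dx, 0), cols - win.w);
        const int ny = std::min(std::max(win.y + dy, 0), rows - win.h);
        if (nx == win.x && ny == win.y) break;  // converged
        win.x = nx;
        win.y = ny;
    }
    return win;
}
```

If the window only partially overlaps the object, the first step pulls it towards the visible part, which exposes more of the object for the next step. If the window sees no object pixels at all, it stays where it is.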
Implementation
//compute the likelihood
Mat sim=h.likeyhoodImage(image);
//iterate until convergence or until the maximum iteration count
for(int i=0;i<criteria.maxCount;i++)
{
    //extract the region of interest
    Mat roi=sim(region);
    //compute the moments
    cv::Moments m;
    m=cv::moments(roi,false);
    //m00 is the sum of likelihood values in the ROI;
    //exit if it is too small, since no similar pixels are present
    if(fabs(m.m00)<region.width*region.height*0.05)
        break;
    //compute the mean values (centroid within the ROI)
    int x=cvRound(m.m10/m.m00);
    int y=cvRound(m.m01/m.m00);
    //compute the mean shift
    int dx=region.width/2-x;
    int dy=region.height/2-y;
    //displacement from current position
    int nx=p.x-dx;
    int ny=p.y-dy;
    //keep the window inside the image boundary
    if(nx-region.width/2<=0) nx=region.width/2;
    if(nx+region.width/2>=image.cols) nx=image.cols-region.width/2-1;
    if(ny-region.height/2<=0) ny=region.height/2;
    if(ny+region.height/2>=image.rows) ny=image.rows-region.height/2-1;
    //recalculate the mean shift after clamping
    dx=-nx+p.x;
    dy=-ny+p.y;
    //squared magnitude of the mean shift vector
    float mag=dx*dx+dy*dy;
    //no change in mean, reached a local maximum
    if(mag<criteria.epsilon*criteria.epsilon)
        break;
    //update the position
    p.x=nx;
    p.y=ny;
    //update the region of interest
    region.x=p.x-region.width/2;
    region.y=p.y-region.height/2;
}
One issue is that if the object moves too fast, so that a significant part of it leaves the ROI between successive frames, the object will not be tracked. This can be seen in the following videos.
If the tracker encounters a larger object, or an object that exhibits higher density, tracking will also be lost. In the case below, tracking is lost when the object passes over a large blue background that is similar to the object color.
There are many other cases where mean shift tracking will fail.
As with all tracking approaches, the performance depends heavily on the object model. The better we are able to model the object and obtain a likelihood/similarity that does not show high probability for the background or for other objects in the scene, the more accurate the tracking will be.
Code
For further image processing applications, a library providing a high-level interface to OpenCV will be used. The library is called OpenVisionLibrary (https://github.com/pi19404/OpenVision). The project cmake file is included in the repository. The build will create the library and test files in the bin directory. To run the demo program for mean shift, run the binary meanShiftTest.
The files for the mean shift algorithm are meanshift.cpp and meanshift.hpp, found at https://github.com/pi19404/OpenVision/tree/master/ImgProc/ in the repository.
To run the test program:
meanShift - to run using camera input
meanShiftTest {video file name} - to run using a video file
For a video file, initially only the first frame is shown. Select the ROI in the first frame and then click the build model button to start tracking.
The button is shown after clicking the Display properties button on the window.